December 21, 2021

3D-Scaffold, a Deep Learning Approach to Identify Novel Molecules for Therapeutics

Data science tools for generating a target-focused chemical space can streamline early stages of drug development

Scientists generated valid, unique, and experimentally synthesizable candidate therapeutic molecules built around a desired core 3D structure, or scaffold, that is critical for targeting two key proteins in SARS-CoV-2.

(Illustration by Nathan Johnson | Pacific Northwest National Laboratory)

The Science

With new pathogens on the horizon, creating new pharmaceuticals to combat them is a pressing need. Discovering a new therapeutic drug is a long and expensive process that can take many years to reach clinical approval. To assist in this endeavor, a multi-institutional team has developed a deep learning framework to identify novel molecules as drug candidates. Called 3D-Scaffold, the framework generates 3D coordinates of new molecules around a core structure, or scaffold. These coordinates can be tested directly against a protein target using computational screening. Using only a small amount of training data, the framework produced 3D coordinates of molecules calculated to have high binding affinity for two key proteins in the novel coronavirus, SARS-CoV-2.

The Impact

3D-Scaffold explores a vast chemical space, using computer simulation to screen large libraries of molecular structures for binding to a protein target. The framework learns and reasons from existing FDA-approved data sets, so researchers can use it to generate novel, synthesizable structures that are likely to be effective against their targets. 3D-Scaffold is the first generative artificial intelligence model for medicinal chemistry applications that can generate 3D coordinates of target-specific therapeutics and a library of activity-based probes with desired scaffolds.


3D-Scaffold is a deep generative model that produces 3D coordinates of novel molecules with desirable biophysical and biochemical properties. This framework preserves critical structural scaffolds during the generation process. First, 3D-Scaffold analyzes the chemical environment around the scaffold. Then it predicts distributions for the next type of atom to add. Finally, it sequentially attaches new atoms around the central scaffold. Molecules generated with 3D-Scaffold were predominantly valid, unique, novel, and synthesizable. They had drug-like properties similar to the molecules in the training set.
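The three steps described above (analyze the environment, predict a distribution over the next atom type, attach the atom) can be illustrated with a toy loop. The sketch below is not the 3D-Scaffold model: the atom vocabulary, the hand-written scoring function that stands in for the trained neural network, the fixed 1.5 Å bond length, and all function names are assumptions chosen only to show the control flow of scaffold-preserving, atom-by-atom generation.

```python
import math
import random

# Hypothetical atom vocabulary; "STOP" ends the generation loop.
ATOM_TYPES = ["C", "N", "O", "STOP"]


def environment_features(atoms):
    """Step 1 (toy): summarize the current structure as per-type atom counts."""
    return [sum(1 for kind, _ in atoms if kind == t) for t in ATOM_TYPES[:-1]]


def next_type_distribution(features, max_atoms):
    """Step 2 (toy): softmax over hand-picked scores.

    A trained network would produce these scores; here they are made up.
    The STOP score grows as the molecule approaches max_atoms.
    """
    total = sum(features)
    scores = [
        1.0,                               # carbon
        0.3 - 0.2 * features[1],           # nitrogen, less likely if present
        0.3 - 0.2 * features[2],           # oxygen, less likely if present
        -2.0 + 4.0 * total / max_atoms,    # STOP
    ]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def place_near(anchor_xyz, rng, bond_length=1.5):
    """Put a new atom at a fixed distance from an anchor, in a random direction."""
    while True:
        direction = [rng.gauss(0.0, 1.0) for _ in range(3)]
        length = math.sqrt(sum(c * c for c in direction))
        if length > 1e-9:
            break
    return tuple(a + bond_length * c / length for a, c in zip(anchor_xyz, direction))


def grow(scaffold, max_atoms=12, seed=0):
    """Step 3 (toy): attach atoms one at a time; scaffold atoms are never changed."""
    rng = random.Random(seed)
    atoms = list(scaffold)
    while len(atoms) < max_atoms:
        probs = next_type_distribution(environment_features(atoms), max_atoms)
        kind = rng.choices(ATOM_TYPES, weights=probs, k=1)[0]
        if kind == "STOP":
            break
        anchor_xyz = rng.choice(atoms)[1]
        atoms.append((kind, place_near(anchor_xyz, rng)))
    return atoms


if __name__ == "__main__":
    # Hypothetical two-carbon scaffold, coordinates in angstroms.
    scaffold = [("C", (0.0, 0.0, 0.0)), ("C", (1.5, 0.0, 0.0))]
    for kind, xyz in grow(scaffold, max_atoms=8, seed=42):
        print(kind, tuple(round(c, 2) for c in xyz))
```

The scaffold atoms are copied into the output first and only new atoms are appended, which is the sense in which the core structure is preserved. A real generator would also need to choose positions that respect valence and geometry; the random placement here does not.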

Using domain-specific data sets as training sets, scientists generated covalent and non-covalent antiviral inhibitors targeting viral proteins. Then they performed virtual screening via docking simulations. The generated structures interacted favorably with SARS-CoV-2 protein targets. Most importantly, the model performs well with relatively small volumes of training data and generates drug candidates that mimic non-structured peptides with similar motifs, which can covalently bind protein targets. Further training and optimization of this framework can accelerate the identification and optimization of leads in drug discovery and development across a range of therapeutic targets. This research used high-performance computing resources at the Environmental Molecular Sciences Laboratory (EMSL), a DOE Office of Science User Facility located at Pacific Northwest National Laboratory.
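As an illustration of how a generated library might be summarized and triaged around a docking step, the sketch below computes uniqueness and novelty fractions and ranks candidates by score. It is not the study's pipeline. The function names are hypothetical, molecules are assumed to be represented by already-canonicalized SMILES strings so that plain string comparison is meaningful, the scores are placeholder numbers rather than results, and the ranking assumes the common docking convention that a lower score is more favorable.

```python
def uniqueness(generated):
    """Fraction of generated molecules that are distinct."""
    if not generated:
        return 0.0
    return len(set(generated)) / len(generated)


def novelty(generated, training):
    """Fraction of distinct generated molecules absent from the training set."""
    distinct = set(generated)
    if not distinct:
        return 0.0
    return len(distinct - set(training)) / len(distinct)


def top_candidates(scores, k):
    """Return the k molecules with the lowest (assumed most favorable) score."""
    return sorted(scores, key=scores.get)[:k]


if __name__ == "__main__":
    # Placeholder inputs for illustration only.
    generated = ["CCO", "CCN", "CCO", "c1ccccc1"]
    training = ["CCO"]
    placeholder_scores = {"CCN": -5.0, "c1ccccc1": -6.5}

    print("uniqueness:", uniqueness(generated))
    print("novelty:", novelty(generated, training))
    print("best:", top_candidates(placeholder_scores, 1))
```

Checking validity and synthesizability requires a chemistry toolkit and is left out of this sketch.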

PNNL Contact

Neeraj Kumar
Pacific Northwest National Laboratory


The DOE Office of Science supported this research through the National Virtual Biotechnology Laboratory, a consortium of DOE national laboratories focused on the response to COVID-19, with funding provided by the Coronavirus CARES Act. The research was performed using capabilities at EMSL, a DOE Office of Science User Facility sponsored by the Biological and Environmental Research program.

Published: December 21, 2021

R. P. Joshi, et al. “3D-Scaffold: A Deep Learning Framework to Generate 3D Coordinates of Drug-like Molecules with Desired Scaffolds.” The Journal of Physical Chemistry B (2021). [DOI: 10.1021/acs.jpcb.1c06437]