31 research outputs found
Large-Scale Docking in the Cloud
Molecular docking is a pragmatic approach to exploit protein structures
for new ligand discovery, but the growing size of available chemical
space is increasingly challenging to screen on in-house computer clusters.
We have therefore developed AWS-DOCK, a protocol for running UCSF
DOCK in the AWS cloud. Our approach leverages the low cost and scalability
of cloud resources combined with a low-molecule-cost docking engine
to screen billions of molecules efficiently. We benchmarked our system
by screening 50 million molecules with a heavy atom count (HAC) of 22 against the DRD4 receptor
with an average CPU time of around 1 s per molecule. We saw up to
3-fold variations in cost between AWS availability zones. Docking
4.5 billion lead-like molecules, a 7-week calculation on our 1000-core
lab cluster, runs in about a week in AWS, depending on accessible CPUs,
for around $25,000, less than the cost of two new nodes. The cloud
docking protocol is described in easy-to-follow steps and may be sufficiently
general to be used for other docking programs. All the tools to enable
AWS-DOCK are available free to everyone, while DOCK 3.8 is free for
academic research.
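The figures quoted in this abstract can be cross-checked with a little arithmetic. The sketch below is only a back-of-envelope estimate: it assumes that the benchmark average of about 1 CPU-second per molecule also holds for the 4.5 billion lead-like molecules, which the abstract does not state, and all variable names are illustrative.

```python
# Back-of-envelope check of the numbers quoted in the abstract.
# ASSUMPTION: the ~1 s/molecule benchmark average applies to the full library.
MOLECULES = 4.5e9           # lead-like molecules docked
CPU_SECONDS_PER_MOL = 1.0   # "around 1 s per molecule"
LAB_CORES = 1000            # size of the lab cluster
AWS_COST_USD = 25_000       # approximate quoted AWS cost
HOURS_PER_WEEK = 24 * 7

# Total compute, expressed in CPU-hours.
cpu_hours = MOLECULES * CPU_SECONDS_PER_MOL / 3600

# Wall-clock time on the lab cluster if every core is busy all the time.
lab_weeks = cpu_hours / LAB_CORES / HOURS_PER_WEEK

# Implied average price of one CPU-hour in AWS.
usd_per_cpu_hour = AWS_COST_USD / cpu_hours

# Cores that must run concurrently to finish in one week.
cores_for_one_week = cpu_hours / HOURS_PER_WEEK

print(f"{cpu_hours:,.0f} CPU-hours")
print(f"{lab_weeks:.1f} weeks on {LAB_CORES} cores")
print(f"${usd_per_cpu_hour:.3f} per CPU-hour")
print(f"{cores_for_one_week:,.0f} cores for a one-week run")
```

Under these assumptions the estimate is consistent with the quoted 7-week cluster run, and it implies that a one-week cloud run needs several thousand concurrent cores.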
DockOpt: A Tool for Automatic Optimization of Docking Models
Molecular docking is a widely used technique for leveraging protein
structure for ligand discovery, but it remains difficult to utilize
due to limitations that have not been adequately addressed. Despite
some progress toward automation, docking still requires expert guidance,
hindering its adoption by a broader range of investigators. To make
docking more accessible, we developed a new utility called DockOpt,
which automates the creation, evaluation, and optimization of docking
models prior to their deployment in large-scale prospective screens.
DockOpt outperforms our previous automated pipeline across all 43
targets in the DUDE-Z benchmark data set, and the generated models
for 84% of targets demonstrate sufficient enrichment to warrant their
use in prospective screens, with normalized LogAUC values of at least
15%. DockOpt is available as part of the Python package Pydock3 included
in the UCSF DOCK 3.8 distribution, which is available for free to
academic researchers at https://dock.compbio.ucsf.edu and free for everyone upon registration
at https://tldr.docking.org
Predicted Biological Activity of Purchasable Chemical Space
Whereas 400 million distinct compounds are now purchasable within
the span of a few weeks, the biological activities of most are unknown.
To facilitate access to new chemistry for biology, we have combined
the Similarity Ensemble Approach (SEA) with the maximum Tanimoto similarity
to the nearest bioactive to predict activity for every commercially
available molecule in ZINC. This method, which we label SEA+TC, outperforms
both SEA and a naïve-Bayesian classifier in predictive performance
under 5-fold cross-validation on ChEMBL’s bioactivity data set
(version 21). Using this method, predictions for over 40% of compounds
(>160 million) have either high significance (pSEA ≥ 40), high
similarity (ECFP4MaxTc ≥ 0.4), or both, for one or more of
1382 targets well described by ligands in the literature. Using a
further 1347 less-well-described targets, we predict activities for
an additional 11 million compounds. To gauge whether these predictions
are sensible, we investigate 75 predictions for 50 drugs lacking a
binding affinity annotation in ChEMBL. The 535 million predictions
for over 171 million compounds at 2629 targets are linked to purchasing
information and evidence to support each prediction and are freely
available via https://zinc15.docking.org and https://files.docking.org
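The similarity half of SEA+TC, the maximum Tanimoto coefficient (MaxTc) to the nearest known bioactive, can be illustrated with a minimal sketch. Fingerprints are represented here as plain sets of on-bit indices; generating real ECFP4 fingerprints requires a cheminformatics toolkit and is not shown. The function names and toy fingerprints are hypothetical, and only the two thresholds (pSEA ≥ 40, MaxTc ≥ 0.4) come from the abstract.

```python
from typing import Iterable


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def max_tc(query: frozenset, bioactives: Iterable[frozenset]) -> float:
    """Highest Tanimoto similarity between the query and any known bioactive."""
    return max((tanimoto(query, ref) for ref in bioactives), default=0.0)


def passes_cutoffs(p_sea: float, max_tc_value: float,
                   p_cut: float = 40.0, tc_cut: float = 0.4) -> bool:
    """True if either cutoff quoted in the abstract is met (or both)."""
    return p_sea >= p_cut or max_tc_value >= tc_cut


# Toy fingerprints (made up for illustration).
query = frozenset({1, 2, 3, 4})
known_ligands = [frozenset({1, 2, 5}), frozenset({1, 2, 3, 4, 6})]

nearest = max_tc(query, known_ligands)
print(nearest, passes_cutoffs(p_sea=10.0, max_tc_value=nearest))
```

The pSEA value is passed in as a given number because computing it requires the SEA statistical model, which this sketch does not attempt to reproduce.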
Enrichment changes with varying degrees of orientational sampling.
<p>A) The histogram of changes between match goals of 50 and 20000 over all 102 DUD-E systems is shown. B) At right is the histogram of which of the five match goal levels produced the best enrichment for each of the 102 DUD-E targets. For each enrichment produced by another match goal, the histogram of the differences is shown to the left.</p>
Orientational Matching Diagram.
<p>A toy example illustrating the matching sphere orientational matching algorithm. A) Toy receptor with 4 matching spheres shown as circles and a toy ligand with 3 spheres shown as stars. B) The distance matrices constructed from these spheres are shown in the upper right. C) The 2 possible orientational matches of the ligand spheres (as stars) onto the receptor spheres with a distance tolerance of 0.1 (assuming 3 matching nodes are used; in 3D this is usually 4). D) The additional two orientations produced when the distance tolerance is raised to 0.2.</p>
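The idea in this caption, accepting an assignment of ligand spheres to receptor spheres only when every pairwise distance agrees within a tolerance, can be sketched with a brute-force enumeration. This is an illustration of the distance-compatibility test only, not the search strategy used by DOCK; the 2D coordinates are invented and do not reproduce the figure's geometry or its match counts.

```python
from itertools import combinations, permutations
from math import dist


def find_matches(receptor, ligand, tolerance):
    """Return every assignment of ligand spheres to distinct receptor spheres
    whose pairwise distances all agree within `tolerance`.

    assignment[i] is the receptor sphere index matched to ligand sphere i.
    """
    n = len(ligand)
    pairs = list(combinations(range(n), 2))
    matches = []
    for assignment in permutations(range(len(receptor)), n):
        compatible = all(
            abs(dist(ligand[i], ligand[j])
                - dist(receptor[assignment[i]], receptor[assignment[j]])) <= tolerance
            for i, j in pairs
        )
        if compatible:
            matches.append(assignment)
    return matches


# Hypothetical toy geometry: 4 receptor spheres, 3 ligand spheres.
receptor_spheres = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0), (3.0, 4.15)]
ligand_spheres = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]

tight = find_matches(receptor_spheres, ligand_spheres, tolerance=0.1)
loose = find_matches(receptor_spheres, ligand_spheres, tolerance=0.2)
print(len(tight), len(loose))
```

As in the caption, loosening the tolerance admits additional orientations: with these made-up coordinates the tight tolerance accepts one assignment and the looser one accepts four. Because only distances are compared, mirror-image placements are also accepted by this sketch.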