10 research outputs found
Peptide conformational sampling using the Quantum Approximate Optimization Algorithm
Protein folding -- the problem of predicting the spatial structure of a
protein given its sequence of amino acids -- has attracted considerable
research effort in biochemistry in recent decades. In this work, we explore the
potential of quantum computing to solve a simplified version of protein
folding. More precisely, we numerically investigate the performance of a
variational quantum algorithm, the Quantum Approximate Optimization Algorithm
(QAOA), in sampling low-energy conformations of short peptides. We start by
benchmarking the algorithm on an even simpler problem: sampling self-avoiding
walks, which is a necessary condition for a valid protein conformation.
Motivated by promising results achieved by QAOA on this problem, we then apply
the algorithm to a more complete version of protein folding, including a
simplified physical potential. In this case, based on numerical simulations on
20 qubits, we find less promising results: deep quantum circuits are required
to achieve accurate results, and the performance of QAOA can be matched by
random sampling up to a small overhead. Overall, these results cast serious
doubt on the ability of QAOA to address the protein folding problem in the near
term, even in an extremely simplified setting. We believe that the approach and
conclusions presented in this work could offer valuable methodological insights
on how to systematically evaluate variational quantum optimization algorithms
on real-world problems beyond protein folding.
Comment: 30 pages, 18 figures
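The random-sampling baseline mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' code: the 2D square lattice, the function names, and the use of plain rejection sampling are all assumptions chosen for illustration.

```python
import random

# Unit steps on a 2D square lattice (the lattice choice is an assumption).
STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def random_walk(n_steps, rng):
    """Return the lattice sites visited by a uniformly random walk."""
    x, y = 0, 0
    path = [(x, y)]
    for _ in range(n_steps):
        dx, dy = rng.choice(STEPS)
        x, y = x + dx, y + dy
        path.append((x, y))
    return path


def is_self_avoiding(path):
    """A walk is self-avoiding if no lattice site is visited twice."""
    return len(set(path)) == len(path)


def sample_self_avoiding_walks(n_steps, n_samples, seed=0):
    """Rejection sampling: draw random walks and keep the self-avoiding ones."""
    rng = random.Random(seed)
    walks = [random_walk(n_steps, rng) for _ in range(n_samples)]
    return [w for w in walks if is_self_avoiding(w)]


if __name__ == "__main__":
    kept = sample_self_avoiding_walks(n_steps=6, n_samples=1000)
    print(f"{len(kept)} of 1000 random walks were self-avoiding")
```

The fraction of accepted walks shrinks as the walk gets longer, which is one reason a naive random baseline becomes expensive for longer chains.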
Conformator: A Novel Method for the Generation of Conformer Ensembles
Computer-aided drug design methods such as docking, pharmacophore searching, 3D database searching, and the creation of 3D-QSAR models need conformational ensembles to handle the flexibility of small molecules. Here, we present Conformator, an accurate and effective knowledge-based algorithm for generating conformer ensembles. With 99.9% of all test molecules processed, Conformator stands out by its robustness with respect to input formats, molecular geometries, and the handling of macrocycles. With an extended set of rules for sampling torsion angles, a novel algorithm for macrocycle conformer generation, and a new clustering algorithm for the assembly of conformer ensembles, Conformator reaches a median minimum root-mean-square deviation (measured between protein-bound ligand conformations and ensembles of a maximum of 250 conformers) of 0.47 Å, with no significant difference to the highest-ranked commercial algorithm OMEGA and significantly higher accuracy than seven free algorithms, including the RDKit DG algorithm. Conformator is freely available for noncommercial use and academic research.
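The "minimum root-mean-square deviation" metric in the abstract above can be sketched as follows. This is a simplified illustration, not Conformator's evaluation code: it assumes that all conformers share the reference's atom order and are already superposed onto it, whereas a real benchmark would first compute an optimal superposition and account for molecular symmetry. The function names are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples.

    Assumption: identical atom order and structures already superposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


def min_rmsd(reference, ensemble):
    """Smallest RMSD between the reference and any conformer in the ensemble."""
    return min(rmsd(reference, conformer) for conformer in ensemble)


if __name__ == "__main__":
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    ensemble = [
        [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],  # shifted by 1.0 along z
        [(0.0, 0.0, 0.5), (1.0, 0.0, 0.5)],  # shifted by 0.5 along z
    ]
    print(min_rmsd(reference, ensemble))
```

The median of this per-molecule minimum over a benchmark set is the kind of summary statistic the abstract reports.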
ProteinsPlus: a web portal for structure analysis of macromolecules
With currently more than 126 000 publicly available structures and an
increasing growth rate, the Protein Data Bank constitutes a rich data source
for structure-driven research in fields like drug discovery, crop science and
biotechnology in general. Typical workflows in these areas involve manifold
computational tools for the analysis and prediction of molecular functions.
Here, we present the ProteinsPlus web server that offers a unified easy-to-use
interface to a broad range of tools for the early phase of structure-based
molecular modeling. This includes solutions for commonly required
preprocessing tasks like structure quality assessment (EDIA), hydrogen placement
(Protoss) and the search for alternative conformations (SIENA). Beyond that,
it also addresses frequent problems such as the generation of 2D-interaction
diagrams (PoseView), protein-protein interface classification (HyPPI) as well
as automatic pocket detection and druggability assessment (DoGSiteScorer). The
unified ProteinsPlus interface covering all featured approaches provides
various facilities for intuitive input and result visualization, case-specific
parameterization and download options for further processing. Moreover, its
generalized workflow allows the user a quick familiarization with the
different tools. ProteinsPlus also stores the calculated results temporarily
for future requests and thus facilitates convenient result communication and
re-access. The server is freely available at http://proteins.plus
Torsion Library Reloaded: A New Version of Expert-Derived SMARTS Rules for Assessing Conformations of Small Molecules
The Torsion Library contains hundreds of rules for small molecule
conformations which have been derived from the Cambridge Structural
Database (CSD) and are curated by molecular design experts. The torsion
rules are encoded as SMARTS patterns and categorize rotatable bonds
via a traffic light coloring scheme. We have systematically revised
all torsion rules to better identify highly strained conformations
and minimize the number of false alerts for CSD small molecule X-ray
structures. For this new release, we added or substantially modified
78 torsion patterns and reviewed all angles and tolerance intervals.
The overall number of red alerts for a filtered CSD data set with
130 000 structures was reduced by a factor of 4 compared to
the predecessor. This is a clear advantage in 3D virtual screening,
where hits should only be removed by a conformational filter if they
are in energetically inaccessible conformations.
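The traffic light scheme described in the abstract above can be sketched as a classification of a torsion angle against preferred angles and tolerance intervals. The two-tolerance layout, the example peaks, and all numeric values below are illustrative assumptions, not rules taken from the Torsion Library, and the SMARTS matching step is omitted.

```python
def angular_difference(a, b):
    """Smallest absolute difference between two angles in degrees (periodic)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def classify_torsion(angle, peaks, inner_tol, outer_tol):
    """Assign a traffic light color to a torsion angle.

    peaks: preferred angles in degrees (hypothetical values).
    inner_tol / outer_tol: assumed tolerance intervals around each peak.
    """
    deviation = min(angular_difference(angle, peak) for peak in peaks)
    if deviation <= inner_tol:
        return "green"
    if deviation <= outer_tol:
        return "yellow"
    return "red"


if __name__ == "__main__":
    # Hypothetical rule: peaks at 0 and 180 degrees, tolerances of 10 and 30.
    for angle in (175.0, -170.0, 150.0, 90.0):
        print(angle, classify_torsion(angle, [0.0, 180.0], 10.0, 30.0))
```

The periodic difference matters here: an angle of -170 degrees lies only 10 degrees from a peak at 180 degrees.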
Estimating Electron Density Support for Individual Atoms and Molecular Fragments in X-ray Structures
Macromolecular structures resolved by X-ray crystallography are
essential for life science research. While some methods exist to automatically
quantify the quality of the electron density fit, none of them is
without flaws. Especially the question of how well individual parts
like atoms, small fragments, or molecules are supported by electron
density is difficult to quantify. While taking experimental uncertainties
correctly into account, they do not offer an answer on how reliable
an individual atom position is. A rapid quantification of this atomic
position reliability would be highly valuable in structure-based molecular
design. To overcome this limitation, we introduce the electron density
score EDIA for individual atoms and molecular fragments. EDIA assesses
rapidly, automatically, and intuitively the fit of individual as well
as multiple atoms (EDIAm) into electron density accompanied
by an integrated error analysis. The computation is based on the standard
2fo - fc electron density
map in combination with the model of the molecular structure. For
evaluating partial structures, EDIAm shows significant
advantages compared to the real-space R correlation coefficient (RSCC)
and the real-space difference density Z score (RSZD) from the molecular
modeler's point of view. Thus, EDIA abolishes the time-consuming
step of visually inspecting the electron density during structure
selection and curation. It supports daily modeling tasks of medicinal
and computational chemists and enables a fully automated assembly
of large-scale, high-quality structure data sets. Furthermore, EDIA
scores can be applied for model validation and method development
in computer-aided molecular design. In contrast to measuring the deviation
from the structure model by root-mean-squared deviation, EDIA scores
allow comparison to the underlying experimental data taking its uncertainty
into account.
High-Quality Dataset of Protein-Bound Ligand Conformations and Its Application to Benchmarking Conformer Ensemble Generators
We developed a cheminformatics pipeline for the fully automated
selection and extraction of high-quality protein-bound ligand conformations
from X-ray structural data. The pipeline evaluates the validity and
accuracy of the 3D structures of small molecules according to multiple
criteria, including their fit to the electron density and their physicochemical
and structural properties. Using this approach, we compiled two high-quality
datasets from the Protein Data Bank (PDB): a comprehensive dataset
and a diversified subset of 4626 and 2912 structures, respectively.
The datasets were applied to benchmarking seven freely available conformer
ensemble generators: Balloon (two different algorithms), the RDKit
standard conformer ensemble generator, the Experimental-Torsion basic
Knowledge Distance Geometry (ETKDG) algorithm, Confab, Frog2 and Multiconf-DOCK.
Substantial differences in the performance of the individual algorithms
were observed, with RDKit and ETKDG generally achieving a favorable
balance of accuracy, ensemble size and runtime. The Platinum datasets
are available for download from http://www.zbh.uni-hamburg.de/platinum_dataset
Large-Scale Analysis of Hydrogen Bond Interaction Patterns in Protein-Ligand Interfaces
Protein-ligand interactions are the fundamental basis for
molecular design in pharmaceutical research, biocatalysis, and agrochemical
development. Hydrogen bonds in particular are known to have special geometric
requirements and therefore deserve a detailed analysis. In modeling
approaches a more general description of hydrogen bond geometries,
using distance and directionality, is applied. A first study of their
geometries was performed based on 15 protein structures in 1982. Currently
there are about 95 000 protein-ligand structures available
in the PDB, providing a solid foundation for a new large-scale statistical
analysis. Here, we report a comprehensive investigation of geometric
and functional properties of hydrogen bonds. Out of 22 defined functional
groups, eight are fully in accordance with theoretical predictions
while 14 show variations from expected values. On the basis of these
results, we derived interaction geometries to improve current computational
models. It is expected that these observations will be useful in designing
new chemical structures for biological applications.
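The "distance and directionality" description of hydrogen bond geometry mentioned in the abstract above can be sketched as a simple geometric check. The cutoff values (3.5 Å donor-acceptor distance, 120 degree donor-hydrogen-acceptor angle) and the function names are illustrative assumptions, not values reported in the study.

```python
import math


def _sub(p, q):
    """Component-wise difference p - q of two 3D points."""
    return tuple(a - b for a, b in zip(p, q))


def _norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(a * a for a in v))


def distance(p, q):
    """Euclidean distance between two 3D points."""
    return _norm(_sub(p, q))


def angle_deg(a, b, c):
    """Angle at point b formed by a-b-c, in degrees."""
    v1, v2 = _sub(a, b), _sub(c, b)
    cos = sum(x * y for x, y in zip(v1, v2)) / (_norm(v1) * _norm(v2))
    cos = max(-1.0, min(1.0, cos))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cos))


def looks_like_hbond(donor, hydrogen, acceptor, max_dist=3.5, min_angle=120.0):
    """Distance plus directionality test; both thresholds are assumptions."""
    return (
        distance(donor, acceptor) <= max_dist
        and angle_deg(donor, hydrogen, acceptor) >= min_angle
    )


if __name__ == "__main__":
    donor, hydrogen = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
    print(looks_like_hbond(donor, hydrogen, (2.9, 0.0, 0.0)))  # linear geometry
    print(looks_like_hbond(donor, hydrogen, (1.0, 1.5, 0.0)))  # bent geometry
```

A statistical analysis of this kind would collect such distances and angles per functional group and compare their distributions with the expected values.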