    Peptide conformational sampling using the Quantum Approximate Optimization Algorithm

    Protein folding, the problem of predicting the spatial structure of a protein given its sequence of amino acids, has attracted considerable research effort in biochemistry in recent decades. In this work, we explore the potential of quantum computing to solve a simplified version of protein folding. More precisely, we numerically investigate the performance of a variational quantum algorithm, the Quantum Approximate Optimization Algorithm (QAOA), in sampling low-energy conformations of short peptides. We start by benchmarking the algorithm on an even simpler problem: sampling self-avoiding walks, since self-avoidance is a necessary condition for a valid protein conformation. Motivated by promising results achieved by QAOA on this problem, we then apply the algorithm to a more complete version of protein folding, including a simplified physical potential. In this case, based on numerical simulations on 20 qubits, we find less promising results: deep quantum circuits are required to achieve accurate results, and the performance of QAOA can be matched by random sampling up to a small overhead. Overall, these results cast serious doubt on the ability of QAOA to address the protein folding problem in the near term, even in an extremely simplified setting. We believe that the approach and conclusions presented in this work could offer valuable methodological insights on how to systematically evaluate variational quantum optimization algorithms on real-world problems beyond protein folding. Comment: 30 pages, 18 figures
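The self-avoiding-walk condition and the random-sampling baseline mentioned in the abstract above can be illustrated with a minimal sketch. The 2D square lattice, the step encoding, and the function names `is_self_avoiding` and `self_avoiding_fraction` are assumptions made for illustration; the abstract does not specify the lattice or encoding used in the paper.

```python
import random

# Unit steps on a 2D square lattice (the lattice choice is an assumption).
STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def is_self_avoiding(walk):
    """Return True if the walk, given as a list of (dx, dy) steps, never revisits a site."""
    x, y = 0, 0
    visited = {(x, y)}
    for dx, dy in walk:
        x, y = x + dx, y + dy
        if (x, y) in visited:
            return False
        visited.add((x, y))
    return True


def self_avoiding_fraction(n_steps, n_samples, seed=0):
    """Estimate the fraction of uniformly random walks that are self-avoiding."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        walk = [rng.choice(STEPS) for _ in range(n_steps)]
        hits += is_self_avoiding(walk)
    return hits / n_samples
```

Calling `self_avoiding_fraction(10, 10000)` gives a rough sense of how quickly valid walks become rare under uniform random sampling, which is the kind of baseline a sampling algorithm would need to beat.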

    Conformator: A Novel Method for the Generation of Conformer Ensembles

    Computer-aided drug design methods such as docking, pharmacophore searching, 3D database searching, and the creation of 3D-QSAR models need conformational ensembles to handle the flexibility of small molecules. Here, we present Conformator, an accurate and effective knowledge-based algorithm for generating conformer ensembles. With 99.9% of all test molecules processed, Conformator stands out for its robustness with respect to input formats, molecular geometries, and the handling of macrocycles. With an extended set of rules for sampling torsion angles, a novel algorithm for macrocycle conformer generation, and a new clustering algorithm for the assembly of conformer ensembles, Conformator reaches a median minimum root-mean-square deviation (measured between protein-bound ligand conformations and ensembles of a maximum of 250 conformers) of 0.47 Å, with no significant difference from the highest-ranked commercial algorithm OMEGA and significantly higher accuracy than seven free algorithms, including the RDKit DG algorithm. Conformator is freely available for noncommercial use and academic research.
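The "median minimum RMSD" metric quoted above can be sketched as follows. This is a simplified illustration, not the evaluation protocol of the paper: it assumes conformers are already superposed on the reference and that atoms are listed in the same order, and it ignores molecular symmetry. The function names are hypothetical.

```python
import math
from statistics import median


def rmsd(a, b):
    """Root-mean-square deviation between two equally ordered lists of (x, y, z) atoms.

    Assumes the two conformations are already superposed (no alignment is done here).
    """
    if len(a) != len(b) or not a:
        raise ValueError("conformations must be non-empty and of equal length")
    sq = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(sq / len(a))


def min_rmsd(reference, ensemble):
    """Smallest RMSD between the reference conformation and any ensemble member."""
    return min(rmsd(reference, conformer) for conformer in ensemble)


def median_min_rmsd(cases):
    """Median over molecules of the minimum RMSD; cases is a list of (reference, ensemble)."""
    return median(min_rmsd(ref, ens) for ref, ens in cases)
```

A lower median means that, for a typical molecule, at least one generated conformer lies close to the protein-bound reference.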

    ProteinsPlus: a web portal for structure analysis of macromolecules

    With currently more than 126 000 publicly available structures and an increasing growth rate, the Protein Data Bank constitutes a rich data source for structure-driven research in fields like drug discovery, crop science and biotechnology in general. Typical workflows in these areas involve manifold computational tools for the analysis and prediction of molecular functions. Here, we present the ProteinsPlus web server that offers a unified easy-to-use interface to a broad range of tools for the early phase of structure-based molecular modeling. This includes solutions for commonly required preprocessing tasks like structure quality assessment (EDIA), hydrogen placement (Protoss) and the search for alternative conformations (SIENA). Beyond that, it also addresses frequent problems such as the generation of 2D-interaction diagrams (PoseView), protein–protein interface classification (HyPPI) as well as automatic pocket detection and druggability assessment (DoGSiteScorer). The unified ProteinsPlus interface covering all featured approaches provides various facilities for intuitive input and result visualization, case-specific parameterization and download options for further processing. Moreover, its generalized workflow allows the user to become familiar with the different tools quickly. ProteinsPlus also stores the calculated results temporarily for future requests and thus facilitates convenient result communication and re-access. The server is freely available at http://proteins.plus

    Torsion Library Reloaded: A New Version of Expert-Derived SMARTS Rules for Assessing Conformations of Small Molecules

    The Torsion Library contains hundreds of rules for small molecule conformations which have been derived from the Cambridge Structural Database (CSD) and are curated by molecular design experts. The torsion rules are encoded as SMARTS patterns and categorize rotatable bonds via a traffic light coloring scheme. We have systematically revised all torsion rules to better identify highly strained conformations and minimize the number of false alerts for CSD small molecule X-ray structures. For this new release, we added or substantially modified 78 torsion patterns and reviewed all angles and tolerance intervals. The overall number of red alerts for a filtered CSD data set with 130 000 structures was reduced by a factor of 4 compared to the predecessor. This is a clear advantage in 3D virtual screening, where hits should only be removed by a conformational filter if they are in energetically inaccessible conformations.
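A traffic light scheme based on angles and tolerance intervals, as described above, could look like the following minimal sketch. The rule layout (a peak angle with an inner and an outer tolerance), the example values, and the function names are assumptions for illustration; they are not taken from the Torsion Library itself, and SMARTS matching is left out entirely.

```python
def angular_distance(a, b):
    """Smallest absolute difference between two angles in degrees (result in 0..180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def classify_torsion(angle, peaks):
    """Classify a torsion angle as 'green', 'yellow' or 'red'.

    peaks is a list of (peak_angle, inner_tolerance, outer_tolerance) in degrees:
    within the inner tolerance of any peak is green, within the outer tolerance
    is yellow, anything else is red. This rule layout is hypothetical.
    """
    color = "red"
    for peak, inner, outer in peaks:
        dist = angular_distance(angle, peak)
        if dist <= inner:
            return "green"
        if dist <= outer:
            color = "yellow"
    return color


# Hypothetical rule for a bond that prefers planar arrangements.
example_peaks = [(0.0, 20.0, 30.0), (180.0, 20.0, 30.0)]
```

With `example_peaks`, an angle of 10° is classified green, 25° yellow, and 90° red; only red conformations would be candidates for removal by a conformational filter.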

    Estimating Electron Density Support for Individual Atoms and Molecular Fragments in X‑ray Structures

    Macromolecular structures resolved by X-ray crystallography are essential for life science research. While some methods exist to automatically quantify the quality of the electron density fit, none of them is without flaws. In particular, the question of how well individual parts like atoms, small fragments, or molecules are supported by electron density is difficult to quantify. While existing methods take experimental uncertainties correctly into account, they do not indicate how reliable an individual atom position is. A rapid quantification of this atomic position reliability would be highly valuable in structure-based molecular design. To overcome this limitation, we introduce the electron density score EDIA for individual atoms and molecular fragments. EDIA assesses rapidly, automatically, and intuitively the fit of individual as well as multiple atoms (EDIAm) into electron density, accompanied by an integrated error analysis. The computation is based on the standard 2fo – fc electron density map in combination with the model of the molecular structure. For evaluating partial structures, EDIAm shows significant advantages compared to the real-space R correlation coefficient (RSCC) and the real-space difference density Z score (RSZD) from the molecular modeler’s point of view. Thus, EDIA eliminates the time-consuming step of visually inspecting the electron density during structure selection and curation. It supports daily modeling tasks of medicinal and computational chemists and enables a fully automated assembly of large-scale, high-quality structure data sets. Furthermore, EDIA scores can be applied for model validation and method development in computer-aided molecular design. In contrast to measuring the deviation from the structure model by root-mean-square deviation, EDIA scores allow comparison to the underlying experimental data, taking its uncertainty into account.

    High-Quality Dataset of Protein-Bound Ligand Conformations and Its Application to Benchmarking Conformer Ensemble Generators

    We developed a cheminformatics pipeline for the fully automated selection and extraction of high-quality protein-bound ligand conformations from X-ray structural data. The pipeline evaluates the validity and accuracy of the 3D structures of small molecules according to multiple criteria, including their fit to the electron density and their physicochemical and structural properties. Using this approach, we compiled two high-quality datasets from the Protein Data Bank (PDB): a comprehensive dataset and a diversified subset of 4626 and 2912 structures, respectively. The datasets were applied to benchmarking seven freely available conformer ensemble generators: Balloon (two different algorithms), the RDKit standard conformer ensemble generator, the Experimental-Torsion basic Knowledge Distance Geometry (ETKDG) algorithm, Confab, Frog2 and Multiconf-DOCK. Substantial differences in the performance of the individual algorithms were observed, with RDKit and ETKDG generally achieving a favorable balance of accuracy, ensemble size and runtime. The Platinum datasets are available for download from http://www.zbh.uni-hamburg.de/platinum_dataset

    Large-Scale Analysis of Hydrogen Bond Interaction Patterns in Protein–Ligand Interfaces

    Protein–ligand interactions are the fundamental basis for molecular design in pharmaceutical research, biocatalysis, and agrochemical development. Hydrogen bonds in particular are known to have special geometric requirements and therefore deserve a detailed analysis. In modeling approaches, a more general description of hydrogen bond geometries, using distance and directionality, is applied. A first study of their geometries was performed based on 15 protein structures in 1982. Currently there are about 95 000 protein–ligand structures available in the PDB, providing a solid foundation for a new large-scale statistical analysis. Here, we report a comprehensive investigation of geometric and functional properties of hydrogen bonds. Out of 22 defined functional groups, eight are fully in accordance with theoretical predictions while 14 show variations from expected values. On the basis of these results, we derived interaction geometries to improve current computational models. It is expected that these observations will be useful in designing new chemical structures for biological applications.
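A description of hydrogen bonds by distance and directionality, as mentioned above, can be sketched with a donor–acceptor distance and a donor–hydrogen–acceptor angle. The cutoffs used here (3.5 Å and 120°) are illustrative assumptions, not values reported in the study, and the function names are hypothetical.

```python
import math


def angle_deg(a, b, c):
    """Angle in degrees at point b, formed by the points a-b-c."""
    v1 = [x - y for x, y in zip(a, b)]
    v2 = [x - y for x, y in zip(c, b)]
    dot = sum(p * q for p, q in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos))


def is_hydrogen_bond(donor, hydrogen, acceptor, max_dist=3.5, min_angle=120.0):
    """Simple geometric test: short donor-acceptor distance and a roughly linear D-H...A angle.

    max_dist (in Å) and min_angle (in degrees) are assumed, illustrative cutoffs.
    """
    close_enough = math.dist(donor, acceptor) <= max_dist
    directional = angle_deg(donor, hydrogen, acceptor) >= min_angle
    return close_enough and directional
```

A statistical analysis of the kind described would collect these distances and angles per functional group and compare their distributions with the expected geometry, rather than apply a single fixed cutoff.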
