33,486 research outputs found

Recommended from our members

Electrostatic-field and surface-shape similarity for virtual screening and pose prediction.

Author: Cleves Ann E
Jain Ajay N
Johnson Stephen R
Publication venue: eScholarship, University of California
Publication date: 01/10/2019

We introduce a new method for rapid computation of 3D molecular similarity that combines electrostatic field comparison with comparison of molecular surface-shape and directional hydrogen-bonding preferences (called "eSim"). Rather than employing heuristic "colors" or user-defined molecular feature types to represent conformation-dependent molecular electrostatics, eSim calculates the similarity of the electrostatic fields of two molecules (in addition to shape and hydrogen-bonding). We present detailed virtual screening performance data on the standard 102 target DUD-E set. In its moderately fast screening mode, eSim running on a single computing core is capable of processing over 60 molecules per second. In this mode, eSim performed significantly better than all alternate methods for which full DUD-E data were available (mean ROC area of 0.74, p [Formula: see text], by paired t-test, compared with the best performing alternate method). In addition, for 92 targets of the DUD-E set where multiple ligand-bound crystal structures were available, screening performance was assessed using alternate ligands or sets thereof (in their bound poses) as similarity targets. Using the joint alignment of five ligands for each protein target, mean ROC area exceeded 0.82 for the 92 targets. Design-focused application of ligand similarity methods depends on accurate predictions of geometric molecular relationships. We comprehensively assessed pose prediction accuracy by curating nearly 400,000 bound ligand pose pairs across the DUD-E targets. Overall, beginning from agnostic initial poses, we observed an 80% success rate for RMSD [Formula: see text] Å among the top 20 predicted eSim poses. These examples were split roughly 50/50 into cases with high direct atomic overlap (where a shared scaffold exists between a pair) and low direct atomic overlap (where a ligand pair has dissimilar scaffolds but largely occupies the same space). Within the high direct atomic overlap subset, the pose prediction success rate was 93%. For the more challenging subset (where dissimilar scaffolds are to be aligned), the success rate was 70%. The eSim approach enables both large-scale screening and rational design of ligands and is rooted in physically meaningful, non-heuristic, molecular comparisons.
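
The ROC area used as the screening metric in this abstract can be illustrated with a minimal sketch. It assumes each target supplies one list of similarity scores for known actives and one for decoys; the function names `roc_auc` and `mean_roc_auc` are hypothetical and are not part of eSim.

```python
def roc_auc(active_scores, decoy_scores):
    """ROC area as the probability that a randomly chosen active
    outscores a randomly chosen decoy (ties count as one half)."""
    if not active_scores or not decoy_scores:
        raise ValueError("need at least one active and one decoy")
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(active_scores) * len(decoy_scores))


def mean_roc_auc(per_target):
    """Average the ROC area over targets.

    per_target is a list of (active_scores, decoy_scores) pairs,
    one pair per protein target (an assumed data layout).
    """
    return sum(roc_auc(a, d) for a, d in per_target) / len(per_target)
```

The pairwise loop is quadratic in the number of molecules, which is fine for illustration; a rank-based formulation would be the usual choice for full screening libraries.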

eScholarship - University of California

Coarse-Graining Auto-Encoders for Molecular Dynamics

Author: Gómez-Bombarelli Rafael
Wang Wujie
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 27/03/2019

Molecular dynamics simulations provide theoretical insight into the microscopic behavior of materials in condensed phase and, as a predictive tool, enable computational design of new compounds. However, because of the large temporal and spatial scales involved in thermodynamic and kinetic phenomena in materials, atomistic simulations are often computationally unfeasible. Coarse-graining methods allow simulating larger systems, by reducing the dimensionality of the simulation, and propagating longer timesteps, by averaging out fast motions. Coarse-graining involves two coupled learning problems: defining the mapping from an all-atom to a reduced representation, and the parametrization of a Hamiltonian over coarse-grained coordinates. Multiple statistical mechanics approaches have addressed the latter, but the former is generally a hand-tuned process based on chemical intuition. Here we present Autograin, an optimization framework based on auto-encoders to learn both tasks simultaneously. Autograin is trained to learn the optimal mapping between all-atom and reduced representation, using the reconstruction loss to facilitate the learning of coarse-grained variables. In addition, a force-matching method is applied to variationally determine the coarse-grained potential energy function. This procedure is tested on a number of model systems including single-molecule and bulk-phase periodic simulations.

Comment: 8 pages, 6 figures
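
The two coupled problems named in this abstract, the atom-to-bead mapping and force matching, can be sketched as follows. This is not Autograin: the mapping here is a fixed, hand-assigned center-of-mass mapping (the kind the abstract says is usually hand-tuned), the bead force is assumed to be the plain sum of its atoms' forces, and all function names are hypothetical.

```python
def center_of_mass_map(positions, masses, assignment, n_beads):
    """Map atom positions to bead positions via mass-weighted averages.

    assignment[i] is the bead index of atom i (a fixed, assumed mapping).
    """
    beads = [[0.0, 0.0, 0.0] for _ in range(n_beads)]
    total_mass = [0.0] * n_beads
    for pos, m, b in zip(positions, masses, assignment):
        total_mass[b] += m
        for k in range(3):
            beads[b][k] += m * pos[k]
    return [[c / total_mass[b] for c in beads[b]] for b in range(n_beads)]


def summed_forces(forces, assignment, n_beads):
    """Mapped force on each bead, assumed to be the sum of its atoms' forces."""
    mapped = [[0.0, 0.0, 0.0] for _ in range(n_beads)]
    for f, b in zip(forces, assignment):
        for k in range(3):
            mapped[b][k] += f[k]
    return mapped


def force_matching_loss(mapped_forces, predicted_forces):
    """Mean squared difference between mapped all-atom forces and the
    forces predicted by a candidate coarse-grained potential."""
    sq, n = 0.0, 0
    for f, g in zip(mapped_forces, predicted_forces):
        for k in range(3):
            sq += (f[k] - g[k]) ** 2
            n += 1
    return sq / n
```

Minimizing `force_matching_loss` over the parameters of a coarse-grained potential is the variational step; in an auto-encoder setting the assignment itself would be learned rather than supplied.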

arXiv.org e-Print Archive

DSpace@MIT

Accelerated X-ray Structure Elucidation of a 36 kDa Muramidase/Transglycosylase Using wARP

Author: Asselt Erik J. van
Dijkstra Bauke W.
Kalk Kor H.
Lamzin Victor S.
Perrakis Anastassis
Publication venue
Publication date: 01/01/1998

The X-ray structure of the 36 kDa soluble lytic transglycosylase from Escherichia coli has been determined starting with the multiple isomorphous replacement method with inclusion of anomalous scattering at 2.7 Å resolution. Subsequently, before any model building was carried out, phases were extended to 1.7 Å resolution with the weighted automated refinement procedure wARP, which gave a dramatic improvement in the phases. The electron-density maps from wARP were of outstanding quality for both the main chain and the side chains of the protein, which allowed the time spent on the tracing, interpretation and building of the X-ray structure to be substantially shortened. The structure of the soluble lytic transglycosylase was refined at 1.7 Å resolution with X-PLOR to a final crystallographic R factor of 18.9%. Analysis of the wARP procedure revealed that the use of the maximum-likelihood refinement in wARP gave much better phases than least-squares refinement, provided that the ratio of reflections to protein atom parameters was approximately 1.8 or higher. Furthermore, setting aside 5% of the data for an Rfree test set had a negative effect on the phase improvement. The mean WwARP, a weight determined at the end of the wARP procedure and based on the variance of structure factors from six individually refined wARP models, proved to be a better indicator than the Rfree factor to judge different phase improvement protocols. The elongated Slt35 structure has three domains named the alpha, beta and core domains. The alpha domain contains mainly α-helices, while the beta domain consists of a five-stranded antiparallel β-sheet flanked by a short α-helix. Sandwiched between the alpha and beta domains is the core domain, which bears some resemblance to the fold of the catalytic domain of the previously elucidated 70 kDa soluble lytic transglycosylase from E. coli. The putative active site is at the bottom of a large deep groove in the core domain.
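
The crystallographic R factor and the Rfree test set mentioned in this abstract can be illustrated with a minimal sketch. It uses the standard definition R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs| and assumes the calculated amplitudes are already on the same scale as the observed ones. The deterministic every-20th split is an assumption for illustration only (real programs typically select the test set at random), and the function names are hypothetical.

```python
def r_factor(f_obs, f_calc):
    """Crystallographic R factor for paired structure-factor amplitudes.

    Assumes f_calc has already been scaled to f_obs.
    """
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def split_work_free(reflections, every=20):
    """Set aside one reflection in `every` (5% for every=20) as a test set.

    The working set is used for refinement; the test set is only used
    to compute Rfree with r_factor().
    """
    free = reflections[::every]
    work = [r for i, r in enumerate(reflections) if i % every != 0]
    return work, free
```

Computing `r_factor` on the working set gives the conventional R, and on the held-out set gives Rfree; the abstract's observation is that withholding those reflections reduced the data available for phase improvement.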

DESY Publication Database

Crossref

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

DESY

University of Groningen Digital Archive

Dissertations of the University of Groningen

Complex macrocycle exploration: parallel, heuristic, and constraint-based conformer generation using ForceGen.

Author: Cleves Ann E
Gao Qi
Jain Ajay N
Liu Yizhou
Reibarkh Mikhail Y
Sherer Edward C
Wang Xiao
Publication venue: eScholarship, University of California
Publication date: 01/06/2019

ForceGen is a template-free, non-stochastic approach for 2D to 3D structure generation and conformational elaboration for small molecules, including both non-macrocycles and macrocycles. For conformational search of non-macrocycles, ForceGen is both faster and more accurate than the best of all tested methods on a very large, independently curated benchmark of 2859 PDB ligands. In this study, the primary results are on macrocycles, including results for 431 unique examples from four separate benchmarks. These include complex peptide and peptide-like cases that can form networks of internal hydrogen bonds. By making use of new physical movements ("flips" of near-linear sub-cycles and explicit formation of hydrogen bonds), ForceGen exhibited statistically significantly better performance for overall RMS deviation from experimental coordinates than all other approaches. The algorithmic approach offers natural parallelization across multiple computing cores. On a modest multi-core workstation, for all but the most complex macrocycles, median wall-clock times were generally under a minute in fast search mode and under 2 min using thorough search. On the most complex cases (roughly cyclic decapeptides and larger) explicit exploration of likely hydrogen bonding networks yielded marked improvements, but with calculation times increasing to several minutes and in some cases to roughly an hour for fast search. In complex cases, utilization of NMR data to constrain conformational search produces accurate conformational ensembles representative of solution state macrocycle behavior. On macrocycles of typical complexity (up to 21 rotatable macrocyclic and exocyclic bonds), design-focused macrocycle optimization can be practically supported by computational chemistry at interactive time-scales, with conformational ensemble accuracy equaling what is seen with non-macrocyclic ligands. For more complex macrocycles, inclusion of sparse biophysical data is a helpful adjunct to computation.
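
The RMS deviation from experimental coordinates used as the accuracy measure in this abstract can be sketched as below. This is a simplified illustration, not ForceGen's evaluation code: it assumes every conformer has already been superimposed on the experimental structure and that atoms are listed in the same order, so no alignment or symmetry handling is performed. The function names are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered,
    pre-aligned coordinate lists (no superposition is done here)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for a, b in zip(coords_a, coords_b):
        total += sum((a[k] - b[k]) ** 2 for k in range(3))
    return math.sqrt(total / len(coords_a))


def best_rmsd(ensemble, experimental):
    """Smallest RMSD between any conformer in the ensemble and the
    experimental coordinates, a common summary of ensemble accuracy."""
    return min(rmsd(conformer, experimental) for conformer in ensemble)
```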

eScholarship - University of California

Structure identification methods for atomistic simulations of crystalline materials

Author: Stukowski Alexander
Publication venue: 'IOP Publishing'
Publication date: 11/06/2012

We discuss existing and new computational analysis techniques to classify local atomic arrangements in large-scale atomistic computer simulations of crystalline solids. This article includes a performance comparison of typical analysis algorithms such as Common Neighbor Analysis, Centrosymmetry Analysis, Bond Angle Analysis, Bond Order Analysis, and Voronoi Analysis. In addition, we propose a simple extension to the Common Neighbor Analysis method that makes it suitable for multi-phase systems. Finally, we introduce a new structure identification algorithm, the Neighbor Distance Analysis, that is designed to identify atomic structure units in grain boundaries.
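
One of the algorithms compared in this abstract, Centrosymmetry Analysis, can be illustrated with a minimal sketch. It assumes the N nearest-neighbor vectors of an atom (relative to that atom) have already been found, and it uses a common simplification: all pairwise sums |r_i + r_j|² are computed and the N/2 smallest are added, rather than enforcing a strict one-to-one pairing of opposite neighbors. The function name is hypothetical.

```python
from itertools import combinations


def centrosymmetry(neighbor_vectors):
    """Centrosymmetry parameter of one atom from its neighbor vectors.

    A perfectly centrosymmetric environment yields 0, because every
    neighbor has a partner at the opposite position; defects and
    surfaces give larger values.
    """
    n = len(neighbor_vectors)
    if n < 2 or n % 2 != 0:
        raise ValueError("expected an even number of neighbor vectors")
    # Squared length of r_i + r_j for every pair of neighbors.
    pair_sums = sorted(
        sum((a[k] + b[k]) ** 2 for k in range(3))
        for a, b in combinations(neighbor_vectors, 2)
    )
    # Keep the N/2 smallest contributions (the best "opposite" pairs).
    return sum(pair_sums[: n // 2])
```

For an fcc lattice one would pass the 12 nearest neighbors, for bcc the 8 nearest; thresholding the result is then a separate, material-dependent choice.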

arXiv.org e-Print Archive

Crossref