33,486 research outputs found
Electrostatic-field and surface-shape similarity for virtual screening and pose prediction.
We introduce a new method for rapid computation of 3D molecular similarity that combines electrostatic field comparison with comparison of molecular surface-shape and directional hydrogen-bonding preferences (called "eSim"). Rather than employing heuristic "colors" or user-defined molecular feature types to represent conformation-dependent molecular electrostatics, eSim calculates the similarity of the electrostatic fields of two molecules (in addition to shape and hydrogen-bonding). We present detailed virtual screening performance data on the standard 102-target DUD-E set. In its moderately fast screening mode, eSim running on a single computing core is capable of processing over 60 molecules per second. In this mode, eSim performed significantly better than all alternate methods for which full DUD-E data were available (mean ROC area of 0.74, p [Formula: see text], by paired t-test, compared with the best performing alternate method). In addition, for 92 targets of the DUD-E set where multiple ligand-bound crystal structures were available, screening performance was assessed using alternate ligands or sets thereof (in their bound poses) as similarity targets. Using the joint alignment of five ligands for each protein target, mean ROC area exceeded 0.82 for the 92 targets. Design-focused application of ligand similarity methods depends on accurate predictions of geometric molecular relationships. We comprehensively assessed pose prediction accuracy by curating nearly 400,000 bound ligand pose pairs across the DUD-E targets. Overall, beginning from agnostic initial poses, we observed an 80% success rate for RMSD [Formula: see text] Å among the top 20 predicted eSim poses. These examples were split roughly 50/50 into cases with high direct atomic overlap (where a shared scaffold exists between a pair) and low direct atomic overlap (where a ligand pair has dissimilar scaffolds but largely occupies the same space). Within the high direct atomic overlap subset, the pose prediction success rate was 93%. For the more challenging subset (where dissimilar scaffolds are to be aligned), the success rate was 70%. The eSim approach enables both large-scale screening and rational design of ligands and is rooted in physically meaningful, non-heuristic, molecular comparisons.
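The ROC areas quoted in this abstract can be illustrated with a small sketch. It assumes each target has one list of similarity scores for known actives and one for decoys, and it computes the area as the fraction of active/decoy pairs in which the active scores higher. The function names and data layout are hypothetical and are not taken from eSim.

```python
def roc_auc(active_scores, decoy_scores):
    """ROC area as the chance that a random active outscores a random decoy.

    Ties count as half a win. Inputs are plain lists of similarity scores
    (hypothetical layout, not an eSim output format).
    """
    if not active_scores or not decoy_scores:
        raise ValueError("need at least one active and one decoy")
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(active_scores) * len(decoy_scores))


def mean_roc_auc(per_target):
    """Average ROC area over targets; per_target maps name -> (actives, decoys)."""
    areas = [roc_auc(actives, decoys) for actives, decoys in per_target.values()]
    return sum(areas) / len(areas)
```

The pairwise loop is quadratic in the number of molecules, which is fine for illustration; a rank-based formulation would be the usual choice for full screening sets.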
Coarse-Graining Auto-Encoders for Molecular Dynamics
Molecular dynamics simulations provide theoretical insight into the
microscopic behavior of materials in condensed phase and, as a predictive tool,
enable computational design of new compounds. However, because of the large
temporal and spatial scales involved in thermodynamic and kinetic phenomena in
materials, atomistic simulations are often computationally unfeasible.
Coarse-graining methods allow simulating larger systems, by reducing the
dimensionality of the simulation, and propagating longer timesteps, by
averaging out fast motions. Coarse-graining involves two coupled learning
problems: defining the mapping from an all-atom to a reduced representation,
and parametrizing a Hamiltonian over coarse-grained coordinates.
Multiple statistical mechanics approaches have addressed the latter, but the
former is generally a hand-tuned process based on chemical intuition. Here we
present Autograin, an optimization framework based on auto-encoders to learn
both tasks simultaneously. Autograin is trained to learn the optimal mapping
between all-atom and reduced representation, using the reconstruction loss to
facilitate the learning of coarse-grained variables. In addition, a
force-matching method is applied to variationally determine the coarse-grained
potential energy function. This procedure is tested on a number of model
systems including single-molecule and bulk-phase periodic simulations.
Comment: 8 pages, 6 figures
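The two coupled problems named in this abstract (a mapping to coarse-grained coordinates and a force-matched potential) can be sketched with a fixed, hand-chosen mapping. This is a simplified illustration, not Autograin itself: Autograin learns the mapping with an auto-encoder, whereas the sketch assumes a given atom-to-bead assignment in which every bead has at least one atom. All function names are hypothetical.

```python
def map_to_beads(positions, masses, assignment, n_beads):
    """Mass-weighted centre of each bead for a fixed atom-to-bead assignment."""
    totals = [0.0] * n_beads
    sums = [[0.0, 0.0, 0.0] for _ in range(n_beads)]
    for pos, mass, bead in zip(positions, masses, assignment):
        totals[bead] += mass
        for k in range(3):
            sums[bead][k] += mass * pos[k]
    return [[s / totals[b] for s in sums[b]] for b in range(n_beads)]


def map_forces(forces, assignment, n_beads):
    """Net atomistic force on each bead (sum over its member atoms)."""
    mapped = [[0.0, 0.0, 0.0] for _ in range(n_beads)]
    for force, bead in zip(forces, assignment):
        for k in range(3):
            mapped[bead][k] += force[k]
    return mapped


def force_matching_loss(mapped_forces, model_forces):
    """Mean squared difference between mapped and model forces, per component."""
    total = 0.0
    count = 0
    for f_ref, f_model in zip(mapped_forces, model_forces):
        for a, b in zip(f_ref, f_model):
            total += (a - b) ** 2
            count += 1
    return total / count
```

In a force-matching fit, the parameters of the coarse-grained potential would be adjusted to minimise this loss over many simulation frames.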
Accelerated X-ray Structure Elucidation of a 36 kDa Muramidase/Transglycosylase Using wARP
The X-ray structure of the 36 kDa soluble lytic transglycosylase from Escherichia coli has been determined starting with the multiple isomorphous replacement method with inclusion of anomalous scattering at 2.7 Å resolution. Subsequently, before any model building was carried out, phases were extended to 1.7 Å resolution with the weighted automated refinement procedure wARP, which gave a dramatic improvement in the phases. The electron-density maps from wARP were of outstanding quality for both the main chain and the side chains of the protein, which allowed the time spent on the tracing, interpretation and building of the X-ray structure to be substantially shortened. The structure of the soluble lytic transglycosylase was refined at 1.7 Å resolution with X-PLOR to a final crystallographic R factor of 18.9%. Analysis of the wARP procedure revealed that the use of the maximum-likelihood refinement in wARP gave much better phases than least-squares refinement, provided that the ratio of reflections to protein atom parameters was approximately 1.8 or higher. Furthermore, setting aside 5% of the data for an Rfree test set had a negative effect on the phase improvement. The mean WwARP, a weight determined at the end of the wARP procedure and based on the variance of structure factors from six individually refined wARP models, proved to be a better indicator than the Rfree factor to judge different phase improvement protocols. The elongated Slt35 structure has three domains named the alpha, beta and core domains. The alpha domain contains mainly α-helices, while the beta domain consists of a five-stranded antiparallel β-sheet flanked by a short α-helix. Sandwiched between the alpha and beta domains is the core domain, which bears some resemblance to the fold of the catalytic domain of the previously elucidated 70 kDa soluble lytic transglycosylase from E. coli. The putative active site is at the bottom of a large deep groove in the core domain.
Complex macrocycle exploration: parallel, heuristic, and constraint-based conformer generation using ForceGen.
ForceGen is a template-free, non-stochastic approach for 2D to 3D structure generation and conformational elaboration for small molecules, including both non-macrocycles and macrocycles. For conformational search of non-macrocycles, ForceGen is both faster and more accurate than the best of all tested methods on a very large, independently curated benchmark of 2859 PDB ligands. In this study, the primary results are on macrocycles, including results for 431 unique examples from four separate benchmarks. These include complex peptide and peptide-like cases that can form networks of internal hydrogen bonds. By making use of new physical movements ("flips" of near-linear sub-cycles and explicit formation of hydrogen bonds), ForceGen exhibited statistically significantly better performance for overall RMS deviation from experimental coordinates than all other approaches. The algorithmic approach offers natural parallelization across multiple computing cores. On a modest multi-core workstation, for all but the most complex macrocycles, median wall-clock times were generally under a minute in fast search mode and under 2 min using thorough search. On the most complex cases (roughly cyclic decapeptides and larger) explicit exploration of likely hydrogen bonding networks yielded marked improvements, but with calculation times increasing to several minutes and in some cases to roughly an hour for fast search. In complex cases, utilization of NMR data to constrain conformational search produces accurate conformational ensembles representative of solution-state macrocycle behavior. On macrocycles of typical complexity (up to 21 rotatable macrocyclic and exocyclic bonds), design-focused macrocycle optimization can be practically supported by computational chemistry at interactive time-scales, with conformational ensemble accuracy equaling what is seen with non-macrocyclic ligands. For more complex macrocycles, inclusion of sparse biophysical data is a helpful adjunct to computation.
Structure identification methods for atomistic simulations of crystalline materials
We discuss existing and new computational analysis techniques to classify
local atomic arrangements in large-scale atomistic computer simulations of
crystalline solids. This article includes a performance comparison of typical
analysis algorithms such as Common Neighbor Analysis, Centrosymmetry Analysis,
Bond Angle Analysis, Bond Order Analysis, and Voronoi Analysis. In addition we
propose a simple extension to the Common Neighbor Analysis method that makes it
suitable for multi-phase systems. Finally, we introduce a new structure
identification algorithm, the Neighbor Distance Analysis, that is designed to
identify atomic structure units in grain boundaries.
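Of the algorithms listed in this abstract, centrosymmetry analysis is the simplest to sketch. The version below assumes the neighbor list has already been built and is passed in as vectors from the central atom to an even number of nearest neighbors. It uses a common simplification: all neighbor pairs are scored by |r_i + r_j|^2 and the N/2 smallest values are summed, without enforcing that each neighbor is used only once. The function name is hypothetical.

```python
from itertools import combinations


def centrosymmetry(neighbor_vectors):
    """Centrosymmetry parameter from vectors to an even number of neighbors.

    Returns 0 for a perfectly centrosymmetric environment (every neighbor
    has an exact opposite) and grows as the environment is distorted.
    """
    n = len(neighbor_vectors)
    if n == 0 or n % 2 != 0:
        raise ValueError("expected a non-empty, even number of neighbors")
    # Squared length of r_i + r_j for every pair of neighbors.
    pair_sums = sorted(
        sum((a + b) ** 2 for a, b in zip(u, v))
        for u, v in combinations(neighbor_vectors, 2)
    )
    # Opposite pairs give the smallest values; keep the N/2 smallest.
    return sum(pair_sums[: n // 2])
```

For an fcc or bcc lattice site the usual neighbor counts are 12 and 8 respectively; values well above zero then flag defects or surfaces.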
- …