23 research outputs found

    Non-equilibrium approach for binding free energies in cyclodextrins in SAMPL7: force fields and software

    In the current work we report on our participation in the SAMPL7 challenge, calculating absolute binding free energies of host–guest systems in which 2 guest molecules were probed against 9 hosts: cyclodextrin and its derivatives. Our submission was based on a non-equilibrium free energy calculation protocol that used an averaged consensus result from two force fields (GAFF and CGenFF). The submitted prediction achieved an accuracy of 1.38 kcal/mol in terms of the unsigned error averaged over the whole dataset. We further report on the underlying reasons for discrepancies between our calculations and another submission to the SAMPL7 challenge, which employed a similar methodology but different ligand and water force fields. As a result, we uncovered a number of issues in the dihedral parameter definition of the GAFF 2 force field. In addition, we identified particular cases in the molecular topologies where different software packages interpreted the same force field differently. This latter observation might be of particular relevance for systematic comparisons of molecular simulation software packages. These factors influence the final free energy estimates and need to be considered when performing alchemical calculations.
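The consensus averaging and the unsigned-error metric mentioned in this abstract can be illustrated with a minimal sketch. The function names, system labels, and all numerical values below are hypothetical and chosen only for illustration; they are not taken from the SAMPL7 submission.

```python
from statistics import mean


def consensus(dg_a, dg_b):
    """Average per-system free energy estimates from two force fields.

    Only systems present in both dictionaries are averaged.
    """
    return {k: 0.5 * (dg_a[k] + dg_b[k]) for k in dg_a.keys() & dg_b.keys()}


def mean_unsigned_error(pred, expt):
    """Mean absolute deviation between predictions and experiment."""
    common = pred.keys() & expt.keys()
    return mean(abs(pred[k] - expt[k]) for k in common)


# Hypothetical binding free energies in kcal/mol (illustrative only).
gaff = {"host1-guest1": -3.0, "host1-guest2": -5.0}
cgenff = {"host1-guest1": -4.0, "host1-guest2": -4.0}
expt = {"host1-guest1": -3.0, "host1-guest2": -5.5}

cons = consensus(gaff, cgenff)
mue = mean_unsigned_error(cons, expt)
print(f"consensus MUE = {mue:.2f} kcal/mol")
```

Averaging two force fields in this way can cancel errors of opposite sign, but it cannot correct a bias that both force fields share.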

    New molecular simulation methods for quantitative modelling of protein-ligand interactions

    The main theme of this work is the design and development of new molecular simulation protocols to achieve more accurate and reliable estimates of free energy changes for processes relevant to structure-based drug design. The work starts with an insight into the reproducibility problem for alchemical free energy calculations. Even if simulations are run with similar input files, the use of different simulation engines could give different free energy results. As part of a collaborative effort, the implementation details of the AMBER, GROMACS, SOMD and CHARMM simulation codes were studied, and free energy protocols for each software package were validated to converge towards a reproducibility limit of about 0.20 kcal/mol for hydration free energies of small organic molecules. Next, new simulation methods for the estimation of lipophilicity coefficients (log P and log D) for drug-like molecules were developed and validated. log P values were computed for a dataset of 5 molecules with increasing fluorination level. Predictions were in line with the experimental measurements, and the simulations also allowed new insights into the water-solute interactions that drive the partitioning process. Then, as part of the SAMPL5 challenge, log D values for 53 drug-like molecules were computed. In this context, two different simulation models were derived in order to take into account the presence of protonated species. The results were encouraging but also highlighted limits in alchemical free energy modelling. As an additional task of the SAMPL5 contest, three different protocols were validated for predicting absolute binding affinities for 22 host-guest systems. The first model yielded a free energy of binding based on free energy changes in the solvated and complex phases; the second added the long-range dispersion correction to the previous model; the third used a standard state correction term. All three protocols were among the top-ranked submissions in SAMPL5, with a correlation coefficient R² of about 0.7 against experimental data. Finally, the origins and magnitude of the finite size artefacts in alchemical free energy calculations were investigated. Finite size artefacts are especially pronounced in calculations that involve changes in the net charge of a solute. A new correction scheme was devised for the Barker-Watts Reaction Field approach and compared with the literature. Hydration free energy calculations on simple ionic species were carried out to validate the consistency of the scheme, and the approach was further extended to host-guest binding affinity predictions.

    Protein-Ligand Complex Generator & Drug Screening via Tiered Tensor Transform

    The generation of binding poses for a small-molecule candidate (ligand) in its target protein pocket is important for computer-aided drug discovery. Typical rigid-body docking methods ignore the flexibility of the protein pocket, while the more accurate pose generation using molecular dynamics is hindered by slow protein dynamics. We develop a tiered tensor transform (3T) algorithm to rapidly generate diverse protein-ligand complex conformations for both pose and affinity estimation in drug screening. The algorithm requires neither machine learning training nor lengthy dynamics computation, while maintaining both coarse-grain-like coordinated protein dynamics and atomistic-level details of the complex pocket. The 3T conformation structures we generate achieve significantly higher accuracy in active ligand classification than traditional ensemble docking using hundreds of experimental protein conformations. Furthermore, we demonstrate that 3T can be used to explore distant protein-ligand binding poses within the protein pocket. The 3T structure transformation is decoupled from the system physics, making future use in other computational scientific domains possible.

    Sharing data from molecular simulations

    Given the need for modern researchers to produce open, reproducible scientific output, the lack of standards and best practices for sharing data and workflows used to produce and analyze molecular dynamics (MD) simulations has become an important issue in the field. There are now multiple well-established packages to perform molecular dynamics simulations, often highly tuned for exploiting specific classes of hardware, each with strong communities surrounding them, but with very limited interoperability/transferability options. Thus, the choice of the software package often dictates the workflow for both simulation production and analysis. The level of detail in documenting the workflows and analysis code varies greatly in published work, hindering reproducibility of the reported results and the ability of other researchers to build on these studies. An increasing number of researchers are motivated to make their data available, but many challenges remain in sharing and reusing simulation data effectively. To discuss these and other issues related to best practices in the field in general, we organized a workshop in November 2018 (https://bioexcel.eu/events/workshop-on-sharing-data-from-molecular-simulations/). Here, we present a brief overview of this workshop and the topics discussed. We hope this effort will spark further conversation in the MD community to pave the way toward more open, interoperable, and reproducible outputs from research studies using MD simulations.

    Reproducibility of Free Energy Calculations Across Different Molecular Simulation Software

    Alchemical free energy calculations are an increasingly important modern simulation technique. Contemporary molecular simulation software such as AMBER, CHARMM, GROMACS and SOMD include support for the method. Implementation details vary among those codes but users expect reliability and reproducibility, i.e. for a given molecular model and set of forcefield parameters, comparable free energy should be obtained within statistical bounds regardless of the code used. Relative alchemical free energy (RAFE) simulation is increasingly used to support molecule discovery projects, yet the reproducibility of the methodology has been less well tested than its absolute counterpart. Here we present RAFE calculations of hydration free energies for a set of small organic molecules and demonstrate that free energies can be reproduced to within about 0.2 kcal/mol with aforementioned codes. Achieving this level of reproducibility requires considerable attention to detail and package-specific simulation protocols, and no universally applicable protocol emerges. The benchmarks and protocols reported here should be useful for the community to validate new and future versions of software for free energy calculations.
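One simple way to express the reproducibility criterion described in this abstract is a pairwise comparison of free energy estimates across codes. The sketch below is an assumption-laden illustration, not the authors' analysis: the function names, the 0.2 kcal/mol default tolerance taken from the abstract's headline figure, and all numerical values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pairwise_discrepancies(results):
    """Compare every pair of codes.

    results maps a code name to (free energy, standard error), both in
    kcal/mol. Returns tuples of (code_a, code_b, |difference|, combined error),
    where the combined error adds the two standard errors in quadrature.
    """
    out = []
    for (a, (ga, ea)), (b, (gb, eb)) in combinations(sorted(results.items()), 2):
        out.append((a, b, abs(ga - gb), sqrt(ea**2 + eb**2)))
    return out


def reproducible(results, tol=0.2):
    """True if all pairwise differences are within tol (kcal/mol)."""
    return all(diff <= tol for _, _, diff, _ in pairwise_discrepancies(results))


# Hypothetical relative hydration free energies (illustrative values only).
results = {
    "AMBER": (-2.10, 0.05),
    "CHARMM": (-2.05, 0.05),
    "GROMACS": (-2.00, 0.04),
    "SOMD": (-2.15, 0.06),
}

for a, b, diff, err in pairwise_discrepancies(results):
    print(f"{a:8s} vs {b:8s}: |diff| = {diff:.2f} +/- {err:.2f} kcal/mol")
print("reproducible within 0.2 kcal/mol:", reproducible(results))
```

A fixed tolerance is the simplest criterion. Comparing each difference against its combined statistical error, which the function also returns, would be a stricter alternative.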

    Systematic Finite-Temperature Reduction of Crystal Energy Landscapes

    Crystal structure prediction methods are prone to overestimate the number of potential polymorphs of organic molecules. In this work, we aim to reduce the overprediction by systematically applying molecular dynamics simulations and biased sampling methods to cluster subsets of structures that can easily interconvert at finite temperature and pressure. Following this approach, we rationally reduce the number of predicted putative polymorphs in crystal structure prediction (CSP)-generated crystal energy landscapes. This uses an unsupervised clustering approach to analyze independent finite-temperature molecular dynamics trajectories and hence identify a representative structure of each cluster of distinct lattice energy minima that are effectively equivalent at finite temperature and pressure. Biased simulations are used to reduce the impact of limited sampling time and to estimate the work associated with polymorphic transformations. We demonstrate the proposed systematic approach by studying the polymorphs of urea and succinic acid, reducing an initial set of over 100 energetically plausible CSP structures to 12 and 27 respectively, including the experimentally known polymorphs. The simulations also indicate the types of disorder and stacking errors that may occur in real structures

    The structural role of SARS-CoV-2 genetic background in the emergence and success of spike mutations: The case of the spike A222V mutation

    The S:A222V point mutation, within the G clade, was characteristic of the 20E (EU1) SARS-CoV-2 variant identified in Spain in early summer 2020. This mutation has since reappeared in the Delta subvariant AY.4.2, raising questions about its specific effect on viral infection. We report combined serological, functional, structural and computational studies characterizing the impact of this mutation. Our results reveal that S:A222V promotes an increased RBD opening and slightly increases ACE2 binding as compared to the parent S:D614G clade. Finally, S:A222V does not reduce sera neutralization capacity, suggesting it does not affect vaccine effectiveness

    Machine Learning-Driven Multiscale Modeling: Bridging the Scales with a Next-Generation Simulation Infrastructure

    Interdependence across time and length scales is common in biology, where atomic interactions can impact larger-scale phenomena. Such dependence is especially true for a well-known cancer signaling pathway, where the membrane-bound RAS protein binds an effector protein called RAF. To capture the driving forces that bring RAS and RAF (represented as two domains, RBD and CRD) together on the plasma membrane, simulations that can resolve atomic detail while reaching long time scales and large length scales are needed. The Multiscale Machine-Learned Modeling Infrastructure (MuMMI) is able to resolve RAS/RAF protein-membrane interactions that identify specific lipid-protein fingerprints that enhance protein orientations viable for effector binding. MuMMI is a fully automated, ensemble-based multiscale approach connecting three resolution scales: (1) the coarsest scale is a continuum model able to simulate milliseconds of time for a 1 μm² membrane, (2) the middle scale is a coarse-grained (CG) Martini bead model to explore protein-lipid interactions, and (3) the finest scale is an all-atom (AA) model capturing specific interactions between lipids and proteins. MuMMI dynamically couples adjacent scales in a pairwise manner using machine learning (ML). The dynamic coupling allows for better sampling of the refined scale from the adjacent coarse scale (forward) and on-the-fly feedback to improve the fidelity of the coarser scale from the adjacent refined scale (backward). MuMMI operates efficiently at any scale, from a few compute nodes to the largest supercomputers in the world, and is generalizable to simulate different systems. As computing resources continue to increase and multiscale methods continue to advance, fully automated multiscale simulations (like MuMMI) will be commonly used to address complex science questions.