86 research outputs found

    Benchmarking coarse-grained models of organic semiconductors via deep backmapping

    The potential of mean force is an effective coarse-grained potential, which is often approximated by pairwise potentials. While the approximated potential reproduces certain distributions of the reference all-atom model with remarkable accuracy, important cross-correlations are typically not captured. In general, the quality of coarse-grained models is evaluated at the coarse-grained resolution, hindering the detection of important discrepancies between the all-atom and coarse-grained ensembles. In this work, the quality of different coarse-grained models is assessed at the atomistic resolution using reverse-mapping strategies. In particular, coarse-grained structures for Tris-Meta-Biphenyl-Triazine are reverse-mapped from two different sources: 1) all-atom configurations projected onto the coarse-grained resolution and 2) snapshots obtained by molecular dynamics simulations based on the coarse-grained force fields. To assess the quality of the coarse-grained models, reverse-mapped structures from both sources are compared, revealing significant discrepancies between the all-atom and the coarse-grained ensembles. Specifically, the reintroduced details enable force computations based on the all-atom force field that yield a clear ranking of the quality of the different coarse-grained models.

    Non-covalent interactions across organic and biological subsets of chemical space: Physics-based potentials parametrized from machine learning

    Classical intermolecular potentials typically require an extensive parametrization procedure for any new compound considered. To do away with prior parametrization, we propose a combination of physics-based potentials with machine learning (ML), coined IPML, which is transferable across small neutral organic and biologically-relevant molecules. ML models provide on-the-fly predictions for environment-dependent local atomic properties: electrostatic multipole coefficients (significant error reduction compared to previously reported), the population and decay rate of valence atomic densities, and polarizabilities across conformations and chemical compositions of H, C, N, and O atoms. These parameters enable accurate calculations of intermolecular contributions: electrostatics, charge penetration, repulsion, induction/polarization, and many-body dispersion. Unlike other potentials, this model is transferable in its ability to handle new molecules and conformations without explicit prior parametrization: all local atomic properties are predicted from ML, leaving only eight global parameters, optimized once and for all across compounds. We validate IPML on various gas-phase dimers at and away from equilibrium separation, where we obtain mean absolute errors between 0.4 and 0.7 kcal/mol for several chemically and conformationally diverse datasets representative of non-covalent interactions in biologically-relevant molecules. We further focus on hydrogen-bonded complexes, which are essential but challenging due to their directional nature, where datasets of DNA base pairs and amino acids yield an extremely encouraging 1.4 kcal/mol error. Finally, and as a first look, we consider IPML in denser systems: water clusters, supramolecular host-guest complexes, and the benzene crystal. Comment: 15 pages, 9 figures

    Data-driven discovery of cardiolipin-selective small molecules by computational active learning

    Subtle variations in the lipid composition of mitochondrial membranes can have a profound impact on mitochondrial function. The inner mitochondrial membrane contains the phospholipid cardiolipin, which has been demonstrated to act as a biomarker for a number of diverse pathologies. Small molecule dyes capable of selectively partitioning into cardiolipin membranes enable visualization and quantification of the cardiolipin content. Here we present a data-driven approach that combines a deep learning-enabled active learning workflow with coarse-grained molecular dynamics simulations and alchemical free energy calculations to discover small organic compounds able to selectively permeate cardiolipin-containing membranes. By employing transferable coarse-grained models we efficiently navigate the all-atom design space corresponding to small organic molecules with molecular weight less than ≈500 Da. After direct simulation of only 0.42% of our coarse-grained search space we identify molecules with considerably increased levels of cardiolipin selectivity compared to the widely used cardiolipin probe 10-N-nonyl acridine orange. Our accumulated simulation data enable us to derive interpretable design rules linking coarse-grained structure to cardiolipin selectivity. The findings are corroborated by fluorescence anisotropy measurements of two compounds conforming to our defined design rules. Our findings highlight the potential of coarse-grained representations and multiscale modelling for materials discovery and design.

    The Bacteriostatic Activity of 2-Phenylethanol Derivatives Correlates with Membrane Binding Affinity

    The hydrophobic tails of aliphatic primary alcohols insert into the hydrophobic core of a lipid bilayer. They thereby disrupt hydrophobic interactions between the lipid molecules, resulting in decreased lipid order, i.e., increased membrane fluidity. While aromatic alcohols, such as 2-phenylethanol, also insert into lipid bilayers and disturb the membrane organization, their impact on the structure of biological membranes, as well as the potential physiological implications of membrane incorporation, has only been studied to a limited extent. Although diverse targets are discussed as causes of the bacteriostatic and bactericidal activity of 2-phenylethanol, it is clear that 2-phenylethanol severely affects the structure of biomembranes, which has been linked to its bacteriostatic activity. Fungi also produce some 2-phenylethanol derivatives, some of which appear to have bacteriostatic activity as well. We showed that the 2-phenylethanol derivatives phenylacetic acid, phenyllactic acid, and methyl phenylacetate, but not tyrosol, were fully incorporated into model membranes and affected the membrane organization. Furthermore, we observed that the propensity of the analyzed molecules to partition into biomembranes positively correlated with their respective bacteriostatic activity, which clearly linked the bacteriotoxic activity of the substances to biomembranes.

    Atomic-scale representation and statistical learning of tensorial properties

    This chapter discusses the importance of incorporating three-dimensional symmetries in the context of statistical learning models geared towards the interpolation of the tensorial properties of atomic-scale structures. We focus on Gaussian process regression, and in particular on the construction of structural representations, and the associated kernel functions, that are endowed with the geometric covariance properties compatible with those of the learning targets. We summarize the general formulation of such a symmetry-adapted Gaussian process regression model, and how it can be implemented based on a scheme that generalizes the popular smooth overlap of atomic positions representation. We give examples of the performance of this framework when learning the polarizability and the ground-state electron density of a molecule.

    Shared Metadata for Data-Centric Materials Science

    The expansive production of data in materials science, and their widespread sharing and repurposing, require educated support and stewardship. In order to ensure that this need helps rather than hinders scientific work, the implementation of the FAIR-data principles (Findable, Accessible, Interoperable, and Reusable) must not be too narrow. Besides, the wider materials-science community ought to agree on the strategies to tackle the challenges that are specific to its data, both from computations and experiments. In this paper, we present the result of the discussions held at the workshop on "Shared Metadata and Data Formats for Big-Data Driven Materials Science". We start from an operative definition of metadata, and what features a FAIR-compliant metadata schema should have. We will mainly focus on computational materials-science data and propose a constructive approach for the FAIRification of the (meta)data related to ground-state and excited-states calculations, potential-energy sampling, and generalized workflows. Finally, challenges with the FAIRification of experimental (meta)data and materials-science ontologies are presented together with an outlook on how to meet them.