26 research outputs found

    Probing defects and correlations in the hydrogen-bond network of ab initio water

    The hydrogen-bond network of water is characterized by the presence of coordination defects relative to the ideal tetrahedral network of ice, whose fluctuations determine the static and time-dependent properties of the liquid. Because of topological constraints, such defects do not occur in isolation but are highly correlated, appearing in a variety of distinct pairs. Here we discuss these correlations in detail for ab initio water models and show that they have interesting similarities to regular and defective solid phases of water. Although defect correlations involve deviations from idealized tetrahedrality, they can still be regarded as weaker hydrogen bonds that retain a high degree of directionality. We also investigate how the structure and population of coordination defects are affected by approximations to the inter-atomic potential, finding that in most cases the qualitative features of the hydrogen-bond network are remarkably robust.

    An Automatic, Data-Driven Definition of Atomic-Scale Structural Motifs

    Structure-property relationships at the atomic scale are usually understood in terms of recurrent structural motifs formed by atoms and molecules, and how they transform and interact with each other. In this thesis we introduce a novel analysis approach capable of determining such patterns automatically. This analysis provides a unique fingerprint for metastable motifs that is based exclusively on structural information. The rationale behind the method and its functioning will be presented, followed by a discussion of its application to a wide range of problems in materials science and biology. We will begin by showing how our methodology can be used to define the hydrogen bond adaptively in several different systems, including water, ammonia and peptides. We will then demonstrate how such a definition can be used to probe the topological defects in the 3-dimensional hydrogen bond network of liquid water and will propose a method to study the non-trivial correlations among them. Furthermore, we will apply our framework to the identification of coordination environments in nanoclusters, and to the recognition of secondary-structure patterns in oligopeptides and proteins. We will prove that it is possible not only to obtain an unbiased and adaptive algorithmic definition of local motifs of matter, but also to identify and classify structures in their entirety. We will also demonstrate that a clear interpretation of the stability of the system can be obtained through the automatic analysis of atomistic simulation results, and will discuss possible applications, such as the definition of collective variables for enhanced-sampling simulation techniques or the identification of recurrent patterns in complex systems that escape an interpretation in terms of conventional structural motifs, such as intrinsically disordered proteins.

    Anharmonic and Quantum Fluctuations in Molecular Crystals: A First-Principles Study of the Stability of Paracetamol

    Molecular crystals often exist in multiple competing polymorphs that show significantly different physicochemical properties. Computational crystal structure prediction is key to interpreting and guiding the search for the most stable or useful form, a real challenge because of the combinatorial search space and the complex interplay of subtle effects that together determine the relative stability of different structures. Here we take a comprehensive approach based on different flavors of thermodynamic integration in order to estimate all contributions to the free energies of these systems with density-functional theory, including the oft-neglected anharmonic contributions and nuclear quantum effects. We take the two main stable forms of paracetamol as a paradigmatic example. We find that anharmonic contributions, different descriptions of van der Waals interactions, and nuclear quantum effects all matter for a quantitative determination of the stability of different phases. Our analysis highlights the many challenges inherent in the development of a quantitative and predictive framework to model molecular crystals. However, it also indicates which components of the free energy can benefit from a cancellation of errors that can redeem the predictive power of approximate models, and suggests simple steps that could be taken to improve the reliability of ab initio crystal structure prediction.

    HydraScreen: A Generalizable Structure-Based Deep Learning Approach to Drug Discovery

    We propose HydraScreen, a deep-learning approach that aims to provide a framework for more robust machine-learning-accelerated drug discovery. HydraScreen utilizes a state-of-the-art 3D convolutional neural network, designed for the effective representation of molecular structures and interactions in protein-ligand binding. We design an end-to-end pipeline for high-throughput screening and lead optimization, targeting applications in structure-based drug design. We assess our approach using established public benchmarks based on the CASF 2016 core set, achieving top-tier results in affinity and pose prediction (Pearson's r = 0.86, RMSE = 1.15, Top-1 = 0.95). Furthermore, we utilize a novel interaction profiling approach to identify potential biases in the model and dataset, in order to boost interpretability and support the unbiased nature of our method. Finally, we showcase HydraScreen's capacity to generalize across unseen proteins and ligands, offering directions for future development of robust machine learning scoring functions. HydraScreen (accessible at https://hydrascreen.ro5.ai) provides a user-friendly GUI and a public API, facilitating easy assessment of individual protein-ligand complexes.
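
    As a reading aid for the affinity metrics quoted in this abstract, the following is a minimal sketch of how Pearson's r and RMSE are conventionally computed from predicted and reference values. The function names and the toy numbers are illustrative assumptions; this is not HydraScreen's evaluation code, and it uses no CASF data.

```python
import math


def pearson_r(pred, ref):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_r = sum(ref) / n
    # Covariance numerator and the two standard-deviation terms
    # (the 1/n factors cancel in the ratio).
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(pred, ref))
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    sd_r = math.sqrt(sum((r - mean_r) ** 2 for r in ref))
    return cov / (sd_p * sd_r)


def rmse(pred, ref):
    """Root-mean-square error between predictions and reference values."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(pred))


# Hypothetical toy values, for illustration only.
predicted = [5.1, 6.8, 7.9, 4.2]
reference = [5.0, 7.0, 8.5, 4.0]
print(pearson_r(predicted, reference), rmse(predicted, reference))
```

    Pearson's r measures how well the predictions track the reference values linearly, while RMSE measures the absolute size of the errors, which is why benchmarks typically report both.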

    Atomic Motif Recognition in (Bio)Polymers: Benchmarks From the Protein Data Bank

    Rationalizing the structure and structure–property relations for complex materials such as polymers or biomolecules relies heavily on the identification of local atomic motifs, e.g., hydrogen bonds and secondary structure patterns, that are seen as building blocks of more complex supramolecular and mesoscopic structures. Over the past few decades, several automated procedures have been developed to identify these motifs in proteins given the atomic structure. Being based on a very precise understanding of the specific interactions, these heuristic criteria formulate the question in a way that implies the answer, by defining a list of motifs based on those that are known to occur naturally. This makes them less likely to identify unexpected phenomena, such as the occurrence of recurrent motifs in disordered segments of proteins, and less suitable for different polymers whose structure is not driven by hydrogen bonds, or even for polypeptides in unusual, non-biological conditions. Here we discuss how unsupervised machine learning schemes can be used to recognize patterns based exclusively on the frequency with which different motifs occur, taking high-resolution structures from the Protein Data Bank as benchmarks. We first discuss the application of a density-based motif recognition scheme in combination with traditional representations of protein structure (namely, interatomic distances and backbone dihedrals). Then, we proceed one step further toward an entirely unbiased scheme by using as input a structural representation based on the atomic density and by employing supervised classification to objectively assess the role played by the representation in determining the nature of atomic-scale patterns.

    Reconstructing the infrared spectrum of a peptide from representative conformers of the full canonical ensemble

    Leucine enkephalin (LeuEnk), a biologically active endogenous opioid pentapeptide, has been under intense investigation because it is small enough to allow efficient use of sophisticated computational methods and large enough to provide insights into low-lying minima of its conformational space. Here, we reproduce and interpret experimental infrared (IR) spectra of this model peptide in the gas phase using a combination of replica-exchange molecular dynamics simulations, machine learning, and ab initio calculations. In particular, we evaluate the possibility of averaging representative structural contributions to obtain an accurate computed spectrum that accounts for the corresponding canonical ensemble of the real experimental situation. Representative conformers are identified by partitioning the conformational phase space into subensembles of similar conformers. The IR contribution of each representative conformer is obtained from ab initio calculations and weighted according to the population of its cluster. Convergence of the averaged IR signal is rationalized by merging contributions in a hierarchical clustering and by comparison with IR multiple photon dissociation experiments. The improvements achieved by decomposing clusters containing similar conformations into even smaller subensembles are strong evidence that a thorough assessment of the conformational landscape and the associated hydrogen bonding is a prerequisite for deciphering important fingerprints in experimental spectroscopic data.
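
    The population-weighted averaging described in this abstract can be sketched as follows. This is a minimal illustration, assuming each representative conformer's spectrum is sampled on the same frequency grid and each cluster's population is given as a count; the function name and the toy numbers are hypothetical and do not come from the paper.

```python
def ensemble_spectrum(spectra, populations):
    """Population-weighted average of per-conformer spectra.

    spectra:     list of intensity lists, one per representative conformer,
                 all sampled on the same frequency grid (an assumption).
    populations: cluster populations (counts or fractions), one per conformer.
    """
    if len(spectra) != len(populations):
        raise ValueError("need one population per spectrum")
    total = sum(populations)
    # Normalize populations so the weights sum to one.
    weights = [p / total for p in populations]
    n_points = len(spectra[0])
    return [
        sum(w * s[i] for w, s in zip(weights, spectra))
        for i in range(n_points)
    ]


# Hypothetical toy data: two conformers, three grid points.
spectra = [[1.0, 0.0, 2.0], [3.0, 4.0, 0.0]]
populations = [3, 1]  # the first cluster is three times as populated
print(ensemble_spectrum(spectra, populations))  # [1.5, 1.0, 1.5]
```

    Splitting a cluster into smaller subensembles, as the abstract discusses, corresponds here to replacing one entry of `spectra` by several entries whose populations sum to the original one.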