Escaping free-energy minima
We introduce a novel and powerful method for exploring the properties of the
multidimensional free energy surfaces of complex many-body systems by means of
a coarse-grained non-Markovian dynamics in the space defined by a few
collective coordinates. A characteristic feature of this dynamics is the
presence of a history-dependent potential term that, in time, fills the minima
in the free energy surface, allowing the efficient exploration and accurate
determination of the free energy surface as a function of the collective
coordinates. We demonstrate the usefulness of this approach in the case of the
dissociation of a NaCl molecule in water and in the study of the conformational
changes of a dialanine in solution. Comment: 3 figure
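The history-dependent potential described in this abstract can be illustrated with a minimal one-dimensional sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy double-well free energy, the overdamped Langevin integrator, the function names, and all parameter values. Gaussians are deposited at the visited positions, so the bias gradually fills the starting minimum, and the negative of the accumulated bias gives a rough estimate of the free energy up to an additive constant.

```python
import math
import random


def double_well_force(x):
    """Force -dF/dx for the toy free energy F(x) = (x^2 - 1)^2 (an assumed model)."""
    return -4.0 * x * (x * x - 1.0)


def bias(x, centers, height, width):
    """History-dependent potential: a sum of Gaussians at previously visited positions."""
    return sum(height * math.exp(-((x - c) ** 2) / (2.0 * width ** 2)) for c in centers)


def bias_force(x, centers, height, width):
    """Force -dV/dx produced by the accumulated Gaussians."""
    return sum(
        height * (x - c) / width ** 2 * math.exp(-((x - c) ** 2) / (2.0 * width ** 2))
        for c in centers
    )


def run_metadynamics(n_steps=20000, dt=0.005, kT=0.1, height=0.05, width=0.2,
                     stride=100, seed=0):
    """Overdamped Langevin dynamics with a Gaussian deposited every `stride` steps."""
    rng = random.Random(seed)
    x = -1.0                      # start in the left minimum
    centers = []                  # history of deposited Gaussian centers
    trajectory = []
    noise = math.sqrt(2.0 * kT * dt)
    for step in range(n_steps):
        if step % stride == 0:
            centers.append(x)     # the bias grows where the system currently sits
        force = double_well_force(x) + bias_force(x, centers, height, width)
        x += force * dt + noise * rng.gauss(0.0, 1.0)
        trajectory.append(x)
    return trajectory, centers


def estimate_free_energy(x, centers, height=0.05, width=0.2):
    """Rough estimate F(x) ~ -V_bias(x), defined up to an additive constant."""
    return -bias(x, centers, height, width)


if __name__ == "__main__":
    traj, centers = run_metadynamics()
    print("Gaussians deposited:", len(centers))
    print("range explored:", min(traj), max(traj))
```

With these assumed parameters the barrier is about ten times kT, so an unbiased run of the same length would be expected to remain near x = -1, while the biased run should escape once the left well has been filled.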
Clustering by fast search-and-find of density peaks
Cluster analysis is aimed at classifying elements into categories on the basis of their similarity. Its applications range from astronomy to bioinformatics, bibliometrics, and pattern recognition. We propose an approach based on the idea that cluster centers are characterized by a higher density than their neighbors and by a relatively large distance from points with higher densities. This idea forms the basis of a clustering procedure in which the number of clusters arises intuitively, outliers are automatically spotted and excluded from the analysis, and clusters are recognized regardless of their shape and of the dimensionality of the space in which they are embedded. We demonstrate the power of the algorithm on several test cases.
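The two quantities named in this abstract, the local density of a point and its distance from the nearest point of higher density, can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the cutoff `d_c`, the function name `density_peaks`, and the choice of centers by ranking the product of density and distance are all assumed here. The number of clusters is passed in by hand, and the outlier detection mentioned in the abstract is omitted.

```python
import math


def density_peaks(points, d_c, n_clusters):
    """Return (rho, delta, labels) for a list of coordinate tuples.

    rho[i]   : number of other points closer than the cutoff d_c (assumed density)
    delta[i] : distance to the nearest point that ranks higher in density
    labels[i]: cluster index, inherited from the nearest higher-density point
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]

    # Decreasing density; ties are broken by index so that the ranking is total.
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    rank = {i: r for r, i in enumerate(order)}

    delta = [0.0] * n
    nearest_higher = [-1] * n
    for r, i in enumerate(order):
        if r == 0:
            # By convention the densest point gets its largest distance to any point.
            delta[i] = max(dist[i])
            continue
        j = min(order[:r], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        nearest_higher[i] = j

    # Assumed heuristic: centers are the points with the largest rho * delta.
    centers = sorted(range(n), key=lambda i: (-rho[i] * delta[i], rank[i]))[:n_clusters]

    labels = [-1] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id
    # Visiting points in decreasing density guarantees the parent is already labeled.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_higher[i]]
    return rho, delta, labels


if __name__ == "__main__":
    blob = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05)]
    data = blob + [(x + 5.0, y + 5.0) for x, y in blob]
    rho, delta, labels = density_peaks(data, d_c=0.12, n_clusters=2)
    print(labels)
```

Because each point inherits the label of its nearest higher-density neighbour, the assignment needs a single pass and makes no assumption about cluster shape.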
Intrinsic dimension as a multi-scale summary statistics in network modeling
Complex networks are powerful mathematical tools for modelling and understanding the behaviour of highly interconnected systems. However, existing methods for analyzing these networks focus on local properties (e.g. degree distribution, clustering coefficient) or global properties (e.g. diameter, modularity) and fail to characterize the network structure across multiple scales. In this paper, we introduce a rigorous method for calculating the intrinsic dimension of unweighted networks. The intrinsic dimension is a feature that describes the network structure at all scales, from local to global. We propose using this measure as a summary statistic within an Approximate Bayesian Computation framework to infer the parameters of flexible and multi-purpose mechanistic models that generate complex networks. Furthermore, we present a new mechanistic model that can reproduce the intrinsic dimension of networks with large diameters, a task that has been challenging for existing models.
Predicting crystal structures: the Parrinello-Rahman method revisited
By suitably adapting a recent approach [A. Laio and M. Parrinello, PNAS, 99,
12562 (2002)] we develop a powerful molecular dynamics method for the study of
pressure-induced structural transformations. We use the edges of the simulation
cell as collective variables. In the space of these variables we define a
metadynamics that drives the system away from the local minimum towards a new
crystal structure. In contrast to the Parrinello-Rahman method our approach
shows no hysteresis and crystal structure transformations can occur at the
equilibrium pressure. We illustrate the power of the method by studying the
pressure-induced diamond to simple hexagonal phase transition in a model of
silicon. Comment: 5 pages, 2 Postscript figures, submitte
Assessing the capability of in silico mutation protocols for predicting the finite temperature conformation of amino acids
Mutation protocols are a key tool in computational biophysics for modelling unknown side chain conformations. In particular, these protocols are used to generate the starting structures for molecular dynamics simulations. The accuracy of the initial side chain and backbone placement is crucial to obtain a stable and quickly converging simulation. In this work, we assessed the performance of several mutation protocols in predicting the most probable conformer observed in finite temperature molecular dynamics simulations for a set of protein-peptide crystals differing only by single-point mutations in the peptide sequence. Our results show that several programs that predict the crystal conformations well fail to predict the most probable finite temperature configuration. Methods relying on backbone-dependent rotamer libraries generally perform better, but even the best protocol fails to predict approximately 30% of the mutations.
Candidate Binding Sites for Allosteric Inhibition of the SARS-CoV-2 Main Protease from the Analysis of Large-Scale Molecular Dynamics Simulations
We analyzed a 100 μs MD trajectory of the SARS-CoV-2 main protease by a non-parametric data analysis approach which allows characterizing a free energy landscape as a simultaneous function of hundreds of variables. We identified several conformations that, when visited by the dynamics, are stable for several hundred nanoseconds. We explicitly characterize and describe these metastable states. In some of these configurations, the catalytic dyad is less accessible. Stabilizing them by a suitable binder could lead to an inhibition of the enzymatic activity. In our analysis we keep track of relevant contacts between residues which are selectively broken or formed in the states. Some of these contacts are formed by residues which are far from the catalytic dyad and are accessible to the solvent. Based on this analysis we propose some relevant contact patterns and three possible binding sites which could be targeted to achieve allosteric inhibition.
The intrinsic dimension of protein sequence evolution
It is well known that, in order to preserve its structure and function, a protein cannot change its sequence at random, but only by mutations occurring preferentially at specific locations. We here investigate quantitatively the amount of variability that is allowed in protein sequence evolution, by computing the intrinsic dimension (ID) of the sequences belonging to a selection of protein families. The ID is a measure of the number of independent directions that evolution can take starting from a given sequence. We find that the ID is practically constant for sequences belonging to the same family, and moreover it is very similar in different families, with values ranging between 6 and 12. These values are significantly smaller than the raw number of amino acids, confirming the importance of correlations between mutations in different sites. However, we demonstrate that correlations are not sufficient to explain the small value of the ID we observe in protein families. Indeed, we show that the ID of a set of protein sequences generated by maximum entropy models, an approach in which correlations are accounted for, is typically significantly larger than the value observed in natural protein families. We further prove that a critical factor to reproduce the natural ID is to take into consideration the phylogeny of sequences.
Intrinsic dimension of data representations in deep neural networks
Deep neural networks progressively transform their inputs across multiple processing layers. What are the geometrical properties of the representations learned by these networks? Here we study the intrinsic dimensionality (ID) of data-representations, i.e. the minimal number of parameters needed to describe a representation. We find that, in a trained network, the ID is orders of magnitude smaller than the number of units in each layer. Across layers, the ID first increases and then progressively decreases in the final layers. Remarkably, the ID of the last hidden layer predicts classification accuracy on the test set. These results can neither be found by linear dimensionality estimates (e.g., with principal component analysis), nor in representations that had been artificially linearized. They are neither found in untrained networks, nor in networks that are trained on randomized labels. This suggests that neural networks that can generalize are those that transform the data into low-dimensional, but not necessarily flat manifolds.
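The abstract does not say which estimator is used to compute the ID. As an illustration only, the sketch below assumes a two-nearest-neighbour estimator: for each point, the ratio of the distances to its second and first nearest neighbours is expected to follow a Pareto law whose exponent is the dimension d, which gives the maximum-likelihood estimate d = N / sum(log(r2 / r1)). The function names and the synthetic test data, a curved two-dimensional sheet embedded in five dimensions, are hypothetical.

```python
import math
import random


def two_nn_dimension(points):
    """Estimate the intrinsic dimension from ratios of nearest-neighbour distances.

    Brute-force O(N^2) search; points with a duplicate (r1 == 0) are skipped.
    """
    log_ratios = []
    for i, p in enumerate(points):
        r1 = r2 = math.inf        # first and second nearest-neighbour distances
        for j, q in enumerate(points):
            if i == j:
                continue
            d = math.dist(p, q)
            if d < r1:
                r1, r2 = d, r1
            elif d < r2:
                r2 = d
        if r1 > 0.0 and math.isfinite(r2):
            log_ratios.append(math.log(r2 / r1))
    # Maximum-likelihood exponent of the assumed Pareto law for r2 / r1.
    return len(log_ratios) / sum(log_ratios)


def sample_curved_sheet(n, seed=0):
    """Hypothetical test data: a curved 2D surface embedded in 5 coordinates."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        u, v = rng.random(), rng.random()
        points.append((u, v, u * v, 0.0, 0.0))
    return points


if __name__ == "__main__":
    data = sample_curved_sheet(500)
    print("embedding dimension:", len(data[0]))
    print("estimated intrinsic dimension:", two_nn_dimension(data))
```

Because only local distance ratios enter the estimate, the sheet's curvature has little effect on it, which is the sense in which a low-dimensional manifold need not be flat. The estimate should come out close to 2 even though each point has five coordinates.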
Metadynamics Simulations Reveal a Na+ Independent Exiting Path of Galactose for the Inward-Facing Conformation of vSGLT
Sodium-Galactose Transporter (SGLT) is a secondary active symporter which accumulates sugars into cells by using the electrochemical gradient of Na+ across the membrane. Previous computational studies provided insights into the release process of the two ligands (galactose and sodium ion) into the cytoplasm from the inward-facing conformation of Vibrio parahaemolyticus sodium/galactose transporter (vSGLT). Several aspects of the transport mechanism of this symporter remain to be clarified: (i) a detailed kinetic and thermodynamic characterization of the exit path of the two ligands is still lacking; (ii) contradictory conclusions have been drawn concerning the gating role of Y263; (iii) the role of Na+ in modulating the release path of galactose is not clear. In this work, we use bias-exchange metadynamics simulations to characterize the free energy profile of the galactose and Na+ release processes toward the intracellular side. Surprisingly, we find that the exit of Na+ and galactose is non-concerted as the cooperativity between the two ligands is associated with a transition that is not rate limiting. The dissociation barriers are of the order of 11-12 kcal/mol for both the ion and the substrate, in line with kinetic information concerning this type of transporter. On the basis of these results we propose a branched six-state alternating access mechanism, which may also be shared by other members of the LeuT-fold transporters.