12,068 research outputs found

    Path Similarity Analysis: a Method for Quantifying Macromolecular Pathways

    Diverse classes of proteins function through large-scale conformational changes; sophisticated enhanced sampling methods have been proposed to generate these macromolecular transition paths. As such paths are curves in a high-dimensional space, they have been difficult to compare quantitatively, a prerequisite for, for instance, assessing the quality of different sampling algorithms. The Path Similarity Analysis (PSA) approach alleviates these difficulties by utilizing the full information in 3N-dimensional trajectories in configuration space. PSA employs the Hausdorff or Fréchet path metrics, adopted from computational geometry, enabling us to quantify path (dis)similarity, while the new concept of a Hausdorff-pair map permits the extraction of atomic-scale determinants responsible for path differences. Combined with clustering techniques, PSA facilitates the comparison of many paths, including collections of transition ensembles. We use the closed-to-open transition of the enzyme adenylate kinase (AdK), a commonly used testbed for the assessment of enhanced sampling algorithms, to examine multiple microsecond equilibrium molecular dynamics (MD) transitions of AdK in its substrate-free form alongside transition ensembles from the MD-based dynamic importance sampling (DIMS-MD) and targeted MD (TMD) methods, and a geometrical targeting algorithm (FRODA). A Hausdorff pairs analysis of these ensembles revealed, for instance, that differences in DIMS-MD and FRODA paths were mediated by a set of conserved salt bridges whose charge-charge interactions are fully modeled in DIMS-MD but not in FRODA.
    We also demonstrate how existing trajectory analysis methods relying on pre-defined collective variables, such as native contacts or geometric quantities, can be used synergistically with PSA, as well as the application of PSA to more complex systems such as membrane transporter proteins. Comment: 9 figures, 3 tables in the main manuscript; supplementary information includes 7 texts (S1 Text - S7 Text) and 11 figures (S1 Fig - S11 Fig) (also available from journal site)
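
    The two path metrics named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the point-wise distance here is a plain RMSD without structural superposition (an assumption made for brevity), and the function names and toy single-atom paths are hypothetical.

```python
import math

def rmsd(conf_a, conf_b):
    """RMSD between two conformations given as lists of (x, y, z) atoms.
    Assumption: no superposition/alignment is performed."""
    sq = sum((p - q) ** 2
             for atom_a, atom_b in zip(conf_a, conf_b)
             for p, q in zip(atom_a, atom_b))
    return math.sqrt(sq / len(conf_a))

def hausdorff(path_p, path_q, dist=rmsd):
    """Symmetric Hausdorff distance between two paths (sequences of conformations)."""
    def directed(src, dst):
        # Largest nearest-neighbour distance from src to dst.
        return max(min(dist(s, d) for d in dst) for s in src)
    return max(directed(path_p, path_q), directed(path_q, path_p))

def discrete_frechet(path_p, path_q, dist=rmsd):
    """Discrete Frechet distance, computed by dynamic programming over couplings."""
    n, m = len(path_p), len(path_q)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(path_p[i], path_q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_prev = min(ca[i - 1][j], ca[i][j - 1], ca[i - 1][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[-1][-1]

# Toy paths: one "atom" moving along x, with path_b offset by 1 in y.
path_a = [[(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)]]
path_b = [[(0.0, 1.0, 0.0)], [(1.0, 1.0, 0.0)], [(2.0, 1.0, 0.0)]]

print(hausdorff(path_a, path_b))               # 1.0
print(hausdorff(path_a, path_b[::-1]))         # 1.0, Hausdorff ignores ordering
print(discrete_frechet(path_a, path_b[::-1]))  # sqrt(5), Frechet respects ordering
```

    Reversing one path leaves the Hausdorff distance unchanged but increases the Fréchet distance, which is the practical difference between the two metrics: only the latter is sensitive to the direction in which a path is traversed.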

    Representability of algebraic topology for biomolecules in machine learning based scoring and virtual screening

    This work introduces a number of algebraic topology approaches, such as multicomponent persistent homology, multi-level persistent homology and electrostatic persistence for the representation, characterization, and description of small molecules and biomolecular complexes. Multicomponent persistent homology retains critical chemical and biological information during the topological simplification of biomolecular geometric complexity. Multi-level persistent homology enables a tailored topological description of inter- and/or intra-molecular interactions of interest. Electrostatic persistence incorporates partial charge information into topological invariants. These topological methods are paired with Wasserstein distance to characterize similarities between molecules and are further integrated with a variety of machine learning algorithms, including k-nearest neighbors, ensemble of trees, and deep convolutional neural networks, to manifest their descriptive and predictive powers for chemical and biological problems. Extensive numerical experiments involving more than 4,000 protein-ligand complexes from the PDBBind database and nearly 100,000 ligands and decoys in the DUD database are performed to test, respectively, the scoring power and the virtual screening power of the proposed topological approaches. It is demonstrated that the present approaches outperform the modern machine learning based methods in protein-ligand binding affinity predictions and ligand-decoy discrimination.

    Empirical Potential Function for Simplified Protein Models: Combining Contact and Local Sequence-Structure Descriptors

    An effective potential function is critical for protein structure prediction and folding simulation. Simplified protein models such as those requiring only $C_\alpha$ or backbone atoms are attractive because they enable efficient search of the conformational space. We show residue-specific reduced discrete state models can represent the backbone conformations of proteins with small RMSD values. However, no potential functions exist that are designed for such simplified protein models. In this study, we develop optimal potential functions by combining contact interaction descriptors and local sequence-structure descriptors. The form of the potential function is a weighted linear sum of all descriptors, and the optimal weight coefficients are obtained through optimization using both native and decoy structures. The performance of the potential function in tests of discriminating native protein structures from decoys is evaluated using several benchmark decoy sets. Our potential function requiring only backbone atoms or $C_\alpha$ atoms has comparable or better performance than several residue-based potential functions that require additional coordinates of side chain centers or coordinates of all side chain atoms. By reducing the residue alphabets down to size 5 for local structure-sequence relationship, the performance of the potential function can be further improved. Our results also suggest that local sequence-structure correlation may play an important role in reducing the entropic cost of protein folding. Comment: 20 pages, 5 figures, 4 tables. In press, Protein
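
    The "weighted linear sum of all descriptors" form can be sketched as follows. This is an illustrative toy, not the authors' potential: the two descriptors, the weight values, and the convention that a lower score means a more native-like structure are all assumptions, and the function names are hypothetical.

```python
def potential(descriptors, weights):
    """Score a structure as the weighted linear sum of its descriptor values."""
    if len(descriptors) != len(weights):
        raise ValueError("descriptors and weights must have the same length")
    return sum(w * d for w, d in zip(weights, descriptors))

def native_ranks_lowest(native, decoys, weights):
    """True if the native structure scores strictly below every decoy.
    Assumption: lower score = more native-like."""
    e_native = potential(native, weights)
    return all(e_native < potential(decoy, weights) for decoy in decoys)

# Hypothetical descriptor vectors: [contact count, local-structure penalty].
weights = [-1.0, 0.5]               # illustrative values, not fitted weights
native = [10.0, 2.0]                # score: -10.0 + 1.0 = -9.0
decoys = [[6.0, 3.0], [8.0, 5.0]]   # scores: -4.5 and -5.5

print(potential(native, weights))                    # -9.0
print(native_ranks_lowest(native, decoys, weights))  # True
```

    In the setting the abstract describes, the weights would be obtained by optimization over native and decoy structures so that a check like `native_ranks_lowest` holds across the training set; the fixed weights above only illustrate the scoring form.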

    Structure and Properties of Simple and Aggregate Systems by Circular Dichroism Spectroscopy

    This thesis deals with the investigation of structural properties of many different systems via Electronic Circular Dichroism (ECD). The interpretation of experimental data has been carried out mainly with quantum-chemistry methods, such as Density Functional Theory (DFT), on both solution and solid-state systems. The analysis of solution systems is oriented towards applications on biologically active compounds, both natural and synthetic, and its objective is to underline the key role of these approaches in the determination of the absolute configuration and the difficulties that may be encountered in the case of flexible molecules. Solid-state measurements represent an attractive alternative in those cases where many conformations are present, but the signals may be difficult to interpret because of solid-state interactions that are not observable in solution. For a better understanding of spectral lineshapes, more detailed analyses have been performed taking into account vibronic effects, which may also assist in the determination of the conformational situation of the investigated substrate. The limitations of the vibronic treatment for coupled electronic states have been considered, leading to a general all-coordinate approach which allows simulating the electronic spectrum of “dimeric” molecules with weakly coupled electronic states through a time-dependent approach.

    Lectin ligands: New insights into their conformations and their dynamic behavior and the discovery of conformer selection by lectins

    The mysteries of the functions of complex glycoconjugates have enthralled scientists over decades. Theoretical considerations have ascribed an enormous capacity to store information to oligosaccharides. In the interplay with lectins, sugar-code words of complex carbohydrate structures can be deciphered. To capitalize on knowledge about this type of molecular recognition for rational marker/drug design, the intimate details of the recognition process must be delineated. To this aim, the required approach is garnered from several fields, profiting from advances primarily in X-ray crystallography, nuclear magnetic resonance spectroscopy and computational calculations encompassing molecular mechanics, molecular dynamics and homology modeling. Collectively considered, the results force us to jettison the preconception of a rigid ligand structure. On the contrary, a carbohydrate ligand may move rather freely between two or even more low-energy positions, affording the basis for conformer selection by a lectin. By an exemplary illustration of the interdisciplinary approach, including up-to-date refinements in carbohydrate modeling, it is underscored why this combination is considered to show promise of fostering innovative strategies in rational marker/drug design.