
    Complete Configuration Space Analysis for Structure Determination of Symmetric Homo-oligomers by NMR

    Symmetric homo-oligomers (protein complexes of identical subunits arranged symmetrically) play pivotal roles in complex biological processes such as ion transport and cellular regulation. Determining the structures of these complexes is necessary to gain insight into their mechanisms. Nuclear Magnetic Resonance (NMR) spectroscopy is an experimental technique used for structural studies of such complexes. The data available for structure determination of symmetric homo-oligomers by NMR are often sparse and ambiguous, raising concerns about existing heuristic approaches to structure determination. We have developed an approach that is complete, in that it identifies all consistent conformations; data-driven, in that it separately evaluates the consistency of structures with the data and with biophysical constraints; and efficient, in that it avoids explicitly considering each possible structure separately. By being complete, we ensure that native conformations are not missed. By being data-driven, we are able to separately quantify the information content of the data alone versus the data combined with biophysical modeling. We take a configuration space (degree-of-freedom) approach that provides a compact representation of the conformation space and enables us to explore the space of possible conformations efficiently. This thesis demonstrates that the configuration space-based method is robust to sparsity and ambiguity in the data and enables complete, data-driven and efficient structure determination of symmetric homo-oligomers.
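
One way to see why a degree-of-freedom representation can be compact is to look at the simplest symmetric case. The sketch below assumes cyclic (C_n) symmetry about an axis parallel to z, which the abstract does not state; under that assumption the whole complex follows from one subunit plus the axis position. The function name `cyclic_copies` and its parameters are hypothetical, and this illustrates the idea only, not the thesis's algorithm.

```python
import math

def cyclic_copies(subunit, n, axis_point=(0.0, 0.0)):
    """Return n copies of `subunit` (a list of (x, y, z) atoms), each rotated
    by a multiple of 2*pi/n about an axis parallel to z through `axis_point`.

    Assumption for illustration: C_n symmetry with a z-parallel axis.
    """
    cx, cy = axis_point
    copies = []
    for k in range(n):
        theta = 2.0 * math.pi * k / n
        c, s = math.cos(theta), math.sin(theta)
        copies.append([
            (cx + c * (x - cx) - s * (y - cy),   # rotated x
             cy + s * (x - cx) + c * (y - cy),   # rotated y
             z)                                  # z is unchanged
            for x, y, z in subunit
        ])
    return copies

# A one-atom "subunit" expanded into a hypothetical tetramer.
tetramer = cyclic_copies([(1.0, 0.0, 2.0)], n=4)
```

Here every candidate complex is indexed by the axis parameters alone, so searching over axes covers all symmetric arrangements of a fixed subunit without enumerating atom positions one by one.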

    Evolving Cellular Automata Schemes for Protein Folding Modeling Using the Rosetta Atomic Representation

    Funded for open access publication: Universidade da Coruña/CISUG. [Abstract] Protein folding is the dynamic process by which a protein folds into its final native structure. This differs from the traditional problem of predicting the final protein structure, since it requires modeling how protein components interact over time to reach the final folded structure. In this study we test whether a model of the folding process can be obtained exclusively through machine learning. To this end, protein folding is treated as an emergent process and cellular automata are used to model it. A neural cellular automaton is defined, using a connectionist model that acts as a cellular automaton along the protein chain to define the folding dynamics. Differential evolution is used to automatically obtain the optimized neural cellular automata that produce protein folding. We tested the methods with the Rosetta coarse-grained atomic model of protein representation, using different proteins to analyze the modeling of folding and the structure refinement that the modeling can provide, showing the potential advantages that such methods offer, but also the difficulties that arise. This study was funded by the Xunta de Galicia and the European Union (European Regional Development Fund - Galicia 2014-2020 Program), with grants CITIC (ED431G 2019/01), GPC ED431B 2019/03 and IN845D-02 (funded by the “Agencia Gallega de Innovación”, co-financed by Feder funds), and by the Spanish Ministry of Science and Innovation (project PID2020-116201GB-I00). Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature.
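
For readers unfamiliar with differential evolution, the following is a minimal sketch of the classic DE/rand/1/bin scheme on a toy problem. It is not the study's setup: the parameter vector here merely stands in for whatever encodes the neural cellular automaton, and `toy_cost` (a sphere function) is a placeholder objective chosen only so the example runs on its own. All names and default settings are assumptions.

```python
import random

def differential_evolution(cost, bounds, pop_size=20, f=0.5, cr=0.9,
                           generations=200, seed=0):
    """Minimize `cost` inside the box `bounds` using DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    costs = [cost(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct individuals, none of them the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # guarantees one gene from the mutant
            trial = []
            for k in range(dim):
                if k == forced or rng.random() < cr:
                    v = pop[a][k] + f * (pop[b][k] - pop[c][k])  # mutation
                    lo, hi = bounds[k]
                    v = min(max(v, lo), hi)                      # clip to box
                else:
                    v = pop[i][k]                                # keep parent gene
                trial.append(v)
            trial_cost = cost(trial)
            if trial_cost <= costs[i]:  # greedy one-to-one replacement
                pop[i], costs[i] = trial, trial_cost
    best = min(range(pop_size), key=costs.__getitem__)
    return pop[best], costs[best]

def toy_cost(weights):
    # Placeholder objective (sphere function), NOT a folding energy.
    return sum(w * w for w in weights)

best_weights, best_cost = differential_evolution(toy_cost, [(-5.0, 5.0)] * 3)
```

Because selection only ever replaces a parent with an equal or better trial, the best cost in the population never increases from one generation to the next.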

    Improved Constrained Global Optimization for Estimating Molecular Structure From Atomic Distances

    Determination of molecular structure is commonly posed as a nonlinear optimization problem. The objective functions rely on a vast amount of structural data. As a result, they are most often nonconvex and nonsmooth, and possess many local minima. Furthermore, introducing additional structural data into the objective function creates barriers to finding the global minimum, causes additional computational issues associated with evaluating the function, and makes physical constraint enforcement intractable. To combat the computational problems associated with standard nonlinear optimization formulations, Williams et al. (2001) proposed an atom-based optimization, referred to as GNOMAD, which complements a simple interatomic distance potential with van der Waals (VDW) constraints to provide better-quality protein structures. However, improving more detailed structural features such as shape and chirality requires the integration of additional constraint types. This dissertation builds on the GNOMAD algorithm in using structural data to estimate the three-dimensional structure of a protein. We develop several methods to make GNOMAD capable of effectively and efficiently handling non-distance information, including torsional angles and molecular surface data. Specifically, we propose a method for using distances to effectively satisfy known torsional information and show that this method significantly improves the quality of α-helices and β-strands within the protein. We also show that molecular surface data, in combination with our improved secondary structure estimation method and long-range distance data, offer increased accuracy in the spatial proximity of α-helices and β-strands within the protein, and thus provide better estimates of tertiary protein structure. Lastly, we show that the enhanced GNOMAD molecular structure estimation framework is effective in predicting protein structures in the context of comparative modeling.
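
The link between torsion angles and distances can be illustrated with elementary geometry: for four consecutively bonded atoms with fixed bond lengths and bond angles, the torsion angle fixes the distance between the first and fourth atom. The sketch below computes that 1-4 distance by placing the atoms explicitly. It is a generic illustration, not the dissertation's method; the function name and argument order are hypothetical.

```python
import math

def one_four_distance(b12, b23, b34, angle_123, angle_234, torsion):
    """Distance between atoms 1 and 4 of a bonded chain 1-2-3-4.

    Bond lengths b12, b23, b34 share one length unit; bond angles and the
    torsion are in radians. Atom 2 sits at the origin, atom 3 on the +x axis,
    and atom 1 in the xy-plane; torsion = 0 is the cis arrangement.
    """
    p1 = (b12 * math.cos(angle_123), b12 * math.sin(angle_123), 0.0)
    p4 = (b23 - b34 * math.cos(angle_234),
          b34 * math.sin(angle_234) * math.cos(torsion),
          b34 * math.sin(angle_234) * math.sin(torsion))
    return math.dist(p1, p4)

# Unit bonds and right angles: cis and trans give the extreme distances.
d_cis = one_four_distance(1.0, 1.0, 1.0, math.pi / 2, math.pi / 2, 0.0)
d_trans = one_four_distance(1.0, 1.0, 1.0, math.pi / 2, math.pi / 2, math.pi)
```

Note that the distance depends on the torsion only through its cosine, so a torsion and its negative give the same 1-4 distance. A single distance therefore cannot encode the sign of the angle, which is one reason handedness needs extra information.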

    Evolutionary Computation

    This book presents several recent advances in Evolutionary Computation, especially evolution-based optimization methods and hybrid algorithms for several applications, from optimization and learning to pattern recognition and bioinformatics. It also presents new algorithms based on several analogies and metaphors, one of which is based on philosophy, specifically on the philosophy of praxis and dialectics. The book also presents interesting applications in bioinformatics, especially the use of particle swarms to discover gene expression patterns in DNA microarrays. The book therefore features representative work in the field of evolutionary computation and applied sciences. The intended audience is graduate and undergraduate students, researchers, and anyone who wishes to become familiar with the latest research work in this field.

    Mass & secondary structure propensity of amino acids explain their mutability and evolutionary replacements

    Why is an amino acid replacement in a protein accepted during evolution? The answer given by bioinformatics relies on the frequency with which each amino acid is replaced by another and the propensity of each to remain unchanged. We propose that these replacement rules are recoverable from the secondary structural trends of amino acids. A distance measure between high-resolution Ramachandran distributions reveals that structurally similar residues coincide with those found in substitution matrices such as BLOSUM: Asn ↔ Asp, Phe ↔ Tyr, Lys ↔ Arg, Gln ↔ Glu, Ile ↔ Val, Met → Leu; with Ala, Cys, His, Gly, Ser, Pro, and Thr as structurally idiosyncratic residues. We also found a high average correlation ($\overline{R} = 0.85$) between thirty amino acid mutability scales and the mutational inertia ($I_X$), which measures the energetic cost weighted by the number of observations at the most probable amino acid conformation. These results indicate that amino acid substitutions follow two optimally efficient principles: (a) amino acid interchangeability favors secondary structural similarity, and (b) the mutability of an amino acid depends directly on its biosynthetic energy cost and inversely on its frequency. These two principles are the underlying rules governing the observed amino acid substitutions. © 2017 The Author(s)
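
The abstract does not say which distance measure is applied to the Ramachandran distributions. As a rough illustration of the general idea, the sketch below bins (φ, ψ) pairs into a normalized 2D histogram and compares two histograms with the Hellinger distance, a swapped-in choice made only for this example. The function names, the 30° bin width, and the sample angles are all assumptions.

```python
import math
from collections import Counter

def ramachandran_histogram(angles, bin_width=30):
    """Bin (phi, psi) pairs, given in degrees, into a normalized 2D histogram.

    Angles are wrapped periodically into [-180, 180) before binning.
    """
    counts = Counter()
    for phi, psi in angles:
        i = int((phi + 180) % 360 // bin_width)
        j = int((psi + 180) % 360 // bin_width)
        counts[(i, j)] += 1
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}

def hellinger(p, q):
    """Hellinger distance between two histograms: 0 if identical, 1 if disjoint."""
    cells = set(p) | set(q)
    s = sum((math.sqrt(p.get(c, 0.0)) - math.sqrt(q.get(c, 0.0))) ** 2
            for c in cells)
    return math.sqrt(s / 2.0)

# Made-up sample angles in the helical and extended regions (illustrative only).
helix_like = ramachandran_histogram([(-60, -45), (-62, -41), (-58, -47)])
sheet_like = ramachandran_histogram([(-120, 130), (-125, 135), (-118, 128)])
distance = hellinger(helix_like, sheet_like)
```

With real data, a matrix of such pairwise distances between the twenty residue types could then be compared against a substitution matrix, which is the kind of comparison the abstract describes.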