73 research outputs found
RNA molecules with conserved catalytic cores but variable peripheries fold along unique energetically optimized pathways
Functional and kinetic constraints must be efficiently balanced during the folding process of all biopolymers. To understand how homologous RNA molecules with different global architectures fold into a common core structure, we determined, under identical conditions, the folding mechanisms of three phylogenetically divergent group I intron ribozymes. These ribozymes share a conserved functional core defined by topologically equivalent tertiary motifs but differ in their primary sequence, size, and structural complexity. Time-resolved hydroxyl radical probing of the backbone solvent-accessible surface and catalytic activity measurements, integrated with structural-kinetic modeling, reveal that each ribozyme adopts a unique strategy to attain the conserved functional fold. The folding rates are not dictated by the size or the overall structural complexity, but rather by the strength of the constituent tertiary motifs which, in turn, govern the structure, stability, and lifetime of the folding intermediates. A fundamental general principle of RNA folding emerges from this study: the dominant folding flux always proceeds through an optimally structured kinetic intermediate that has sufficient stability to act as a nucleating scaffold while retaining enough conformational freedom to avoid kinetic trapping. Our results also suggest a potential role of naturally selected peripheral A-minor interactions in balancing RNA structural stability with folding efficiency.
Frustration in Biomolecules
Biomolecules are the prime information processing elements of living matter.
Most of these inanimate systems are polymers that compute their structures and
dynamics using as input seemingly random character strings of their sequence,
following which they coalesce and perform integrated cellular functions. In
large computational systems with finite interaction codes, the appearance of
conflicting goals is inevitable. Simple conflicting forces can lead to quite
complex structures and behaviors, leading to the concept of "frustration" in
condensed matter. We present here some basic ideas about frustration in
biomolecules and how the frustration concept leads to a better appreciation of
many aspects of the architecture of biomolecules, and how structure connects to
function. These ideas are simultaneously both seductively simple and perilously
subtle to grasp completely. The energy landscape theory of protein folding
provides a framework for quantifying frustration in large systems and has been
implemented at many levels of description. We first review the notion of
frustration from the areas of abstract logic and its uses in simple condensed
matter systems. We then discuss how the frustration concept applies
specifically to heteropolymers, testing folding landscape theory in computer
simulations of protein models and in experimentally accessible systems.
Studying the aspects of frustration averaged over many proteins provides ways
to infer energy functions useful for reliable structure prediction. We discuss
how frustration affects folding, how a large part of the biological functions
of proteins are related to subtle local frustration effects and how frustration
influences the appearance of metastable states, the nature of binding
processes, catalysis and allosteric transitions. We hope to illustrate how
frustration is a fundamental concept in relating function to structural
biology. Comment: 97 pages, 30 figures
Quantitative and predictive model of kinetic regulation by E. coli TPP riboswitches.
Riboswitches are non-coding elements upstream or downstream of mRNAs that, upon binding of a specific ligand, regulate transcription and/or translation initiation in bacteria, or alternative splicing in plants and fungi. We have studied thiamine pyrophosphate (TPP) riboswitches regulating translation of the thiM operon and transcription and translation of the thiC operon in E. coli, and that of THIC in the plant A. thaliana. For all of them, we ascertained an induced-fit mechanism involving initial binding of TPP followed by a conformational change leading to a higher-affinity complex. The experimental values obtained for all kinetic and thermodynamic parameters of TPP binding imply that regulation by the A. thaliana riboswitch is governed by the mass-action law, whereas it is of a kinetic nature for the two bacterial riboswitches. Kinetic regulation requires that the RNA polymerase pause after synthesis of each riboswitch aptamer to leave time for TPP binding, but only when its concentration is sufficient. A quantitative model of regulation highlighted how the pausing time has to be linked to the kinetic rates of initial TPP binding to obtain an ON/OFF switch in the correct concentration range of TPP. We verified the existence of these pauses and the model prediction of their duration. Our analysis also led to quantitative estimates of the respective efficiency of kinetic and thermodynamic regulation, which show that kinetically regulated riboswitches react more sharply to variations in the concentration of their ligand than thermodynamically regulated riboswitches. This rationalizes the interest of kinetic regulation and confirms empirical observations that were obtained by numerical simulations.
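The link between pausing time and the rate of initial TPP binding can be illustrated with a minimal sketch. It is not the model from the paper: it assumes pseudo-first-order binding that is effectively irreversible during the pause, and the function names and the numerical values in the usage note are hypothetical. A simple equilibrium binding isotherm is included only as a point of reference for mass-action regulation.

```python
import math


def fraction_bound_during_pause(k_on, ligand_conc, t_pause):
    """Fraction of paused aptamers that bind ligand before the pause ends.

    Assumption (not from the source): binding is pseudo-first-order in the
    ligand and dissociation is negligible on the timescale of the pause.
    k_on is in 1/(M*s), ligand_conc in M, t_pause in s.
    """
    return 1.0 - math.exp(-k_on * ligand_conc * t_pause)


def half_switch_concentration(k_on, t_pause):
    """Ligand concentration at which half of the aptamers bind during the pause."""
    return math.log(2) / (k_on * t_pause)


def fraction_bound_at_equilibrium(k_d, ligand_conc):
    """Equilibrium (mass-action) occupancy for a single binding site."""
    return ligand_conc / (k_d + ligand_conc)
```

Under these assumptions, the switching concentration is set by the product of the binding rate constant and the pause duration: with a hypothetical `k_on = 1e5` per molar per second and a 10 s pause, `half_switch_concentration(1e5, 10.0)` gives `ln 2 / 1e6` molar, so a longer pause or faster binding shifts the switch to lower TPP concentrations.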
On a Generalized Levinthal's Paradox: The Role of Long- and Short Range Interactions in Complex Bio-molecular Reactions, Including Protein and DNA Folding
The current protein folding literature is reviewed. Two main approaches to the problem of folding were selected for this review: geometrical and biophysical. The geometrical approach allows the formulation of topological restrictions on folding that are usually not taken into account in the construction of physical models. In particular, the topological constraints do not allow the known funnel-like energy landscape modeling, although most common methods of resolving the paradox are based on this method. The paradox itself is based on the fact that complex molecules must reach their native conformations (complexes that result from reactions) in an exponentially long time, which clearly contradicts the observed experimental data. In this respect we considered the complexity of the reactions between ligands and proteins. On this general basis, the folding-reaction paradox was reformulated and generalized. We conclude that prospects for solving the paradox should be associated with incorporating a topology aspect in biophysical models of protein folding, through the construction of hybrid models. However, such models should explicitly include long-range force fields and local cell biological conditions, such as structured water complexes and photon/phonon/soliton waves, ordered in discrete frequency bands. In this framework, collective and coherent oscillations in, and between, macromolecules are instrumental in inducing intra- and intercellular resonance, serving as an integral guiding network of life communication: the electrome aspect of the cell. Yet, to identify the actual mechanisms underlying the bonds between molecules (atoms), it will be necessary to perform dedicated experiments to more definitely solve the particular time paradox. © 2017 Elsevier Ltd. The present results were partially obtained in the frame of state task of Ministry of Education and Science of Russia 1.4539.2017/8.9.
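The "exponentially long time" at the heart of the paradox can be made concrete with a back-of-the-envelope calculation. The sketch below uses the textbook illustration rather than any figure from the review: the number of states per residue and the sampling rate are assumed values, and the function name is hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def levinthal_search_time_years(n_residues, states_per_residue=3,
                                samples_per_second=1e13):
    """Time to enumerate every conformation of a chain by exhaustive search.

    Assumptions (illustrative only): each residue independently adopts one of
    `states_per_residue` states, and the chain samples conformations at
    `samples_per_second`.
    """
    conformations = states_per_residue ** n_residues  # grows exponentially with length
    return conformations / samples_per_second / SECONDS_PER_YEAR
```

With these assumed values, a 100-residue chain would need on the order of 10^27 years for an exhaustive search, whereas folding is observed on timescales of seconds or less, which is the contradiction the abstract refers to.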
Algorithms for RNA secondary structure analysis : prediction of pseudoknots and the consensus shapes approach
Reeder J. Algorithms for RNA secondary structure analysis : prediction of pseudoknots and the consensus shapes approach. Bielefeld (Germany): Bielefeld University; 2007. Our understanding of the role of RNA has undergone a major change in the last decade. Once believed to be a mere carrier of information and a structural component of the ribosomal machinery, RNA is now known, with the advent of the genomic age, to play a much more active role. RNAs can act as regulators and can have catalytic activity, roles previously attributed only to proteins. There is still much speculation in the scientific community as to what extent RNAs are responsible for the complexity of higher organisms, which can hardly be explained with only proteins as regulators.
In order to investigate the roles of RNA, it is therefore necessary to search for new classes of RNA. For those and for already known classes, analyses of their presence in different species of the tree of life will provide further insight into the evolution of biomolecules and especially RNAs. Since RNA function often follows its structure, computer programs for RNA structure prediction are an integral part of this procedure. The secondary structure of RNA - the level of base pairing - strongly determines the tertiary structure. As the latter is computationally intractable and experimentally expensive to obtain, secondary structure analysis has become an accepted substitute. In this thesis, I present two new algorithms (and a few variations thereof) for the prediction of RNA secondary structures.
The first algorithm addresses the problem of predicting a secondary structure from a single sequence including RNA pseudoknots. Pseudoknots have been shown to be functionally relevant in many RNA mediated processes. However, pseudoknots are excluded from considerations by state-of-the-art RNA folding programs for reasons of computational complexity. While folding a sequence of length n into unknotted structures requires O(n^3) time and O(n^2) space, finding the best structure including arbitrary pseudoknots has been proven to be NP-complete. Nevertheless, I demonstrate in this work that certain types of pseudoknots can be included in the folding process with only a moderate increase of computational cost.
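The O(n^3) time and O(n^2) space bounds for unknotted folding can be illustrated with a minimal sketch. It is not the thesis's algorithm and uses no thermodynamic model: it is a Nussinov-style dynamic program that only maximizes the number of nested base pairs, and the function name, the allowed pair set, and the minimum loop length are assumptions made for illustration.

```python
def nussinov_max_pairs(seq, min_loop=3):
    """Maximum number of nested (pseudoknot-free) base pairs in an RNA sequence.

    Three nested loops over (span, i, k) give O(n^3) time; the table dp is O(n^2).
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = maximum number of pairs within seq[i..j] (inclusive).
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j stays unpaired
            # case 2: j pairs with k, leaving at least min_loop bases between them
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in pairs:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]
```

Because every pair (k, j) splits the sequence into an independent left part and an enclosed part, crossing pairs can never be produced by this recursion, which is exactly why pseudoknots fall outside this class of algorithms.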
In analogy to protein-coding RNA, where a conserved encoded protein hints at a similar metabolic function, structural conservation in RNA may give clues to RNA function and to finding RNA genes. However, structure conservation is more complex to deal with computationally than sequence conservation. The method considered to be at least conceptually the ideal approach in this situation is the Sankoff algorithm. It simultaneously aligns two sequences and predicts a common secondary structure. Unfortunately, it is computationally rather expensive - O(n^6) time and O(n^4) space for two sequences, and for more than two sequences it becomes exponential in the number of sequences! Therefore, several heuristic implementations emerged in the last decade trying to make the Sankoff approach practical by introducing pragmatic restrictions on the search space.
In this thesis, I propose to redefine the consensus structure prediction problem in a way that does not imply a multiple sequence alignment step. For a family of RNA sequences, my method explicitly and independently enumerates the near-optimal abstract shape space and predicts an abstract shape as the consensus for all sequences. For each sequence, it delivers the thermodynamically best structure which has this shape. The technique of abstract shapes analysis is employed here for a synoptic view of the suboptimal folding space. As the shape space is much smaller than the structure space, and identification of common shapes can be done in linear time (in the number of shapes considered), the method is essentially linear in the number of sequences. Evaluations show that the new method compares favorably with available alternatives.
A new paradigm for the folding of ribonucleic acids
Recent findings show the important role of ribonucleic acid (RNA) within the cell, be it in the control of gene expression or the regulation of several homeostatic processes, in addition to the transcription and translation of deoxyribonucleic acid (DNA) into protein. If we wish to understand how the cell works, we first need to understand its components and how they interact, in particular for RNA.
The function of a molecule depends on its three-dimensional (3D) structure. However, experimental determination of RNA 3D structures is very costly. Current methods for computational RNA structure prediction take into account only the classical or canonical base pairs, similar to those found in the celebrated DNA double helix. Here, we improved RNA structure prediction by taking into account all possible types of base pairs, including the so-called non-canonical ones. This is made possible in the context of a new paradigm for the folding of RNA, based on nucleotide cyclic motifs (NCM): basic blocks for the construction of RNA. Furthermore, we have developed new metrics to quantify the precision of RNA 3D structure prediction methods, given the recent introduction of many such methods. Finally, we have evaluated the predictive power of the latest low-resolution RNA structure probing techniques.
Predicting biomolecular function from 3D dynamics : sequence-sensitive coarse-grained elastic network model coupled to machine learning
The dynamics of biomolecules are intimately tied to their functions but experimentally elusive,
making their computational study attractive. When modelling the effects of thousands of mutations,
time-stepping methods such as classical or enhanced sampling molecular dynamics are too costly for
most applications. On the other hand, normal mode analysis of coarse-grained elastic network models
(ENMs) provides fast analytical dynamics spanning all timescales. However, the vast majority of ENMs
consider backbone geometry alone, making them a poor choice to study point mutations which do not
affect the equilibrium structure. The Elastic Network Contact Model (ENCoM) is the first
sequence-sensitive ENM, enabling its use for the efficient exploration of full conformational spaces
from sequence variants. The present work introduces the ENCoM-DynaSig-ML computational pipeline, in
which the ENCoM conformational spaces are reduced to Dynamical Signatures and coupled to simple
machine learning algorithms. ENCoM-DynaSig-ML predicts the function of sequence variants with
significant accuracy, is complementary to all existing methods, and can generate new hypotheses
about which dynamical features are important for the studied biomolecule's function. Examples given
are the maturation efficiency of microRNA variants, the activation potential of mu-opioid receptor
ligands and the effect of point mutations on VIM-2 lactamase's enzymatic efficiency. This novel
application of normal mode analysis is very fast, taking a few seconds of CPU time per variant, and is
generalizable to any biomolecule for which experimental mutagenesis data exist.
Using SetPSO to determine RNA secondary structure
RNA secondary structure prediction is an important field in bioinformatics. A number of different approaches have been developed to simplify the determination of RNA molecule structures. RNA is a nucleic acid found in living organisms which fulfils a number of important roles in living cells. Knowledge of its structure is crucial to understanding its function. Determining RNA secondary structure computationally, rather than by physical means, has the advantage of being quicker and cheaper. This dissertation introduces a new Set-based Particle Swarm Optimisation algorithm, known as SetPSO for short, to optimise the structure of an RNA molecule using an advanced thermodynamic model. Structure prediction is modelled as an energy minimisation problem. Particle swarm optimisation is a simple but effective stochastic optimisation technique developed by Kennedy and Eberhart. This simple technique was adapted to work with variable-length particles which consist of a set of elements rather than a vector of real numbers. The effectiveness of this structure prediction approach was compared to that of a dynamic programming algorithm called mfold. It was found that SetPSO can be used as a combinatorial optimisation technique which can be applied to the problem of RNA secondary structure prediction. This research also included an investigation into the behaviour of the new SetPSO optimisation algorithm. Further study needs to be conducted to evaluate the performance of SetPSO on different combinatorial and set-based optimisation problems. Dissertation (MS)--University of Pretoria, 2009. Computer Science.
The prediction and analysis of protein structure using specialized database techniques
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Biology, 1995. Includes bibliographical references. By Tau-Mu Yi. Ph.D.