1,443 research outputs found

    SIMS: A Hybrid Method for Rapid Conformational Analysis

    Proteins are at the root of many biological functions, often performing complex tasks as the result of large changes in their structure. Describing the exact details of these conformational changes, however, remains a central challenge for computational biology due to the enormous computational requirements of the problem. This has engendered the development of a rich variety of useful methods designed to answer specific questions at different levels of spatial, temporal, and energetic resolution. These methods fall largely into two classes: physically accurate but computationally demanding methods, and fast, approximate methods. We introduce here a new hybrid modeling tool, the Structured Intuitive Move Selector (SIMS), designed to bridge the divide between these two classes while allowing the benefits of both to be integrated into a single framework. This is achieved by applying a modern motion planning algorithm, borrowed from the field of robotics, in tandem with a well-established protein modeling library. SIMS can combine precise energy calculations with approximate or specialized conformational sampling routines to produce rapid, yet accurate, analysis of the large-scale conformational variability of protein systems. Several key advancements are shown, including the abstract use of generically defined moves (conformational sampling methods) and an expansive probabilistic conformational exploration. We apply SIMS to three example problems and demonstrate a rapid solution for each. These include the automatic determination of "active" residues for the hinge-based system Cyanovirin-N, the exploration of conformational changes involving long-range coordinated motion between non-sequential residues in Ribose-Binding Protein, and the rapid discovery of a transient conformational state of Maltose-Binding Protein, previously only determined by Molecular Dynamics. For all cases we provide energetic validations using well-established energy fields, demonstrating this framework as a fast and accurate tool for the analysis of a wide range of protein flexibility problems.

    Algorithmes pour le (dés)assemblage d'objets complexes et applications à la biologie structurale [Algorithms for the (dis)assembly of complex objects and applications to structural biology]

    Understanding and predicting structure-function relationships in proteins with fully in silico approaches remains a great challenge. Despite recent developments in computational methods for studying molecular motions and interactions, macromolecular flexibility remains largely out of reach of existing molecular modeling tools. The aim of this thesis is to develop a novel approach, based on motion planning algorithms originating from robotics, to better deal with macromolecular flexibility in protein interaction studies. We have extended a recent sampling-based algorithm, ML-RRT, for (dis)assembly path planning of complex articulated objects. This algorithm is based on a partition of the configuration parameters into active and passive subsets, which are then treated in a decoupled manner. The presented extensions make it possible to consider different levels of mobility for the passive parts, which can be pushed or pulled by the motion of the active parts. This algorithmic tool is successfully applied to study protein conformational changes induced by the diffusion of a ligand inside the protein. Building on the extension of ML-RRT, we have developed a novel method for simultaneous (dis)assembly sequencing and path planning. The new method, called Iterative-ML-RRT, computes not only the paths for extracting all the parts from a complex assembled object, but also the preferred order that the disassembly process has to follow. We have applied this general approach to studying disassembly pathways of macromolecular complexes, using a scoring function based on the interaction energy. The results described in this thesis show not only the efficacy but also the generality of the proposed algorithms.

    Conformational and functional analysis of molecular dynamics trajectories by Self-Organising Maps

    Background: Molecular dynamics (MD) simulations are powerful tools to investigate the conformational dynamics of proteins, which is often a critical element of their function. Functionally relevant conformations are generally identified by clustering the large ensemble of structures that is generated. Recently, Self-Organising Maps (SOMs) were reported to perform more accurately and to provide more consistent results than traditional clustering algorithms in various data mining problems. We present a novel strategy to analyse and compare conformational ensembles of protein domains using a two-level approach that combines SOMs and hierarchical clustering. Results: The conformational dynamics of the α-spectrin SH3 protein domain and six single mutants were analysed by MD simulations. The Cartesian coordinates of the Cα atoms of conformations sampled in the essential space were used as input data vectors for SOM training, then complete linkage clustering was performed on the SOM prototype vectors. A specific protocol to optimize a SOM for structural ensembles was proposed: the optimal SOM was selected by means of a Taguchi experimental design plan applied to different data sets, and the optimal sampling rate of the MD trajectory was selected. The proposed two-level approach was applied to single trajectories of the SH3 domain independently as well as to groups of them at the same time. The results demonstrated the potential of this approach in the analysis of large ensembles of molecular structures: the possibility of producing a topological mapping of the conformational space in a simple 2D visualisation, as well as of effectively highlighting differences in the conformational dynamics directly related to biological functions. Conclusions: The use of a two-level approach combining SOMs and hierarchical clustering for conformational analysis of structural ensembles of proteins was proposed. It can easily be extended to other study cases and to conformational ensembles from other sources.
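
The two-level idea in the abstract above (train a SOM on the data vectors, then run complete linkage clustering on the SOM prototype vectors) can be illustrated with a small pure-Python sketch. This is not the authors' protocol: the grid size, learning-rate and radius schedules, and the function names `train_som`, `complete_linkage`, and `two_level_cluster` are all assumptions chosen for illustration, and no Taguchi optimisation is attempted.

```python
import math
import random

def nearest(protos, x):
    """Index of the prototype closest to x (the best-matching unit)."""
    return min(range(len(protos)), key=lambda i: math.dist(protos[i], x))

def train_som(data, rows, cols, epochs=20, seed=0):
    """Train a small rectangular SOM and return its prototype vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Assumption: initialise each prototype from a randomly chosen data vector.
    protos = [list(rng.choice(data)) for _ in range(rows * cols)]
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total
            lr = 0.5 * (1.0 - frac)  # linearly decaying learning rate
            sigma = max(0.5, (max(rows, cols) / 2) * (1.0 - frac))  # decaying radius
            bmu = nearest(protos, x)
            for i, p in enumerate(protos):
                d2 = (grid[i][0] - grid[bmu][0]) ** 2 + (grid[i][1] - grid[bmu][1]) ** 2
                h = math.exp(-d2 / (2 * sigma ** 2))  # Gaussian neighbourhood on the grid
                for k in range(dim):
                    p[k] += lr * h * (x[k] - p[k])
            step += 1
    return protos

def complete_linkage(points, n_clusters):
    """Agglomerative clustering with complete linkage; returns one label per point."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: cluster distance is the largest pairwise distance.
                d = max(math.dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

def two_level_cluster(data, rows=3, cols=3, n_clusters=2, seed=0):
    """Level 1: SOM prototypes. Level 2: complete linkage on the prototypes."""
    protos = train_som(data, rows, cols, seed=seed)
    proto_labels = complete_linkage(protos, n_clusters)
    # Each data vector inherits the cluster of its best-matching prototype.
    return [proto_labels[nearest(protos, x)] for x in data]
```

Clustering the prototypes rather than the raw vectors keeps the quadratic-cost linkage step small, which is the practical appeal of a two-level scheme on large ensembles.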

    (Dis)assembly path planning for complex objects and applications to structural biology

    Understanding and predicting structure-function relationships in proteins with fully in silico approaches remains a great challenge. Despite recent developments in computational methods for studying molecular motions and interactions, macromolecular flexibility remains largely out of reach of existing molecular modeling tools. The aim of this thesis is to develop a novel approach, based on motion planning algorithms originating from robotics, to better deal with macromolecular flexibility in protein interaction studies. We have extended a recent sampling-based algorithm, ML-RRT, for (dis)assembly path planning of complex articulated objects. This algorithm is based on a partition of the configuration parameters into active and passive subsets, which are then treated in a decoupled manner. The presented extensions make it possible to consider different levels of mobility for the passive parts, which can be pushed or pulled by the motion of the active parts. This algorithmic tool is successfully applied to study protein conformational changes induced by the diffusion of a ligand inside the protein. Building on the extension of ML-RRT, we have developed a novel method for simultaneous (dis)assembly sequencing and path planning. The new method, called Iterative-ML-RRT, computes not only the paths for extracting all the parts from a complex assembled object, but also the preferred order that the disassembly process has to follow. We have applied this general approach to studying disassembly pathways of macromolecular complexes, using a scoring function based on the interaction energy. The results described in this thesis show not only the efficacy but also the generality of the proposed algorithms.

    A Constraint Solver for Flexible Protein Models

    This paper proposes the formalization and implementation of a novel class of constraints aimed at modeling problems related to the placement of multi-body systems in 3-dimensional space. Each multi-body is a system composed of body elements, connected by joint relationships and constrained by geometric properties. The emphasis of this investigation is the use of multi-body systems to model native conformations of protein structures, where each body represents an entity of the protein (e.g., an amino acid, a small peptide) and the geometric constraints are related to the spatial properties of the composing atoms. The paper explores the use of the proposed class of constraints to support a variety of structural analyses of proteins, such as loop modeling and structure prediction. The declarative nature of a constraint-based encoding provides elaboration tolerance and the ability to make use of any additional knowledge in the analysis studies. The filtering capabilities of the proposed constraints also make it possible to control the number of representative solutions that are drawn from the conformational space of the protein, by means of criteria driven by uniform distribution sampling principles. In this scenario it is possible to select the desired degree of precision and/or number of solutions. The filtering component automatically excludes configurations that violate the spatial and geometric properties of the composing multi-body system. The paper illustrates the implementation of a constraint solver based on the multi-body perspective and its empirical evaluation on protein structure analysis problems.

    High Performance Computing Techniques to Better Understand Protein Conformational Space

    This thesis presents a combination of high performance computing techniques for gaining better insight into protein molecular dynamics. Key aspects of protein function and dynamics can be learned from their conformational space. Datasets that capture the complex nuances of a protein molecule are high dimensional, so efficient dimensionality reduction becomes indispensable for analysing such large datasets. Dimensionality reduction forms a substantial portion of this work, and its application has been explored for other datasets as well. The work begins with the parallelization of a known non-linear feature reduction algorithm called Isomap. The code for the algorithm was rewritten in C, with portions of it parallelized using OpenMP. Next, a novel data instance reduction method was devised that evaluates the information content offered by each data point, which ultimately helps in truncating the dataset to far fewer data points. Once a framework has been established to reduce the number of variables representing a dataset, the work is extended to explore algebraic topology techniques to extract meaningful information from these datasets. This step is the one that helps in sampling the conformations of interest of a protein molecule. The method employs hierarchical clustering to identify classes within a molecule; algebraic topology is then used to analyze these classes. Finally, the work concludes by presenting an approach to the open problem of protein folding. A Monte Carlo based tree search algorithm is put forth to simulate the pathway that a given protein conformation takes to reach another conformation. The dissertation, in its entirety, offers solutions to a few of the problems that hinder progress on the broader problem of understanding protein dynamics. The motion of a protein molecule is guided by changes in its energy profile, during which the molecule gradually slips from one energy class to another. Structurally, this switch is transient, spanning milliseconds or less, and is therefore difficult to capture by wet-laboratory work alone.

    Robotics-Inspired Methods for the Simulation of Conformational Changes in Proteins

    Proteins are biological macromolecules that play essential roles in living organisms. Understanding the relationship between protein structure, dynamics and function is indispensable for advances in fields such as biology, pharmacology and biotechnology. Studying this relationship requires a combination of experimental and computational methods, whose development is the object of very active interdisciplinary research. In this context, this thesis presents a robotics-inspired modeling approach for studying conformational changes in proteins. This approach is based on a mechanistic representation of proteins that enables the application of efficient methods originating from the field of robotics. It also provides an accurate method for coarse-grained treatment of proteins without losing full-atom detail. The approach is applied in this thesis to two different molecular simulation problems. First, the approach is used to enhance sampling of the conformational space of proteins using the Monte Carlo method. The modeling approach is used to implement new and known Monte Carlo trial move classes as well as a mixed sampling strategy. Results of simulations performed on proteins with different topologies show that this strategy enhances sampling without demanding higher computational resources. In the second problem tackled in this thesis, the mechanistic modeling approach is used to implement a robotics-inspired method for simulating large-amplitude motions in proteins. This method is based on the combination of the Rapidly-exploring Random Tree (RRT) algorithm with Normal Mode Analysis (NMA), which allows efficient exploration of the high-dimensional conformational spaces of proteins. Results of simulations performed on ten different proteins of different sizes and topologies show the effectiveness of the proposed method for studying conformational transitions.
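
As a rough illustration of the mixed sampling strategy mentioned in the abstract above, the following sketch runs Metropolis Monte Carlo on a one-dimensional toy energy and picks one of two trial move classes at random at each step. Everything here is an assumption made for illustration: the double-well `energy`, the `small_move` and `large_move` classes, and the parameters `kT` and `p_large` are hypothetical stand-ins, not the move classes or force field used in the thesis.

```python
import math
import random

def energy(x):
    """Toy double-well energy standing in for a molecular energy function."""
    return (x * x - 1.0) ** 2

def small_move(x, rng):
    """Local perturbation: explores the current basin."""
    return x + rng.uniform(-0.1, 0.1)

def large_move(x, rng):
    """Large-amplitude move: can cross the barrier between basins."""
    return x + rng.uniform(-1.5, 1.5)

def metropolis_accept(e_old, e_new, kT, rng):
    """Metropolis criterion: always accept downhill, sometimes accept uphill."""
    if e_new <= e_old:
        return True
    return rng.random() < math.exp(-(e_new - e_old) / kT)

def mixed_monte_carlo(x0, n_steps, kT=0.2, p_large=0.2, seed=0):
    """Run n_steps of Metropolis sampling with a random mix of move classes."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    trajectory = [x]
    for _ in range(n_steps):
        # Mixed strategy: choose the move class at random for each trial.
        move = large_move if rng.random() < p_large else small_move
        x_new = move(x, rng)
        e_new = energy(x_new)
        if metropolis_accept(e, e_new, kT, rng):
            x, e = x_new, e_new
        trajectory.append(x)  # rejected trials repeat the current state
    return trajectory
```

Both move classes here are symmetric proposals, so the plain Metropolis criterion applies; an asymmetric move class would need a correction factor in the acceptance test.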

    Characterizing RNA ensembles from NMR data with kinematic models

    Functional mechanisms of biomolecules often manifest themselves precisely in transient conformational substates. Researchers have long sought to structurally characterize dynamic processes in non-coding RNA, combining experimental data with computer algorithms. However, adequate exploration of conformational space for these highly dynamic molecules, starting from static crystal structures, remains challenging. Here, we report a new conformational sampling procedure, KGSrna, which can efficiently probe the native ensemble of RNA molecules in solution. We found that KGSrna ensembles accurately represent the conformational landscapes of 3D RNA encoded by NMR proton chemical shifts. KGSrna resolves motionally averaged NMR data into structural contributions; when coupled with residual dipolar coupling data, a KGSrna ensemble revealed a previously uncharacterized transient excited state of the HIV-1 trans-activation response element stem-loop. Ensemble-based interpretations of averaged data can aid in formulating and testing dynamic, motion-based hypotheses of functional mechanisms in RNAs with broad implications for RNA engineering and therapeutic intervention.

    Rapid Sampling of Molecular Motions with Prior Information Constraints

    Proteins are active, flexible machines that perform a range of different functions. Innovative experimental approaches may now provide limited partial information about conformational changes along motion pathways of proteins. There is therefore a need for computational approaches that can efficiently incorporate prior information into motion prediction schemes. In this paper, we present PathRover, a general setup designed for the integration of prior information into the motion planning algorithm of rapidly exploring random trees (RRT). Each suggested motion pathway comprises a sequence of low-energy clash-free conformations that satisfy an arbitrary number of prior information constraints. These constraints can be derived from experimental data or from expert intuition about the motion. The incorporation of prior information is straightforward and significantly narrows down the vast search in the typically high-dimensional conformational space, leading to a dramatic reduction in running time. To allow the use of state-of-the-art energy functions and conformational sampling, we have integrated this framework into Rosetta, an accurate protocol for diverse types of structural modeling. The suggested framework can serve as an effective complementary tool for molecular dynamics, Normal Mode Analysis, and other prevalent techniques for predicting motion in proteins. We applied our framework to three different model systems. We show that a limited set of experimentally motivated constraints may effectively bias the simulations toward diverse predicates, from distance constraints to enforcement of loop closure. In particular, our analysis sheds light on mechanisms of protein domain swapping and on the role of different residues in the motion.
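
The general idea described above, an RRT whose new nodes must be low in energy and satisfy a set of prior-information constraints, can be sketched in a few lines. This is a toy in a low-dimensional Euclidean space, not PathRover and not Rosetta code: the function `rrt_with_constraints`, its parameters (`step`, `e_max`, `bounds`), and the representation of constraints as boolean predicates are assumptions made for illustration only.

```python
import math
import random

def rrt_with_constraints(start, energy, constraints, bounds,
                         n_iter=500, step=0.2, e_max=1.0, seed=0):
    """Grow an RRT from `start`, keeping only nodes that pass all filters.

    energy:      callable mapping a configuration to a float (hypothetical)
    constraints: list of predicates; each returns True if the prior
                 information is satisfied by the configuration
    bounds:      list of (low, high) sampling ranges, one per dimension
    """
    rng = random.Random(seed)
    nodes = [tuple(start)]
    parent = {0: None}  # node index -> parent index
    for _ in range(n_iter):
        # 1. Sample a random target configuration.
        target = tuple(rng.uniform(lo, hi) for lo, hi in bounds)
        # 2. Find the tree node nearest to the target.
        i_near = min(range(len(nodes)), key=lambda i: math.dist(nodes[i], target))
        near = nodes[i_near]
        d = math.dist(near, target)
        if d == 0.0:
            continue
        # 3. Take a bounded step from the nearest node toward the target.
        scale = min(1.0, step / d)
        new = tuple(a + scale * (b - a) for a, b in zip(near, target))
        # 4. Reject high-energy configurations and constraint violations.
        if energy(new) > e_max:
            continue
        if not all(c(new) for c in constraints):
            continue
        parent[len(nodes)] = i_near
        nodes.append(new)
    return nodes, parent
```

A motion pathway to any node can be recovered by following `parent` back to index 0. Because rejected extensions are simply discarded, each added constraint prunes the region the tree is allowed to explore, which is the mechanism by which prior information narrows the search.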