
    An algorithm to enumerate all possible protein conformations verifying a set of distance constraints

    Background: The determination of protein structures satisfying distance constraints is an important problem in structural biology. Whereas the most common method currently employed is simulated annealing, other methods have been proposed in the literature. Most of them, however, are designed to find only one solution. Results: In order to explore the feasible conformational space exhaustively, we propose an interval Branch-and-Prune algorithm (iBP) to solve the Distance Geometry Problem (DGP) associated with protein structure determination. This algorithm is based on a discretization of the problem obtained by recursively constructing a search space having the structure of a tree, and by verifying whether the generated atomic positions are feasible by means of pruning devices. The pruning devices used here are directly related to features of protein conformations. Conclusions: We describe the new algorithm iBP, which generates protein conformations satisfying distance constraints and potentially allows a systematic exploration of the conformational space. The algorithm iBP has been applied to three α-helical peptides.
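
The tree search described in this abstract can be illustrated with a minimal sketch. This is not the authors' iBP implementation: it assumes that each atom has exact distances to its three predecessors (so that sphere intersection yields at most two candidate positions per tree level) and uses interval bounds only for pruning. The function names `sphere_intersections` and `branch_and_prune`, the toy coordinates, and the tolerances are all hypothetical.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _comb(p, terms):
    # Return p + sum(c * v) for each (c, v) in terms.
    return tuple(p[k] + sum(c * v[k] for c, v in terms) for k in range(3))

def sphere_intersections(p1, p2, p3, r1, r2, r3, tol=1e-9):
    """Points at distances r1, r2, r3 from p1, p2, p3 (zero, one or two)."""
    d = math.dist(p1, p2)
    ex = tuple(c / d for c in _sub(p2, p1))
    v = _sub(p3, p1)
    i = _dot(ex, v)
    w = _comb(v, [(-i, ex)])          # part of v orthogonal to ex
    j = math.sqrt(_dot(w, w))
    ey = tuple(c / j for c in w)
    ez = _cross(ex, ey)
    x = (r1 * r1 - r2 * r2 + d * d) / (2 * d)
    y = (r1 * r1 - r3 * r3 + i * i + j * j) / (2 * j) - (i / j) * x
    z2 = r1 * r1 - x * x - y * y
    if z2 < -tol:
        return []                     # the three spheres do not meet
    base = _comb(p1, [(x, ex), (y, ey)])
    if z2 <= tol:
        return [base]                 # spheres touch in a single point
    z = math.sqrt(z2)
    return [_comb(base, [(z, ez)]), _comb(base, [(-z, ez)])]

def branch_and_prune(n, dist, bounds, tol=1e-6):
    """Enumerate all placements of n atoms.

    dist[(i, j)]   exact distance for 0 < j - i <= 3 (assumed known)
    bounds[(i, j)] (lo, hi) interval with i < j, used only for pruning
    """
    d01, d12, d02 = dist[(0, 1)], dist[(1, 2)], dist[(0, 2)]
    x2 = (d01 ** 2 + d02 ** 2 - d12 ** 2) / (2 * d01)
    y2 = math.sqrt(max(d02 ** 2 - x2 ** 2, 0.0))
    # Fix the first three atoms to remove translations and rotations.
    start = [(0.0, 0.0, 0.0), (d01, 0.0, 0.0), (x2, y2, 0.0)]
    solutions = []

    def feasible(i, p, placed):
        for (a, b), (lo, hi) in bounds.items():
            if b == i and not lo - tol <= math.dist(placed[a], p) <= hi + tol:
                return False
        return True

    def extend(placed):
        i = len(placed)
        if i == n:
            solutions.append(list(placed))
            return
        candidates = sphere_intersections(
            placed[i - 3], placed[i - 2], placed[i - 1],
            dist[(i - 3, i)], dist[(i - 2, i)], dist[(i - 1, i)])
        for p in candidates:          # branch on each candidate position
            if feasible(i, p, placed):  # prune infeasible positions
                extend(placed + [p])

    extend(start)
    return solutions

# Toy instance (made-up coordinates, not protein data).
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.0, 1.4, 0.0),
             (3.1, 1.6, 1.2), (3.0, 3.0, 1.9)]
n = len(reference)
dist = {(i, j): math.dist(reference[i], reference[j])
        for i in range(n) for j in range(i + 1, min(i + 4, n))}
d04 = math.dist(reference[0], reference[4])
bounds = {(0, 4): (d04 - 0.01, d04 + 0.01)}
solutions = branch_and_prune(n, dist, bounds)
```

In this toy instance the single interval bound on atoms 0 and 4 removes half of the leaves of the tree, leaving the reference conformation and its mirror image. A faithful interval version would also have to sample distances inside the intervals used for placement, which this sketch does not attempt.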

    Complete Configuration Space Analysis for Structure Determination of Symmetric Homo-oligomers by NMR

    Symmetric homo-oligomers (protein complexes with similar subunits arranged symmetrically) play pivotal roles in complex biological processes such as ion transport and cellular regulation. Structure determination of these complexes is necessary in order to gain valuable insights into their mechanisms. Nuclear Magnetic Resonance (NMR) spectroscopy is an experimental technique used for structural studies of such complexes. The data available for structure determination of symmetric homo-oligomers by NMR are often sparse and ambiguous, raising concerns about existing heuristic approaches to structure determination. We have developed an approach that is complete, in that it identifies all consistent conformations; data-driven, in that it separately evaluates the consistency of structures with the data and with biophysical constraints; and efficient, in that it avoids explicit consideration of each of the possible structures separately. By being complete, we ensure that native conformations are not missed. By being data-driven, we are able to separately quantify the information content in the data alone versus data and biophysical modeling. We take a configuration space (degree-of-freedom) approach that provides a compact representation of the conformation space and enables us to efficiently explore the space of possible conformations. This thesis demonstrates that the configuration space-based method is robust to sparsity and ambiguity in the data and enables complete, data-driven and efficient structure determination of symmetric homo-oligomers.

    nD-PDPA: n Dimensional Probability Density Profile Analysis

    Proteins are often referred to as the working molecules of a cell, performing many structural, functional and regulatory processes. Revealing the function of proteins remains a challenging problem. Advances in genome sequencing projects have produced large protein sequence repositories, but owing to the technical difficulty and cost of structure determination, the number of determined protein structures lags far behind. The identification of novel structures is particularly important for a number of reasons: they generate models of similar proteins for comparison; identify evolutionary relationships; further contribute to our understanding of protein function and mechanism; and allow the fold of other family members to be inferred. Considering the evolutionary mechanisms responsible for the generation of new structures in proteins, it has been speculated that there may be a limited number of unique protein folds, as few as ten thousand families. Currently, the Protein Data Bank contains nearly 113,000 protein structures, but fewer than 1,500 families are represented, and almost no new fold families have been reported since 2008. Ideally, solved protein structures for new protein families would be used as templates for in silico structure prediction methods, and the results of both solved and predicted structures would in turn be used to infer function. However, such an approach requires new, efficient and cost-effective computational methods for target selection and structure determination. Traditional characterization of a protein structure by NMR spectroscopy is expensive and time-consuming regardless of the structural novelty of the target protein. In an effort to expand the applicability of NMR spectroscopy, the community is continually focused on the development of new and economical approaches that enable the study of more challenging or structurally novel proteins.
    While many advances have been made in this regard, very little attention has been paid to reducing the cost of structural characterization of routine proteins. Probability Density Profile Analysis (PDPA) was previously introduced to directly address the economics of structure determination of routine proteins and, subsequently, the identification of novel structures from minimal sets of NMR data. The latest version of PDPA (2D-PDPA) has been successful in identifying the structural homologue of an unknown protein within a library of 1000 decoy structures. To further expand the selectivity and sensitivity of PDPA, the incorporation of additional data is necessary. However, the current PDPA approach is limited by its computational requirements, and its expansion to include additional data would render it computationally infeasible. Here we propose a new method and developments that eliminate PDPA’s computational limitations and allow the inclusion of Residual Dipolar Coupling (RDC) data from multiple vector types in multiple alignment media. Additionally, nD-PDPA will be used to refine an unknown protein to obtain a structure closer to the native one in terms of backbone RMSD (bb-rmsd).

    Revisiting the planarity of nucleic acid bases: Pyramidilization at glycosidic nitrogen in purine bases is modulated by orientation of glycosidic torsion

    We describe a novel, fundamental property of nucleobase structure, namely, pyramidilization at the N1/9 sites of purine and pyrimidine bases. Through a combined analysis of ultra-high-resolution X-ray structures of both oligonucleotides extracted from the Nucleic Acid Database and isolated nucleotides and nucleosides from the Cambridge Structural Database, together with a series of quantum chemical calculations, molecular dynamics (MD) simulations, and published solution nuclear magnetic resonance (NMR) data, we show that pyramidilization at the glycosidic nitrogen is an intrinsic property. This property is common to isolated nucleosides and nucleotides as well as oligonucleotides; it is also common to both RNA and DNA. Our analysis suggests that pyramidilization at N1/9 sites depends in a systematic way on the local structure of the nucleoside. Of note, the pyramidilization undergoes stereo-inversion upon reorientation of the glycosidic bond. The extent of the pyramidilization is further modulated by the conformation of the sugar ring. The observed pyramidilization is more pronounced for purine bases, while for pyrimidines it is negligible. We discuss how the assumption of nucleic acid base planarity can lead to systematic errors in determining the conformation of nucleotides from experimental data and from unconstrained MD simulations.

    Characterizing RNA ensembles from NMR data with kinematic models

    Functional mechanisms of biomolecules often manifest themselves precisely in transient conformational substates. Researchers have long sought to structurally characterize dynamic processes in non-coding RNA, combining experimental data with computer algorithms. However, adequate exploration of conformational space for these highly dynamic molecules, starting from static crystal structures, remains challenging. Here, we report a new conformational sampling procedure, KGSrna, which can efficiently probe the native ensemble of RNA molecules in solution. We found that KGSrna ensembles accurately represent the conformational landscapes of 3D RNA encoded by NMR proton chemical shifts. KGSrna resolves motionally averaged NMR data into structural contributions; when coupled with residual dipolar coupling data, a KGSrna ensemble revealed a previously uncharacterized transient excited state of the HIV-1 trans-activation response element stem-loop. Ensemble-based interpretations of averaged data can aid in formulating and testing dynamic, motion-based hypotheses of functional mechanisms in RNAs with broad implications for RNA engineering and therapeutic intervention.