    A new procedure to analyze RNA non-branching structures

    RNA structure prediction and structural motif analysis are challenging tasks in the investigation of RNA function. We propose a novel procedure to detect structural motifs shared between two RNAs (a reference and a target). In particular, we developed two core modules: (i) nbRSSP_extractor, to assign a unique structure, encoded by a set of non-branching structures, to the reference RNA; (ii) SSD_finder, to detect structural motifs that the target RNA shares with the reference, by means of a new score function that rewards the relative distance of the target non-branching structures compared to the reference ones. We integrated these algorithms with existing software to form a coherent pipeline that performs two main tasks: prediction of RNA structures (integration of RNALfold and nbRSSP_extractor) and search for chains of matches (integration of Structator and SSD_finder).

    The EM Algorithm and the Rise of Computational Biology

    In the past decade computational biology has grown from a cottage industry with a handful of researchers to an attractive interdisciplinary field, catching the attention and imagination of many quantitatively-minded scientists. Of interest to us is the key role played by the EM algorithm during this transformation. We survey the use of the EM algorithm in a few important computational biology problems surrounding the "central dogma" of molecular biology: from DNA to RNA and then to proteins. Topics of this article include sequence motif discovery, protein sequence alignment, population genetics, evolutionary models and mRNA expression microarray data analysis. Comment: Published at http://dx.doi.org/10.1214/09-STS312 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Understanding Hydrogen-Bond Patterns in Proteins using a Novel Statistical Model

    Proteins are built from basic structural elements, and their systematic characterization is of interest. Searching for recurring patterns in protein contact maps, we found several network motifs: patterns that occur more frequently in experimentally determined protein contact maps than in randomized contact maps with the same properties. Some of these network motifs correspond to sub-structures of alpha helices, including topologies not previously recognized in this context. Other motifs characterize beta-sheets, and again some of these appear to be novel. This topological characterization of patterns serves as a tool to characterize proteins and to reveal a highly detailed difference map for comparing protein structures solved by X-ray crystallography, NMR and molecular dynamics (MD) simulations. Both NMR and MD show small but consistent differences from the crystal structures of the same proteins, possibly due to the pair-wise energy functions used. Network motif analysis can serve as a basis for a many-body statistical energy potential, and suggests a dictionary of basic elements of which protein secondary structure is made.

    Kernel methods in genomics and computational biology

    Support vector machines and kernel methods are increasingly popular in genomics and computational biology, due to their good performance in real-world applications and a strong modularity that makes them suitable for a wide range of problems, from the classification of tumors to the automatic annotation of proteins. Their ability to work in high dimensions and to process non-vectorial data, and the natural framework they provide for integrating heterogeneous data, are particularly relevant to various problems arising in computational biology. In this chapter we survey some of the most prominent applications published so far, highlighting the particular developments in kernel methods triggered by problems in biology, and mention a few promising research directions likely to expand in the future.