7 research outputs found

    Deductive Biocomputing

    BACKGROUND: As biologists increasingly rely upon computational tools, it is imperative that they be able to appropriately apply these tools and clearly understand the methods the tools employ. Such tools must have access to all the relevant data and knowledge and, in some sense, “understand” biology so that they can serve biologists' goals appropriately and “explain” in biological terms how results are computed. METHODOLOGY/PRINCIPAL FINDINGS: We describe a deduction-based approach to biocomputation that semiautomatically combines knowledge, software, and data to satisfy goals expressed in a high-level biological language. The approach is implemented in an open source web-based biocomputing platform called BioDeducta, which combines SRI's SNARK theorem prover with the BioBike interactive integrated knowledge base. The biologist/user expresses a high-level conjecture, representing a biocomputational goal query, without indicating how this goal is to be achieved. A subject domain theory, represented in SNARK's logical language, transforms the terms in the conjecture into capabilities of the available resources and the background knowledge necessary to link them together. If the subject domain theory enables SNARK to prove the conjecture (that is, to find paths between the goal and BioBike resources), then the resulting proofs represent solutions to the conjecture/query. Such proofs provide provenance for each result, indicating in detail how it was computed. We demonstrate BioDeducta by showing how it can approximately replicate a previously published analysis of genes involved in the adaptation of cyanobacteria to different light niches. CONCLUSIONS/SIGNIFICANCE: Through the use of automated deduction guided by a biological subject domain theory, this work is a step towards enabling biologists to conveniently and efficiently marshal integrated knowledge, data, and computational tools toward resolving complex biological queries.

    Structure Pattern Analysis Using Term Rewriting and Clustering Algorithm

    Biological data are accumulating at a fast pace. However, raw data are generally difficult to understand and are not useful unless we unlock the information hidden in them. Knowledge can be extracted as the patterns or features buried within the data. Thus data mining, which aims at uncovering underlying rules, relationships, and patterns in data, has emerged as one of the most exciting fields in computational science. In this dissertation, we develop efficient approaches to the structure pattern analysis of RNA and protein three-dimensional structures. The major techniques used in this work are term rewriting and clustering algorithms. First, a new approach is designed to study the interaction of RNA secondary structure motifs using the concept of term rewriting. Second, an improved K-means clustering algorithm is proposed to estimate the number of clusters in data. A new distance descriptor is introduced for the appropriate representation of three-dimensional structure segments of RNA and proteins. The experimental results show improvements in the determination of the number of clusters in data, the evaluation of RNA structure similarity, and RNA structure database search, as well as a better understanding of the protein sequence-structure correspondence.

    Computational Analysis of Biological networks

    Characterizing, describing, and extracting information from a network is certainly one of the main goals of science, since the study of networks concerns many fields of research, such as biology, economics, the social sciences, computer science, and so on. The aim is to extract the fundamental properties of networks and to understand their functionality. This thesis concerns both the topological analysis and the dynamic analysis of biological networks, although the results can be applied to other fields. For the topological analysis, a node-oriented approach is used, employing centralities to identify the most relevant nodes and integrating these results with laboratory data. CentiScaPe, a software tool implemented to carry out this kind of analysis, is also described. The concepts of "interference" and "robustness" are also introduced; they make it possible to understand how a network rearranges itself after nodes are removed or added. For the dynamic analysis, it is shown how abstract interpretation can be used in pathway simulation to obtain the results of thousands of simulations in a short time, and as a possible solution to the problem of estimating missing parameters. This thesis, treating both topological and dynamic points of view, concerns several aspects of biological network analysis. Regarding the topological analysis of biological networks, the main contribution is the node-oriented point of view of the analysis. That is, instead of concentrating on global properties of the networks, we analyze them in order to extract properties of single nodes. An excellent way to approach this problem is to use node centralities, which make it possible to identify nodes that have a relevant role in the network structure. 
This may not be enough when dealing with a biological network, since the role of a protein also depends on its biological activity, which can be detected with lab experiments. Our approach is to integrate centrality analysis with data from biological experiments. An analysis protocol has been produced, and the CentiScaPe tool for computing network centralities and integrating topological analysis with biological data has been designed and implemented. CentiScaPe has been applied to a human kino-phosphatome network and, according to our protocol, the kinases and phosphatases with the highest centrality values have been extracted, creating a new subnetwork of the most central kinases and phosphatases. A lab experiment established which of these proteins presented a high activation level, and through CentiScaPe the proteins with both high centrality values and a high activation level were easily identified. The notion of node centrality interference has also been introduced to deal with the central role of nodes in a biological network. It makes it possible to identify which nodes are most affected by the removal of a particular node, by measuring the variation in their centrality values when that node is removed from the network. The application of node centrality interference to the human kino-phosphatome revealed that different proteins affect the centrality values of different nodes. Similarly to node centrality interference, the notion of centrality robustness of a node is introduced. This notion reveals whether the central role of a node depends on other particular nodes in the network or whether the node is "robust" in the sense that, even if we remove or add other nodes, its central role remains almost unchanged. The dynamic aspects of biological network analysis have been treated from an abstract interpretation point of view. 
Abstract interpretation is a powerful framework for the analysis of software and is excellent at deriving numerical properties of programs. For pathways, abstract interpretation has been adapted to the analysis of pathway simulations. The interval domain and the constant domain have been successfully used to automatically extract information about reactant concentrations. The interval domain makes it possible to determine the concentration range of the proteins, and the constant domain has been used to determine whether a protein concentration becomes constant after a certain time. The other analysis domain used is the congruence domain which, when applied to pathway simulation, can easily identify regular oscillating behaviour in reactant concentrations. The use of abstract interpretation makes it possible to execute thousands of simulations and to completely and automatically characterize the behaviour of the pathways. In this way it can also be used to address the problem of parameter estimation, where missing parameters can be determined with a brute-force algorithm combined with the abstract interpretation analysis. The abstract interpretation approach has been successfully applied to the mitotic oscillator pathway, characterizing the behaviour of the pathway depending on some reactants. To help analyze the relations between reactants in the network, the notions of variable interference and variable abstract interference have been introduced and adapted to biological pathway simulation. They make it possible to find relations between properties of different reactants of the pathway. Using abstract interference techniques we can say, for instance, which concentration range of a protein can induce an oscillating behaviour of the pathway.
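
The idea of node centrality interference in the abstract above (measuring how other nodes' centrality values change when one node is removed) can be illustrated with a small sketch. This is not CentiScaPe's implementation: the choice of closeness centrality, the use of a raw (unnormalized) difference, the function names, and the toy network are all assumptions made for illustration only.

```python
from collections import deque


def closeness(graph):
    """Closeness per node: (number of reachable nodes) / (sum of BFS distances).

    `graph` is an undirected adjacency mapping {node: set_of_neighbours}.
    Isolated nodes get 0.0. This is one simple variant among several.
    """
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


def remove_node(graph, node):
    """Return a copy of `graph` without `node` and without its edges."""
    return {
        u: {v for v in nbrs if v != node}
        for u, nbrs in graph.items()
        if u != node
    }


def interference(graph, removed):
    """Change in closeness of every remaining node when `removed` is deleted.

    Positive value: the node loses centrality when `removed` disappears.
    Negative value: the node gains centrality.
    """
    before = closeness(graph)
    after = closeness(remove_node(graph, removed))
    return {n: before[n] - after[n] for n in after}


# Hypothetical toy network (not from the thesis): A-B, B-C, B-D, C-D.
toy = {
    "A": {"B"},
    "B": {"A", "C", "D"},
    "C": {"B", "D"},
    "D": {"B", "C"},
}
scores = interference(toy, "B")
for name in sorted(scores):
    print(name, round(scores[name], 3))
```

In this toy network, node A depends entirely on B for its connections, so it shows the largest positive value when B is removed. A real analysis would likely normalize the centrality values before comparing networks of different sizes.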

    PATHWAY LOGIC MODELING OF PROTEIN FUNCTIONAL DOMAINS IN SIGNAL TRANSDUCTION


    Twenty years of rewriting logic

    Rewriting logic is a simple computational logic that can naturally express both concurrent computation and logical deduction with great generality. This paper provides a gentle, intuitive introduction to its main ideas, as well as a survey of the work that many researchers have carried out over the last twenty years in advancing: (i) its foundations; (ii) its semantic framework and logical framework uses; (iii) its language implementations and its formal tools; and (iv) its many applications to automated deduction, software and hardware specification and verification, security, real-time and cyber-physical systems, probabilistic systems, bioinformatics and chemical systems.
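
As a rough illustration of the rewriting style that this abstract associates with chemical and biological systems, the sketch below applies rewrite rules to a multiset of molecules. It is only a toy: it is not Maude or any tool surveyed in the paper, the rules and species names (`E`, `S`, `ES`, `P`) are hypothetical, and always firing the first applicable rule is a simplification, since rewriting logic itself allows any applicable rule to fire, possibly concurrently.

```python
from collections import Counter


def applicable(state, lhs):
    """A rule can fire if the state holds every molecule on its left-hand side."""
    return all(state[species] >= count for species, count in lhs.items())


def rewrite_once(state, rules):
    """Apply the first applicable rule; return None if no rule applies."""
    for lhs, rhs in rules:
        if applicable(state, lhs):
            return state - lhs + rhs
    return None


def normalize(state, rules, max_steps=1000):
    """Rewrite until no rule applies (a normal form) or the step budget runs out."""
    for _ in range(max_steps):
        next_state = rewrite_once(state, rules)
        if next_state is None:
            return state
        state = next_state
    raise RuntimeError("no normal form reached within max_steps")


# Hypothetical enzyme-substrate rules:
#   E + S -> ES      (binding)
#   ES    -> E + P   (catalysis and release)
rules = [
    (Counter({"E": 1, "S": 1}), Counter({"ES": 1})),
    (Counter({"ES": 1}), Counter({"E": 1, "P": 1})),
]

start = Counter({"E": 1, "S": 3})
final = normalize(start, rules)
print(dict(final))
```

Starting from one enzyme and three substrate molecules, the rules fire alternately until all substrate has been turned into product and the enzyme is free again.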