15 research outputs found

    Integrating systems biology models and biomedical ontologies

    BACKGROUND: Systems biology is an approach to biology that emphasizes the structure and dynamic behavior of biological systems and the interactions that occur within them. To succeed, systems biology crucially depends on the accessibility and integration of data across domains and levels of granularity. Biomedical ontologies were developed to facilitate such an integration of data and are often used to annotate biosimulation models in systems biology. RESULTS: We provide a framework to integrate representations of in silico systems biology with those of in vivo biology as described by biomedical ontologies and demonstrate this framework using the Systems Biology Markup Language. We developed the SBML Harvester software that automatically converts annotated SBML models into OWL and we apply our software to those biosimulation models that are contained in the BioModels Database. We utilize the resulting knowledge base for complex biological queries that can bridge levels of granularity, verify models based on the biological phenomenon they represent and provide a means to establish a basic qualitative layer on which to express the semantics of biosimulation models. CONCLUSIONS: We establish an information flow between biomedical ontologies and biosimulation models and we demonstrate that the integration of annotated biosimulation models and biomedical ontologies enables the verification of models as well as expressive queries. Establishing a bi-directional information flow between systems biology and biomedical ontologies has the potential to enable large-scale analyses of biological systems that span levels of granularity from molecules to organisms.

    Management and provision of computational models

    Quantitative models of biological systems provide an understanding of chemical and biological phenomena based on their underlying mechanisms. Moreover, they can be used, for example, to predict the behaviour of a system under given conditions or to direct future experiments. This has made quantitative models the perfect tools to answer a variety of questions in the biological sciences and has led to a steady growth in the number of published models.

To maximise the benefits of this growing body of models, the field needs centralised model repositories that will encourage, facilitate and promote model dissemination and reuse. BioModels Database (http://www.ebi.ac.uk/biomodels/) has been developed to fulfil exactly those needs. In order to ensure the correctness of the models distributed, their structure and behaviour are thoroughly checked. To ease their understanding, the model elements are annotated with terms from controlled vocabularies as well as linked to relevant data resources. Finally, to allow their reuse, the models are provided encoded in community-supported and standardised formats.

However, the modelling field is constantly evolving and data providers, like BioModels Database, are faced with new challenges. For example, models are getting more and more complex (with, for instance, the availability of whole-organism metabolic network reconstructions) and this has a direct impact on the performance of hosting infrastructures and annotation procedures. Also, models are now being developed collaboratively: this requires new methodologies and systems, akin to the ones used in software development (with, for example, versioned repositories of models). Moreover, very different kinds of models are being developed by diverse communities, but ultimately their data management needs are very similar.

This talk will introduce the needs which led to the development of BioModels Database, present the resource and its current infrastructure and finally discuss the challenges that we are facing today and the plans to overcome them.

    OREMPdb: a semantic dictionary of computational pathway models

    BACKGROUND: The information coming from biomedical ontologies and computational pathway models is expanding continuously: research communities keep this process up and their advances are generally shared by means of dedicated resources published on the web. In fact, such models are shared to provide the characterization of molecular processes, while biomedical ontologies detail a semantic context to the majority of those pathways. Recent advances in both fields pave the way for a scalable information integration based on aggregate knowledge repositories, but the lack of overall standard formats impedes this progress. Indeed, having different objectives and different abstraction levels, most of these resources "speak" different languages. Semantic web technologies are here explored as a means to address some of these problems. METHODS: Employing an extensible collection of interpreters, we developed OREMP (Ontology Reasoning Engine for Molecular Pathways), a system that abstracts the information from different resources and combines them together into a coherent ontology. Continuing this effort we present OREMPdb; once different pathways are fed into OREMP, species are linked to the external ontologies referred and to reactions in which they participate. Exploiting these links, the system builds species-sets, which encapsulate species that operate together. Composing all of the reactions together, the system computes all of the reaction paths from-and-to all of the species-sets. RESULTS: OREMP has been applied to the curated branch of BioModels (2011/04/15 release) which overall contains 326 models, 9244 reactions, and 5636 species. OREMPdb is the semantic dictionary created as a result, which is made of 7360 species-sets. For each one of these sets, OREMPdb links the original pathway and the link to the original paper where this information first appeared.
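
The path computation described in the OREMPdb abstract can be pictured with a small graph search. The sketch below is only an illustration under stated assumptions, not OREMP's actual algorithm (the abstract does not specify it): each species-set is modelled as a `frozenset` of species names, each reaction as a directed edge from a reactant species-set to a product species-set, and the function `reachable_sets` and the toy reactions are hypothetical.

```python
from collections import deque


def reachable_sets(reactions, start):
    """Breadth-first search over species-sets.

    `reactions` is an iterable of (reactant_set, product_set) pairs,
    each a frozenset of species names (an assumed representation).
    Returns a dict mapping every reachable species-set to one shortest
    path (a list of species-sets) leading to it from `start`.
    """
    # Build an adjacency list: reactant species-set -> product species-sets.
    graph = {}
    for reactants, products in reactions:
        graph.setdefault(reactants, []).append(products)

    paths = {start: [start]}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in graph.get(current, []):
            if nxt not in paths:  # first visit gives a shortest path
                paths[nxt] = paths[current] + [nxt]
                queue.append(nxt)
    return paths


# Hypothetical toy pathway: {A, B} -> {C} -> {D, E} -> {F}
reactions = [
    (frozenset({"A", "B"}), frozenset({"C"})),
    (frozenset({"C"}), frozenset({"D", "E"})),
    (frozenset({"D", "E"}), frozenset({"F"})),
]
paths = reachable_sets(reactions, frozenset({"A", "B"}))
```

Running the search once per species-set would give paths from and to every set; the real system may enumerate all paths rather than one shortest path per target.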

    The Semanticscience Integrated Ontology (SIO) for biomedical research and knowledge discovery


    Scientific discovery as a combinatorial optimisation problem: How best to navigate the landscape of possible experiments?

    A considerable number of areas of bioscience, including gene and drug discovery, metabolic engineering for the biotechnological improvement of organisms, and the processes of natural and directed evolution, are best viewed in terms of a ‘landscape’ representing a large search space of possible solutions or experiments populated by a considerably smaller number of actual solutions that then emerge. This is what makes these problems ‘hard’, but as such these are to be seen as combinatorial optimisation problems that are best attacked by heuristic methods known from that field. Such landscapes, which may also represent or include multiple objectives, are effectively modelled in silico, with modern active learning algorithms such as those based on Darwinian evolution providing guidance, using existing knowledge, as to what is the ‘best’ experiment to do next. An awareness, and the application, of these methods can thereby enhance the scientific discovery process considerably. This analysis fits comfortably with an emerging epistemology that sees scientific reasoning, the search for solutions, and scientific discovery as Bayesian processes

    Quantitative evaluation of ontology design patterns for combining pathology and anatomy ontologies.

    Data are increasingly annotated with multiple ontologies to capture rich information about the features of the subject under investigation. Analysis may be performed over each ontology separately, but recently there has been a move to combine multiple ontologies to provide more powerful analytical possibilities. However, it is often not clear how to combine ontologies or how to assess or evaluate the potential design patterns available. Here we use a large and well-characterized dataset of anatomic pathology descriptions from a major study of aging mice. We show how different design patterns based on the MPATH and MA ontologies provide orthogonal axes of analysis, and perform differently in over-representation and semantic similarity applications. We discuss how such a data-driven approach might be used generally to generate and evaluate ontology design patterns. Funding: National Institutes of Health (AG038070-05, for the Shock Aging Center); King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) under Award No. URF/1/3454-01-01 and FCC/1/1976-08-01; King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) under Award No. FCS/1/3657-02-0

    Is the crowd better as an assistant or a replacement in ontology engineering? An exploration through the lens of the Gene Ontology

    Biomedical ontologies contain errors. Crowdsourcing, defined as taking a job traditionally performed by a designated agent and outsourcing it to an undefined large group of people, provides scalable access to humans. Therefore, the crowd has the potential to overcome the limited accuracy and scalability found in current ontology quality assurance approaches. Crowd-based methods have identified errors in SNOMED CT, a large, clinical ontology, with an accuracy similar to that of experts, suggesting that crowdsourcing is indeed a feasible approach for identifying ontology errors. This work uses that same crowd-based methodology, as well as a panel of experts, to verify a subset of the Gene Ontology (200 relationships). Experts identified 16 errors, generally in relationships referencing acids and metals. The crowd performed poorly in identifying those errors, with an area under the receiver operating characteristic curve ranging from 0.44 to 0.73, depending on the methods configuration. However, when the crowd verified what experts considered to be easy relationships with useful definitions, they performed reasonably well. Notably, there are significantly fewer Google search results for Gene Ontology concepts than SNOMED CT concepts. This disparity may account for the difference in performance – fewer search results indicate a more difficult task for the worker. The number of Internet search results could serve as a method to assess which tasks are appropriate for the crowd. These results suggest that the crowd fits better as an expert assistant, helping experts with their verification by completing the easy tasks and allowing experts to focus on the difficult tasks, rather than an expert replacement.
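
To put the reported range of 0.44 to 0.73 in context: the area under the ROC curve equals the probability that a randomly chosen true error receives a higher score than a randomly chosen correct relationship, so 0.5 corresponds to chance. The sketch below computes it by direct pairwise comparison. The function name `roc_auc` and the scores and labels are hypothetical illustrations, not data or code from the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    `labels` uses 1 for a true error and 0 for a correct relationship;
    `scores` are the crowd's confidence that each item is an error.
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: two true errors, two correct relationships.
toy_scores = [0.9, 0.8, 0.3, 0.2]
toy_labels = [1, 0, 1, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
```

A value below 0.5, such as the 0.44 at the low end of the reported range, means the ranking was slightly worse than random guessing for that configuration.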

    EvoluCode: Evolutionary Barcodes as a Unifying Framework for Multilevel Evolutionary Data

    Evolutionary systems biology aims to uncover the general trends and principles governing the evolution of biological networks. An essential part of this process is the reconstruction and analysis of the evolutionary histories of these complex, dynamic networks. Unfortunately, the methodologies for representing and exploiting such complex evolutionary histories in large-scale studies are currently limited. Here, we propose a new formalism, called EvoluCode (Evolutionary barCode), which allows the integration of different evolutionary parameters (e.g., sequence conservation, orthology, synteny, …) in a unifying format and facilitates the multilevel analysis and visualization of complex evolutionary histories at the genome scale. The advantages of the approach are demonstrated by constructing barcodes representing the evolution of the complete human proteome. Two large-scale studies are then described: (i) the mapping and visualization of the barcodes on the human chromosomes and (ii) automatic clustering of the barcodes to highlight protein subsets sharing similar evolutionary histories and their functional analysis. The methodologies developed here open the way to the efficient application of other data mining and knowledge extraction techniques in evolutionary systems biology studies. A database containing all EvoluCode data is available at: http://lbgi.igbmc.fr/barcodes

    Recent advances in biomedical simulations: a manifesto for model engineering [version 1; referees: 3 approved]

    Biomedical simulations are widely used to understand disease, engineer cells, and model cellular processes. In this article, we explore how to improve the quality of biomedical simulations by developing simulation models using tools and practices employed in software engineering. We refer to this direction as model engineering. Not all techniques used by software engineers are directly applicable to model engineering, and so some adaptations are required. That said, we believe that simulation models can benefit from software engineering practices for requirements, design, and construction as well as from software engineering tools for version control, error checking, and testing. Here we survey current efforts to improve simulation quality and discuss promising research directions for model engineering

    Towards Accessible, Usable Knowledge Frameworks in Engineering

    A substantial amount of research has been done in the field of engineering knowledge management, where countless ontologies have been developed for various applications within the engineering community. However, despite the success shown in these research efforts, the techniques have not been adopted by industry. This research aims to uncover the reasons for the slow adoption of engineering knowledge frameworks, namely ontologies, in industry. There are two projects covered in this thesis. The first project is the development of a cross-domain ontology for the Biomesh Project, which spans the fields of mechanical engineering, biology, and anthropology. The biology community is known for its embrace of ontologies and has made their use quite popular with the creation of the Gene Ontology. This ontology spawned the establishment of the Open Biological and Biomedical Ontologies (OBO) Foundry, a consortium which approves and curates ontologies in the biology field. No such consortium exists in the field of engineering. This project demonstrates the usefulness of curated reference ontologies. Ontological knowledge bases in four different domains were imported and integrated together to connect previously disparate information. A case study with data from the Biomesh Project demonstrates cross-domain queries and inferences that were not possible before the creation of this ontology. In the second part of this thesis we investigate the usability of current ontology tools. Protégé, the most popular ontology editing tool, is compared to OntoWiki, a semantic wiki. This comparison is done using proven techniques from the field of human-computer interaction to uncover usability problems and point out areas where each system excels. A field of 16 subjects completed a set of tasks in each system and gave feedback based on their experience. It is shown that while OntoWiki offers users a satisfying interface, it lacks in some areas that can be easily improved. Protégé provides users with adequate functionality, but it is not intended for a novice user.