JAMI: a Java library for molecular interactions and data interoperability.
BACKGROUND: A number of different molecular interactions data download formats now exist, designed to allow access to these valuable data by diverse user groups. These formats include the PSI-XML and MITAB standard interchange formats developed by the Molecular Interaction workgroup of the HUPO-PSI, in addition to other, use-specific downloads produced by other resources. The onus is currently on the user to ensure that a piece of software is capable of reading and writing all necessary versions of each format. This problem may increase, as data providers strive to meet ever more sophisticated user demands and data types. RESULTS: A collaboration between EMBL-EBI and the University of Cambridge has produced JAMI, a single library to unify standard molecular interaction data formats such as PSI-MI XML and PSI-MITAB. The free, open-source JAMI library enables the development of molecular interaction computational tools and pipelines without the need to produce different versions of software to read different versions of the data formats. CONCLUSION: Software and tools developed on top of the JAMI framework are able to integrate and support both PSI-MI XML and PSI-MITAB. The use of JAMI avoids the requirement to chain conversions between formats in order to reach a desired output format and prevents code and unit test duplication as the code becomes more modular. JAMI's model interfaces are abstracted from the underlying format, hiding the complexity and requirements of each data format from developers using JAMI as a library.
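The idea of model interfaces abstracted from the underlying format can be illustrated with a small sketch. It is written in Python rather than Java and does not reproduce JAMI's actual API: the names `Interaction`, `read_mitab` and `count_by_method` are hypothetical, the sample record is made up, and the column positions assume the PSI-MITAB 2.5 layout of 15 tab-separated columns.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Interaction:
    """Format-independent view of a binary interaction (hypothetical model)."""
    interactor_a: str
    interactor_b: str
    detection_method: str
    interaction_type: str


def read_mitab(lines: Iterable[str]) -> Iterator[Interaction]:
    """Yield Interaction records from PSI-MITAB 2.5 lines.

    Assumes 15 tab-separated columns: 1-2 are the interactor identifiers,
    7 is the detection method and 12 is the interaction type.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):  # skip blanks and header rows
            continue
        cols = line.split("\t")
        if len(cols) < 15:
            raise ValueError(f"expected at least 15 columns, got {len(cols)}")
        yield Interaction(cols[0], cols[1], cols[6], cols[11])


def count_by_method(interactions: Iterable[Interaction]) -> dict[str, int]:
    """Downstream code depends only on Interaction, not on the file format."""
    counts: dict[str, int] = {}
    for it in interactions:
        counts[it.detection_method] = counts.get(it.detection_method, 0) + 1
    return counts


# Illustrative record; identifiers and values are made up for the example.
SAMPLE = "\t".join([
    "uniprotkb:P12345", "uniprotkb:Q67890", "-", "-", "-", "-",
    'psi-mi:"MI:0018"(two hybrid)', "-", "-", "-", "-",
    'psi-mi:"MI:0915"(physical association)', "-", "-", "-",
])
```

A second reader for PSI-MI XML could yield the same `Interaction` records, so `count_by_method` would work unchanged; that is the kind of decoupling the abstract describes.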
An Ontology and Semantic Web Service for Quantum Chemistry Calculations.
The purpose of this article is to present an ontology, termed OntoCompChem, for quantum chemistry calculations as performed by the Gaussian quantum chemistry software, as well as a semantic web service named MolHub. The OntoCompChem ontology has been developed based on the semantics of concepts specified in the CompChem convention of Chemical Markup Language (CML) and by extending the Gainesville Core (GNVC) ontology. MolHub is developed in order to establish semantic interoperability between different tools used in quantum chemistry and thermochemistry calculations, and as such is integrated into the J-Park Simulator (JPS), a multidomain interactive simulation platform and expert system. It uses the OntoCompChem ontology and implements a formal language based on propositional logic as a part of its query engine, which verifies satisfiability through reasoning. This paper also presents a NASA polynomial use-case scenario to demonstrate semantic interoperability between Gaussian and a tool for thermodynamic data calculations within MolHub. This project is supported by the National Research Foundation (NRF), Prime Minister's Office, Singapore under its Campus for Research Excellence and Technological Enterprise (CREATE) programme, and by the Alexander von Humboldt foundation.
CEDAR, an online resource for the reporting and exploration of complexome profiling data
Complexome profiling is an emerging 'omics' approach that systematically interrogates the composition of protein complexes (the complexome) of a sample, by combining biochemical separation of native protein complexes with mass-spectrometry based quantitative proteomics. The resulting fractionation profiles hold comprehensive information on the abundance and composition of the complexome, and have a high potential for reuse by experimental and computational researchers. However, the lack of a central resource that provides access to these data, reported with adequate descriptions and an analysis tool, has limited their reuse. Therefore, we established the ComplexomE profiling DAta Resource (CEDAR, www3.cmbi.umcn.nl/cedar/), an openly accessible database for depositing and exploring mass spectrometry data from complexome profiling studies. Compatibility and reusability of the data is ensured by a standardized data and reporting format containing the 'minimum information required for a complexome profiling experiment' (MIACE). The data can be accessed through a user-friendly web interface, as well as programmatically using the REST API portal. Additionally, all complexome profiles available on CEDAR can be inspected directly on the website with the profile viewer tool that allows the detection of correlated profiles and inference of potential complexes. In conclusion, CEDAR is a unique, growing and invaluable resource for the study of protein complex composition and dynamics across biological systems.
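Detecting correlated profiles can be sketched as a pairwise comparison of fractionation profiles. The abstract does not say which similarity measure the profile viewer uses; Pearson correlation is assumed here purely for illustration, and the protein names, abundances and the 0.9 threshold are invented.

```python
from itertools import combinations
from math import sqrt


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation of two equal-length abundance profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    denom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        raise ValueError("constant profile has undefined correlation")
    return sum(a * b for a, b in zip(dx, dy)) / denom


def correlated_pairs(profiles: dict[str, list[float]], threshold: float = 0.9):
    """Return protein pairs whose profiles correlate at or above threshold."""
    return [
        (p, q)
        for (p, x), (q, y) in combinations(sorted(profiles.items()), 2)
        if pearson(x, y) >= threshold
    ]


# Made-up abundances across five fractions (assumption, not CEDAR data).
PROFILES = {
    "protA": [0.0, 1.0, 8.0, 2.0, 0.0],
    "protB": [0.0, 2.0, 16.0, 4.0, 0.0],  # same shape as protA, twice the signal
    "protC": [9.0, 3.0, 0.0, 0.0, 1.0],   # peaks in a different fraction
}
```

Proteins that peak in the same fractions, such as `protA` and `protB` here, are candidates for belonging to the same complex; a real analysis would also need normalisation and a justified threshold.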
Enabling semantic queries across federated bioinformatics databases
MOTIVATION: Data integration promises to be one of the main catalysts in enabling new insights to be drawn from the wealth of biological data available publicly. However, the heterogeneity of the different data sources, both at the syntactic and the semantic level, still poses significant challenges for achieving interoperability among biological databases.
RESULTS: We introduce an ontology-based federated approach for data integration. We applied this approach to three heterogeneous data stores that span different areas of biological knowledge: (i) Bgee, a gene expression relational database; (ii) Orthologous Matrix (OMA), a Hierarchical Data Format 5 orthology DS; and (iii) UniProtKB, a Resource Description Framework (RDF) store containing protein sequence and functional information. To enable federated queries across these sources, we first defined a new semantic model for gene expression called GenEx. We then show how the relational data in Bgee can be expressed as a virtual RDF graph, instantiating GenEx, through dedicated relational-to-RDF mappings. By applying these mappings, Bgee data are now accessible through a public SPARQL endpoint. Similarly, the materialized RDF data of OMA, expressed in terms of the Orthology ontology, is made available in a public SPARQL endpoint. We identified and formally described intersection points (i.e. virtual links) among the three data sources. These allow performing joint queries across the data stores. Finally, we lay the groundwork to enable nontechnical users to benefit from the integrated data, by providing a natural language template-based search interface
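The shape of a joint query across data stores can be sketched with the SPARQL 1.1 `SERVICE` keyword. Everything specific below is a placeholder: the endpoint URLs, the `ex:` vocabulary and the function name are assumptions and are not the real Bgee, OMA or UniProt endpoints, nor terms from GenEx or the Orthology ontology. The sketch only builds the query text and sends nothing over the network.

```python
# Placeholder endpoints and vocabulary (assumptions for illustration only).
ENDPOINTS = {
    "expression": "https://example.org/expression/sparql",
    "orthology": "https://example.org/orthology/sparql",
}
PREFIX = "PREFIX ex: <https://example.org/vocab#>"


def federated_query(tissue_label: str) -> str:
    """Build a SPARQL 1.1 federated query joining two endpoints on ?gene."""
    if '"' in tissue_label or "\\" in tissue_label:
        raise ValueError("label must not contain quotes or backslashes")
    return "\n".join([
        PREFIX,
        "SELECT ?gene ?ortholog WHERE {",
        # First source: genes expressed in the given tissue.
        f"  SERVICE <{ENDPOINTS['expression']}> {{",
        f'    ?gene ex:expressedIn "{tissue_label}" .',
        "  }",
        # Second source: orthologs of those same genes.
        f"  SERVICE <{ENDPOINTS['orthology']}> {{",
        "    ?gene ex:orthologOf ?ortholog .",
        "  }",
        "}",
    ])
```

The shared variable `?gene` plays the role of an intersection point between the two sources. In practice the sources may identify the same gene differently, which is why the formally described virtual links matter.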
AgroPortal: a vocabulary and ontology repository for agronomy
Many vocabularies and ontologies are produced to represent and annotate agronomic data. However, those ontologies are spread out, in different formats, of different size, with different structures and from overlapping domains. Therefore, there is a need for a common platform to receive and host them, align them, and enable their use in agro-informatics applications. By reusing the National Center for Biomedical Ontologies (NCBO) BioPortal technology, we have designed AgroPortal, an ontology repository for the agronomy domain. The AgroPortal project re-uses the biomedical domain's semantic tools and insights to serve agronomy, but also food, plant, and biodiversity sciences. We offer a portal that features ontology hosting, search, versioning, visualization, comment, and recommendation; enables semantic annotation; stores and exploits ontology alignments; and enables interoperation with the semantic web. The AgroPortal specifically satisfies requirements of the agronomy community in terms of ontology formats (e.g., SKOS vocabularies and trait dictionaries) and supported features (offering detailed metadata and advanced annotation capabilities). In this paper, we present our platform's content and features, including the additions to the original technology, as well as preliminary outputs of five driving agronomic use cases that participated in the design and orientation of the project to anchor it in the community. By building on the experience and existing technology acquired from the biomedical domain, we can present in AgroPortal a robust and feature-rich repository of great value for the agronomic domain.
Representation of behaviour change interventions and their evaluation: Development of the Upper Level of the Behaviour Change Intervention Ontology [version 2; peer review: 1 approved, 1 approved with reservations]
Background: Behaviour change interventions (BCI), their contexts and evaluation methods are heterogeneous, making it difficult to synthesise evidence and make recommendations for real-world policy and practice. Ontologies provide a means for addressing this. They represent knowledge formally as entities and relationships using a common language able to cross disciplinary boundaries and topic domains. This paper reports the development of the upper level of the Behaviour Change Intervention Ontology (BCIO), which provides a systematic way to characterise BCIs, their contexts and their evaluations.
Methods: Development took place in four steps. (1) Entities and relationships were identified by behavioural and social science experts, based on their knowledge of evidence and theory, and their practical experience of behaviour change interventions and evaluations. (2) The outputs of the first step were critically examined by a wider group of experts, including the study ontology expert and those experienced in annotating relevant literature using the initial ontology entities. The outputs of the second step were tested by (3) feedback from three external international experts in ontologies and (4) application of the prototype upper-level BCIO to annotating published reports; this informed the final development of the upper-level BCIO.
Results: The final upper-level BCIO specifies 42 entities, including the BCI scenario, elaborated across 21 entities and 7 relationship types, and the BCI evaluation study comprising 10 entities and 9 relationship types. BCI scenario entities include the behaviour change intervention (content and delivery), outcome behaviour, mechanism of action, and its context, which includes population and setting. These entities have corresponding entities relating to the planning and reporting of interventions and their evaluations.
Conclusions: The upper level of the BCIO provides a comprehensive and systematic framework for representing BCIs, their contexts and their evaluations.
Phenotyping in the era of genomics: MaTrics, a digital character matrix to document mammalian phenotypic traits
A new and uniquely structured matrix of mammalian phenotypes, MaTrics (Mammalian Traits for Comparative Genomics) in a digital form is presented. By focussing on mammalian species for which genome assemblies are available, MaTrics provides an interface between mammalogy and comparative genomics.
MaTrics was developed within a project aimed to find genetic causes of phenotypic traits of mammals using Forward Genomics. This approach requires genomes and comprehensive and recorded information on homologous phenotypes that are coded as discrete categories in a matrix. MaTrics is an evolving online resource providing information on phenotypic traits in numeric code; traits are coded either as absent/present or with several states as multistate. The state record for each species is linked to at least one reference (e.g., literature, photographs, histological sections, CT scans, or museum specimens) and so MaTrics contributes to digitalization of museum collections. Currently, MaTrics covers 147 mammalian species and includes 231 characters related to structure, morphology, physiology, ecology, and ethology, and is available in a machine actionable NEXUS format. Filling MaTrics revealed substantial knowledge gaps, highlighting the need for phenotyping efforts. Studies based on selected data from MaTrics and using Forward Genomics identified associations between genes and certain phenotypes ranging from lifestyles (e.g., aquatic) to dietary specializations (e.g., herbivory, carnivory). These findings motivate the expansion of phenotyping in MaTrics by filling research gaps and by adding taxa and traits. Only databases like MaTrics will provide machine actionable information on phenotypic traits, an important limitation to genomics. MaTrics is available within the data repository Morph·D·Base (www.morphdbase.de).
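A character matrix coded as absent/present and written out in NEXUS format might look like the following sketch. The function name `to_nexus` is hypothetical, and the species, characters and state codings are invented for the example and are not taken from MaTrics.

```python
def to_nexus(matrix: dict[str, str], symbols: str = "01") -> str:
    """Serialise a taxon -> state-string mapping as a NEXUS DATA block.

    Each string holds one state symbol per character; '?' marks missing data.
    """
    if not matrix:
        raise ValueError("matrix is empty")
    nchar = {len(states) for states in matrix.values()}
    if len(nchar) != 1:
        raise ValueError("all taxa need the same number of characters")
    allowed = set(symbols) | {"?"}
    for taxon, states in matrix.items():
        if not set(states) <= allowed:
            raise ValueError(f"unexpected state symbol for {taxon}")
    width = max(len(taxon) for taxon in matrix)
    rows = [f"    {taxon.ljust(width)}  {states}" for taxon, states in matrix.items()]
    return "\n".join([
        "#NEXUS",
        "BEGIN DATA;",
        f"  DIMENSIONS NTAX={len(matrix)} NCHAR={nchar.pop()};",
        f'  FORMAT DATATYPE=STANDARD MISSING=? SYMBOLS="{symbols}";',
        "  MATRIX",
        *rows,
        "  ;",
        "END;",
    ])


# Invented codings for three hypothetical characters (0 = absent, 1 = present).
EXAMPLE = {
    "Tursiops_truncatus": "10?",
    "Bos_taurus": "010",
}
```

Multistate characters could be handled by passing a longer `symbols` string such as `"012"`.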
Comparison, alignment, and synchronization of cell line information between CLO and EFO
Abstract
Background
The Experimental Factor Ontology (EFO) is an application ontology driven by experimental variables, including cell lines, that organizes and describes the diverse experimental variables and data residing in EMBL-EBI resources. The Cell Line Ontology (CLO) is an OBO community-based ontology that contains information on immortalized cell lines and relevant experimental components. EFO integrates and extends ontologies from the bio-ontology community to drive a number of practical applications. It is desirable that the community shares design patterns and therefore that EFO reuses the cell line representation from CLO. There are, however, challenges to be addressed when developing a common ontology design pattern for representing cell lines in both EFO and CLO.
Results
In this study, we developed a strategy to compare and map cell line terms between EFO and CLO. We examined Cellosaurus resources for EFO-CLO cross-references. Text labels of cell lines from both ontologies were verified by biological information axiomatized in each source. The study resulted in the identification of 873 EFO-CLO aligned and 344 EFO unique immortalized permanent cell lines. All of these cell lines were updated to CLO and the cell line related information was merged. A design pattern that integrates EFO and CLO was also developed.
Conclusion
Our study compared, aligned, and synchronized the cell line information between CLO and EFO. The final updated CLO will be examined as the candidate ontology to import and replace eligible EFO cell line classes, thereby supporting interoperability in the bio-ontology domain. Our mapping pipeline illustrates the use of ontology in aiding biological data standardization and integration through the biological and semantic content of cell lines. https://deepblue.lib.umich.edu/bitstream/2027.42/140391/1/12859_2017_Article_1979.pd
Ontologies relevant to behaviour change interventions: a method for their development.
Background: Behaviour and behaviour change are integral to many aspects of wellbeing and sustainability. However, reporting behaviour change interventions accurately and synthesising evidence about effective interventions is hindered by the lack of a shared, scientific terminology to describe intervention characteristics. Ontologies are standardised frameworks that provide controlled vocabularies to help unify and connect scientific fields. To date, there is no published guidance on the specific methods required to develop ontologies relevant to behaviour change. We report the creation and refinement of a method for developing ontologies that make up the Behaviour Change Intervention Ontology (BCIO). Aims: (1) To describe the development method of the BCIO and explain its rationale; (2) To provide guidance on implementing the activities within the development method. Method and results: The method for developing ontologies relevant to behaviour change interventions was constructed by considering principles of good practice in ontology development and identifying key activities required to follow those principles. The method's details were refined through application to developing two ontologies. The resulting ontology development method involved: (1) defining the ontology's scope; (2) identifying key entities; (3) refining the ontology through an iterative process of literature annotation, discussion and revision; (4) expert stakeholder review; (5) testing inter-rater reliability; (6) specifying relationships between entities; and (7) disseminating and maintaining the ontology. Guidance is provided for conducting relevant activities for each step. Conclusions: We have developed a detailed method for creating ontologies relevant to behaviour change interventions, together with practical guidance for each step, reflecting principles of good practice in ontology development.
The most novel aspects of the method are the use of formal mechanisms for literature annotation and expert stakeholder review to develop and improve the ontology content. We suggest the mnemonic SELAR3, representing the method's first six steps as Scope, Entities, Literature Annotation, Review, Reliability, Relationships