19,736 research outputs found

Ontology-based knowledge representation of experiment metadata in biological data mining

Author: Burke Squires
Carl Dahlke
Herb Hagler
Jamie Lee
Jeff Wiser
Jennifer Cai
David Karp
Megan Kong
Patrick Dunn
Richard Scheuermann
Barry Smith
Yu Qian
Publication date: 01/01/2009

According to the PubMed resource from the U.S. National Library of Medicine, over 750,000 scientific articles have been published in the ~5000 biomedical journals worldwide in the year 2007 alone. The vast majority of these publications include results from hypothesis-driven experimentation in overlapping biomedical research domains. Unfortunately, the sheer volume of information being generated by the biomedical research enterprise has made it virtually impossible for investigators to stay aware of the latest findings in their domain of interest, let alone to be able to assimilate and mine data from related investigations for purposes of meta-analysis. While computers have the potential for assisting investigators in the extraction, management and analysis of these data, information contained in the traditional journal publication is still largely unstructured, free-text descriptions of study design, experimental application and results interpretation, making it difficult for computers to gain access to the content of what is being conveyed without significant manual intervention. In order to circumvent these roadblocks and make the most of the output from the biomedical research enterprise, a variety of related standards in knowledge representation are being developed, proposed and adopted in the biomedical community. In this chapter, we will explore the current status of efforts to develop minimum information standards for the representation of a biomedical experiment, ontologies composed of shared vocabularies assembled into subsumption hierarchical structures, and extensible relational data models that link the information components together in a machine-readable and human-useable framework for data mining purposes

PhilPapers

Automated Annotation-Based Bio-Ontology Alignment with Structural Validation

Author: Amanda White
Antonio Sanfilippo
Bob Baddeley
Carol Bult
Cliff Joslyn
Judith Blake
Karin Rodland
Mary Dolan
Rick Riensche
Publication date: 01/01/2009

We outline the structure of an automated process to both align multiple bio-ontologies in terms of their genomic co-annotations, and then to measure the structural quality of that alignment. We illustrate the method with a genomic analysis of 70 genes implicated in lung disease against the Gene Ontology

Crossref

The Jackson Laboratory: The Mouseion at the JAXlibrary

Nature Precedings

Representation of probabilistic scientific knowledge

Author: De Grave K
King RD
Rzhetsky A
Soldatova LN
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

This article is available through the Brunel Open Access Publishing Fund. Copyright © 2013 Soldatova et al; licensee BioMed Central Ltd. The theory of probability is widely used in biomedical research for data analysis and modelling. In previous work the probabilities of the research hypotheses have been recorded as experimental metadata. The ontology HELO is designed to support probabilistic reasoning, and provides semantic descriptors for reporting on research that involves operations with probabilities. HELO explicitly links research statements such as hypotheses, models, laws, conclusions, etc. to the associated probabilities of these statements being true. HELO enables the explicit semantic representation and accurate recording of probabilities in hypotheses, as well as the inference methods used to generate and update those hypotheses. We demonstrate the utility of HELO on three worked examples: changes in the probability of the hypothesis that sirtuins regulate human life span; changes in the probability of hypotheses about gene functions in the S. cerevisiae aromatic amino acid pathway; and the use of active learning in drug design (quantitative structure activity relation learning), where a strategy for the selection of compounds with the highest probability of improving on the best known compound was used. HELO is open source and available at https://github.com/larisa-soldatova/HELO. This work was partially supported by grant BB/F008228/1 from the UK Biotechnology & Biological Sciences Research Council, from the European Commission under the FP7 Collaborative Programme, UNICELLSYS, KU Leuven GOA/08/008 and ERC Starting Grant 240186

Lirias

Springer - Publisher Connector

PubMed Central

Brunel University Research Archive

Behavior change interventions: the potential of ontologies for advancing science and practice

Author: Ahern David
Bartlett Ellis Rebecca J.
Cole-Lewis Heather
Gibson Bryan
Hekler Eric B.
Hesse Bradford
Larsen Kai R.
Michie Susan
Moser Richard P.
Spruijt-Metz Donna
Yi Jean
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/02/2017

A central goal of behavioral medicine is the creation of evidence-based interventions for promoting behavior change. Scientific knowledge about behavior change could be more effectively accumulated using "ontologies." In information science, an ontology is a systematic method for articulating a "controlled vocabulary" of agreed-upon terms and their inter-relationships. It involves three core elements: (1) a controlled vocabulary specifying and defining existing classes; (2) specification of the inter-relationships between classes; and (3) codification in a computer-readable format to enable knowledge generation, organization, reuse, integration, and analysis. This paper introduces ontologies, provides a review of current efforts to create ontologies related to behavior change interventions and suggests future work. This paper was written by behavioral medicine and information science experts and was developed in partnership between the Society of Behavioral Medicine's Technology Special Interest Group (SIG) and the Theories and Techniques of Behavior Change Interventions SIG. In recent years significant progress has been made in the foundational work needed to develop ontologies of behavior change. Ontologies of behavior change could facilitate a transformation of behavioral science from a field in which data from different experiments are siloed into one in which data across experiments could be compared and/or integrated. This could facilitate new approaches to hypothesis generation and knowledge discovery in behavioral science

IUPUIScholarWorks

Inferring gene ontologies from pairwise similarity data.

Author: Bafna Vineet
Dutkowski Janusz
Ideker Trey
Kramer Michael
Yu Michael
Publication venue: eScholarship, University of California
Publication date: 01/06/2014

Motivation: While the manually curated Gene Ontology (GO) is widely used, inferring a GO directly from -omics data is a compelling new problem. Recognizing that ontologies are a directed acyclic graph (DAG) of terms and hierarchical relations, algorithms are needed that: analyze a full matrix of gene-gene pairwise similarities from -omics data; infer true hierarchical structure in these data rather than enforcing hierarchy as a computational artifact; and respect biological pleiotropy, by which a term in the hierarchy can relate to multiple higher level terms. Methods addressing these requirements are just beginning to emerge; none has been evaluated for GO inference. Methods: We consider two algorithms [Clique Extracted Ontology (CliXO), LocalFitness] that uniquely satisfy these requirements, compared with methods including standard clustering. CliXO is a new approach that finds maximal cliques in a network induced by progressive thresholding of a similarity matrix. We evaluate each method's ability to reconstruct the GO biological process ontology from a similarity matrix based on (a) semantic similarities for GO itself or (b) three -omics datasets for yeast. Results: For task (a) using semantic similarity, CliXO accurately reconstructs GO (>99% precision, recall) and outperforms other approaches (<20% precision, <20% recall). For task (b) using -omics data, CliXO outperforms other methods using two -omics datasets and achieves ∼30% precision and recall using YeastNet v3, similar to an earlier approach (Network Extracted Ontology) and better than LocalFitness or standard clustering (20-25% precision, recall). Conclusion: This study provides algorithmic foundation for building gene ontologies by capturing hierarchical and pleiotropic structure embedded in biomolecular data
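
Illustrative sketch (not part of the abstract): the idea of finding maximal cliques in a network induced by progressive thresholding of a similarity matrix can be shown on a toy example. This is not the published CliXO implementation, which is more involved. It is a simplified illustration that assumes a plain Bron-Kerbosch clique enumeration and a fixed list of thresholds. The function names, gene labels, and similarity values below are hypothetical.

```python
def maximal_cliques(adj):
    """Enumerate maximal cliques with basic Bron-Kerbosch (no pivoting).

    adj maps each node to the set of its neighbours.
    """
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already-explored nodes
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def cliques_by_threshold(sim, thresholds):
    """Lower the similarity threshold step by step and record each new
    maximal clique (size >= 2) as a candidate term.

    sim maps (gene_a, gene_b) tuples to a similarity score.
    Returns a list of (threshold, frozenset_of_genes).
    """
    genes = sorted({g for pair in sim for g in pair})
    seen = set()
    terms = []
    for t in sorted(thresholds, reverse=True):
        # Network induced by keeping only pairs with similarity >= t
        adj = {g: set() for g in genes}
        for (a, b), s in sim.items():
            if s >= t:
                adj[a].add(b)
                adj[b].add(a)
        for clique in maximal_cliques(adj):
            if len(clique) >= 2 and clique not in seen:
                seen.add(clique)
                terms.append((t, clique))
    return terms


# Hypothetical toy similarity matrix over four genes
sim = {
    ("A", "B"): 0.9, ("A", "C"): 0.8, ("B", "C"): 0.8,
    ("C", "D"): 0.5, ("A", "D"): 0.1, ("B", "D"): 0.1,
}
terms = cliques_by_threshold(sim, [0.9, 0.8, 0.5])
for threshold, members in terms:
    print(threshold, sorted(members))
```

In this toy run, {A, B} appears at the strictest threshold, is absorbed into {A, B, C} at 0.8, and gene C additionally joins {C, D} at 0.5. A gene belonging to more than one group loosely mirrors the multiple-parent (pleiotropic) structure the abstract describes.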

PubMed Central

eScholarship - University of California

An ontology to standardize research output of nutritional epidemiology: from paper-based standards to linked content

Author: Ambayo Henry
Bouwman Jildau
Bronselaer Antoon
De Baets Bernard
Hawwash Dana
Kolsteren Patrick
Lachat Carl
Pattyn Filip
Thanintorn Nattapon
Yang Chen
Publication venue: 'MDPI AG'
Publication date: 01/01/2019

Background: The use of linked data in the Semantic Web is a promising approach to add value to nutrition research. An ontology, which defines the logical relationships between well-defined taxonomic terms, enables linking and harmonizing research output. To enable the description of domain-specific output in nutritional epidemiology, we propose the Ontology for Nutritional Epidemiology (ONE) according to authoritative guidance for nutritional epidemiology. Methods: Firstly, a scoping review was conducted to identify existing ontology terms for reuse in ONE. Secondly, existing data standards and reporting guidelines for nutritional epidemiology were converted into an ontology. The terms used in the standards were summarized and listed separately in a taxonomic hierarchy. Thirdly, the ontologies of the nutritional epidemiologic standards, reporting guidelines, and the core concepts were gathered in ONE. Three case studies were included to illustrate potential applications: (i) annotation of existing manuscripts and data, (ii) ontology-based inference, and (iii) estimation of reporting completeness in a sample of nine manuscripts. Results: Ontologies for food and nutrition (n = 37), disease and specific population (n = 100), data description (n = 21), research description (n = 35), and supplementary (meta) data description (n = 44) were reviewed and listed. ONE consists of 339 classes: 79 new classes to describe data and 24 new classes to describe the content of manuscripts. Conclusion: ONE is a resource to automate data integration, searching, and browsing, and can be used to assess reporting completeness in nutritional epidemiology

Multidisciplinary Digital Publishing Institute

Ghent University Academic Bibliography

Community standards for open cell migration data

Author: Ampe Christophe
Bakker Gert-Jan
Besson Sébastien
Eibl Robert H.
Friedl Peter
Gonzalez-Beltran Alejandra N.
Gunzer Matthias
Kittisopikul Mark
Le Dévédec Sylvia E.
Leo Simone
Martens Lennart
Masuzzo Paola
Moore Josh
Paran Yael
Prilusky Jaime
Rocca-Serra Philippe
Roudot Philippe
Sansone Susanna-Assunta
Schuster Marc
Sergeant Gwendolien
Strömblad Staffan
Swedlow Jason R.
van Erp Merijn
Van Troys Marleen
Zaritsky Assaf
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2020

Cell migration research has become a high-content field. However, the quantitative information encapsulated in these complex and high-dimensional datasets is not fully exploited owing to the diversity of experimental protocols and non-standardized output formats. In addition, typically the datasets are not open for reuse. Making the data open and Findable, Accessible, Interoperable, and Reusable (FAIR) will enable meta-analysis, data integration, and data mining. Standardized data formats and controlled vocabularies are essential for building a suitable infrastructure for that purpose but are not available in the cell migration domain. We here present standardization efforts by the Cell Migration Standardisation Organisation (CMSO), an open community-driven organization to facilitate the development of standards for cell migration data. This work will foster the development of improved algorithms and tools and enable secondary analysis of public datasets, ultimately unlocking new knowledge of the complex biological process of cell migration

Ghent University Academic Bibliography

Leiden University Scholary Publications

ePubs: the open archive for STFC research publications

Juelich Shared Electronic Resources

University of Dundee Online Publications

Ontology (Science)

Author: Barry Smith
Publication date: 01/01/2008

Increasingly, in data-intensive areas of the life sciences, experimental results are being described in algorithmically useful ways with the help of ontologies. Such ontologies are authored and maintained by scientists to support the retrieval, integration and analysis of their data. The proposition to be defended here is that ontologies of this type – the Gene Ontology (GO) being the most conspicuous example – are a _part of science_. Initial evidence for the truth of this proposition (which some will find self-evident) is the increasing recognition of the importance of empirically-based methods of evaluation to the ontology development work being undertaken in support of scientific research. Ontologies created by scientists must, of course, be associated with implementations satisfying the requirements of software engineering. But the ontologies are not themselves engineering artifacts, and to conceive them as such brings grievous consequences. Rather, ontologies such as the GO are in different respects comparable to scientific theories, to scientific databases, and to scientific journal publications. Such a view implies a new conception of what is involved in the authoring, maintenance and application of ontologies in scientific contexts, and therewith also a new approach to the evaluation of ontologies and to the training of ontologists

PhilPapers

Nature Precedings

COLOMBOS v2.0: an ever expanding collection of bacterial expression compendia

Author: Bianco Luca
Collado-Vides Julio
Engelen Kristof
Fu Qiang
Gama-Castro Socorro
Laukens Kris
Ledezma-Tejeida Daniela
Liebens Veerle
Marchal Kathleen
Meysman Pieter
Michiels Jan
Sonego Paolo
Publication venue: 'Oxford University Press (OUP)'
Publication date: 08/11/2013

The COLOMBOS database (http://www.colombos.net) features comprehensive organism-specific cross-platform gene expression compendia of several bacterial model organisms and is supported by a fully interactive web portal and an extensive web API. COLOMBOS was originally published in PLoS One, and COLOMBOS v2.0 includes both an update of the expression data, by expanding the previously available compendia and by adding compendia for several new species, and an update of the surrounding functionality, with improved search and visualization options and novel tools for programmatic access to the database. The scope of the database has also been extended to incorporate RNA-seq data in our compendia by a dedicated analysis pipeline. We demonstrate the validity and robustness of this approach by comparing the same RNA samples measured in parallel using both microarrays and RNA-seq. As far as we know, COLOMBOS currently hosts the largest homogenized gene expression compendia available for seven bacterial model organisms

Ghent University Academic Bibliography

PubMed Central

Establishment of a integrative multi-omics expression database CKDdb in the context of chronic kidney disease (CKD)

Author: Fernandes Marco
Husi Holger
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Complex human traits such as chronic kidney disease (CKD) are a major health and financial burden in modern societies. Currently, the description of the CKD onset and progression at the molecular level is still not fully understood. Meanwhile, the prolific use of high-throughput omic technologies in disease biomarker discovery studies yielded a vast amount of disjointed data that cannot be easily collated. Therefore, we aimed to develop a molecule-centric database featuring CKD-related experiments from available literature publications. We established the Chronic Kidney Disease database CKDdb, an integrated and clustered information resource that covers multi-omic studies (microRNAs, genomics, peptidomics, proteomics and metabolomics) of CKD and related disorders by performing literature data mining and manual curation. The CKDdb database contains differential expression data from 49395 molecule entries (redundant), of which 16885 are unique molecules (non-redundant) from 377 manually curated studies of 230 publications. This database was intentionally built to allow disease pathway analysis through a systems approach in order to yield biological meaning by integrating all existing information and therefore has the potential to unravel and gain an in-depth understanding of the key molecular events that modulate CKD pathogenesis

PubMed Central

Enlighten