9 research outputs found

    Semantics in Support of Biodiversity Knowledge Discovery: An Introduction to the Biological Collections Ontology and Related Ontologies

    The study of biodiversity spans many disciplines and includes data pertaining to species distributions and abundances, genetic sequences, trait measurements, and ecological niches, complemented by information on collection and measurement protocols. A review of the current landscape of metadata standards and ontologies in biodiversity science suggests that existing standards such as the Darwin Core terminology are inadequate for describing biodiversity data in a semantically meaningful and computationally useful way. Existing ontologies, such as the Gene Ontology and others in the Open Biological and Biomedical Ontologies (OBO) Foundry library, provide a semantic structure but lack many of the necessary terms to describe biodiversity data in all its dimensions. In this paper, we describe the motivation for and ongoing development of a new Biological Collections Ontology, the Environment Ontology, and the Population and Community Ontology. These ontologies share the aim of improving data aggregation and integration across the biodiversity domain and can be used to describe physical samples and sampling processes (for example, collection, extraction, and preservation techniques), as well as biodiversity observations that involve no physical sampling. Together they encompass studies of: 1) individual organisms, including voucher specimens from ecological studies and museum specimens, 2) bulk or environmental samples (e.g., gut contents, soil, water) that include DNA, other molecules, and potentially many organisms, especially microbes, and 3) survey-based ecological observations. We discuss how these ontologies can be applied to biodiversity use cases that span genetic, organismal, and ecosystem levels of organization. We argue that if adopted as a standard and rigorously applied and enriched by the biodiversity community, these ontologies would significantly reduce barriers to data discovery, integration, and exchange among biodiversity resources and researchers.

    Metrics on current versions of the BCO, ENVO, and PCO.

    1. For BCO and PCO, the number of relations includes only relations that point to a BCO or PCO term, to adjust for the large proportion of imported terms.
    2. 39 imported from Basic Formal Ontology, 13 imported from Information Artifact Ontology, 10 imported from Ontology for Biomedical Investigations, 1 imported from Common Anatomy Reference Ontology.
    3. 172 imported from Chemical Entities of Biological Interest, 49 from Phenotypic Quality Ontology.
    4. 39 imported from Basic Formal Ontology, 1269 imported from Gene Ontology, 11 imported from Information Artifact Ontology, 2 imported from Common Anatomy Reference Ontology.

    Linking data across sites in the Genomic Observatories network's Ocean Sampling Day.

    (A) Ocean Sampling Day involves the simultaneous sampling of the world's oceans on a single day, as represented by the red stars on the map of the earth. Multiple ocean water sampling processes take place at each location. Those water samples are filtered to produce samples of organismal communities that are submitted to the bioarchive at the Smithsonian Institution. A subsample of the filtered material is analyzed to produce a metagenomic sequence, which may be stored in the Genomes Online Database (GOLD, http://www.genomesonline.org/cgi-bin/GOLD/index.cgi). To be useful in comparative studies, data from each process at each location must be accessible and interpretable. (B) A graphical representation of how part of the workflow shown in A (from ocean water sampling to filtering to metagenomic sequencing) can be annotated with terms from multiple, coordinated ontologies and queried via an ontology-based data store. Ontology classes are shown as ovals and instances are shown as rectangles, with instances color-coded to match their parent classes. This figure shows how a metagenomic sequence and the taxa associated with it can be linked back to the original Ocean Sampling Day collecting event through a chain of inputs and outputs.

    Linking samples and derivatives from the Moorea Biocode project.

    (A) Biodiversity data from the Moorea Biocode project were collected at many different levels that are connected to one another in biologically meaningful ways, such as an Essig Museum specimen collected as part of a Biocode bioinventory event, a tissue sample submitted to the Smithsonian Institution, a metagenomic gut sample collected from the specimen and registered with the CAMERA portal (http://camera.calit2.net/), or DNA extracted from either the tissue or metagenomic sample. (B) A graphical representation of how part of the workflow shown in A (from field collection to tissue sampling to DNA extraction) can be annotated with terms from multiple, coordinated ontologies and queried via an ontology-based data store. Ontology classes are shown as ovals and instances are shown as rectangles, with instances color-coded to match their parent classes. This figure shows how, for example, TaxonID B resulting from the BLAST identification process on Genbank sequence B can be linked back to the original Moorea Biocode sampling process, or how a chain of inputs and outputs can be used to infer that an instance of DNA molecules is derived from an instance of an insect specimen.

    Structured sampling schemes.

    (A) Biological sampling can be structured in both space and time. Environmental sampling of ocean water often includes sampling along a transect, with samples collected at multiple depths at each location. Additionally, each sample of water collected may be subsampled for metagenomic analysis or for measuring chemical content. (B) Sampling schemes in ecological studies are often nested and may include plot; subplot or transect within plot; individual within plot, subplot, or transect; organ (e.g., leaf) within individual; tissue within organ; and DNA or mineral (e.g., C or N) within tissue. DNA extracted from a leaf of a tree that is present in a subplot may therefore be characterized by environmental features of the plot.

    Core terms of the Biological Collections Ontology (BCO) and their relations to upper ontologies.

    Core BCO terms (in orange) are subclasses of terms from the Basic Formal Ontology (BFO, in yellow) or the Ontology for Biomedical Investigations (OBI, in blue). For example, BCO:material sample is a subclass of BFO:material entity and has role BCO:material sample role (which is a BFO:role), while BCO:material sampling process is a subclass of OBI:planned process, and has as specified output BCO:material sample.

    Genomic Classification of Cutaneous Melanoma

    We describe the landscape of genomic alterations in cutaneous melanomas through DNA, RNA, and protein-based analysis of 333 primary and/or metastatic melanomas from 331 patients. We establish a framework for genomic classification into one of four subtypes based on the pattern of the most prevalent significantly mutated genes: mutant BRAF, mutant RAS, mutant NF1, and Triple-WT (wild-type). Integrative analysis reveals enrichment of KIT mutations and focal amplifications and complex structural rearrangements as a feature of the Triple-WT subtype. We found no significant outcome correlation with genomic classification, but samples assigned a transcriptomic subclass enriched for immune gene expression associated with lymphocyte infiltrate on pathology review and high LCK protein expression, a T cell marker, were associated with improved patient survival. This clinicopathological and multidimensional analysis suggests that the prognosis of melanoma patients with regional metastases is influenced by tumor stroma immunobiology, offering insights to further personalize therapeutic decision-making.