23 research outputs found

    Nose to tail, roots to shoots: spatial descriptors for phenotypic diversity in the Biological Spatial Ontology


    Resolving Taxonomic Names using Evidence Extracted from Text

    Biological taxonomy is established on organism relationships with scientific names as the primary identifiers; however, resolving various taxonomic names remains one of the greatest challenges in taxonomy and systematic biology overall. We proposed an evidence-based approach that extracts trait (character) evidence from published literature to facilitate the comparison of taxonomic concepts. In this poster, we report an initial set of results from our first case study using the plant genus Rubus. The case study tested the entire pipeline of the Explorer of Taxon Concepts toolkit we have developed and revealed challenging phenomena to be solved in the near future.

    Developing New Tools for the Old Tree of Life

    Millions of species reside in the Tree of Life, making the task of resolving the evolutionary origin of many organisms difficult. Biologists draw on genetic and phenotypic information to sort the Tree of Life, but the study can be slow and complex. Phenomic data (such as cell shape, metabolism and ecology), particularly for microorganisms, are often found in scientific publications and have little digital presence outside of being scanned into an online database. This has been aided by a new text mining computer program, MicroPIE (Microbial Phenomics Information Extractor), that sifts through relevant phenomic data and creates a matrix of key phenomic characters taken from the published descriptions. MicroPIE utilizes multiple natural language processing tools to extract data, along with the knowledge of microbiologists to help with developing and verifying the tools. One major challenge to building such a tool is the time it takes to collect and edit phenomic data for the tens of thousands of sentences needed to develop a functioning program. We have helped to further the development of MicroPIE to identify new characteristics by providing sentences from published microbial descriptions. We are also creating a “Gold Standard” matrix (GSM) of phenomic information for 100 different bacteria that can then be compared to the MicroPIE output in order to test that MicroPIE has correctly identified and extracted phenomic information. So far MicroPIE has shown potential to aid in resolution of the microbial Tree of Life.
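
    The comparison of an extracted character matrix with a gold-standard matrix, as described in the abstract above, might be scored along the following lines. This is only a minimal sketch under stated assumptions: the abstract does not say how the comparison is performed, so the cell-by-cell precision/recall scoring, the `Matrix` type, the function name `score_against_gold`, and all taxon names and values are hypothetical and invented for illustration. None of this is MicroPIE's actual code or output format.

```python
from typing import Dict, Set, Tuple

# Assumed representation: a character matrix maps (taxon, character)
# to the set of values recorded for that cell.
Matrix = Dict[Tuple[str, str], Set[str]]


def score_against_gold(extracted: Matrix, gold: Matrix) -> Dict[str, float]:
    """Compare extracted values with gold-standard values, cell by cell."""
    true_pos = false_pos = false_neg = 0
    for cell in set(extracted) | set(gold):
        found = extracted.get(cell, set())
        expected = gold.get(cell, set())
        true_pos += len(found & expected)    # values present in both matrices
        false_pos += len(found - expected)   # extracted but not in the gold standard
        false_neg += len(expected - found)   # in the gold standard but missed
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return {"precision": precision, "recall": recall}


# Toy data, invented for illustration only.
gold: Matrix = {
    ("Taxon A", "cell shape"): {"rod"},
    ("Taxon A", "motility"): {"motile"},
    ("Taxon B", "cell shape"): {"coccus"},
}
extracted: Matrix = {
    ("Taxon A", "cell shape"): {"rod"},
    ("Taxon A", "motility"): {"non-motile"},
    ("Taxon B", "cell shape"): {"coccus"},
}
scores = score_against_gold(extracted, gold)
```

    Scoring per value rather than per cell is one possible design choice; it gives partial credit when a cell holds several values and only some of them were extracted.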

    Unification of multi-species vertebrate anatomy ontologies for comparative biology in Uberon.

    BACKGROUND: Elucidating disease and developmental dysfunction requires understanding variation in phenotype. Single-species model organism anatomy ontologies (ssAOs) have been established to represent this variation. Multi-species anatomy ontologies (msAOs; vertebrate skeletal, vertebrate homologous, teleost, amphibian AOs) have been developed to represent 'natural' phenotypic variation across species. Our aim has been to integrate ssAOs and msAOs for various purposes, including establishing links between phenotypic variation and candidate genes. RESULTS: Previously, msAOs contained a mixture of unique and overlapping content. This hampered integration and coordination due to the need to maintain cross-references or inter-ontology equivalence axioms to the ssAOs, or to perform large-scale obsolescence and modular import. Here we present the unification of anatomy ontologies into Uberon, a single ontology resource that enables interoperability among disparate data and research groups. As a consequence, independent development of TAO, VSAO, AAO, and vHOG has been discontinued. CONCLUSIONS: The newly broadened Uberon ontology is a unified cross-taxon resource for metazoans (animals) that has been substantially expanded to include a broad diversity of vertebrate anatomical structures, permitting reasoning across anatomical variation in extinct and extant taxa. Uberon is a core resource that supports single- and cross-species queries for candidate genes using annotations for phenotypes from the systematics, biodiversity, medical, and model organism communities, while also providing entities for logical definitions in the Cell and Gene Ontologies. THE ONTOLOGY RELEASE FILES ASSOCIATED WITH THE ONTOLOGY MERGE DESCRIBED IN THIS MANUSCRIPT ARE AVAILABLE AT: http://purl.obolibrary.org/obo/uberon/releases/2013-02-21/ CURRENT ONTOLOGY RELEASE FILES ARE ALWAYS AVAILABLE AT: http://purl.obolibrary.org/obo/uberon/releases

    Taxonomy and the Production of Semantic Phenotypes

    Preprint of chapter appearing in "Studies on the Semantic Web: Volume 33: Application of Semantic Technology in Biodiversity Science". Taxonomists produce a myriad of phenotypic descriptions. Traditionally these are provided in terse (telegraphic) natural language. As seen in parallel within other fields of biology, researchers are exploring ways to formalize parts of the taxonomic process so that aspects of it are more computational in nature. The currently used data formalizations, mechanisms for persisting data, applications, and computing approaches related to the production of semantic descriptions (phenotypes) are reviewed; they, and their adopters, are limited in number. In order to move forward we step back and characterize taxonomists with respect to their typical workflow and tendencies. We then use these characteristics as a basis for exploring how we might create software that taxonomists will find intuitive within their current workflows, providing interface examples as thought experiments. Funding: NSF DBI-1356381; NSF 0956049. https://deepblue.lib.umich.edu/bitstream/2027.42/148811/1/yoder_proof.pdf Description of yoder_proof.pdf: Proof of book chapte

    A Semantic Model for Species Description Applied to the Ensign Wasps (Hymenoptera: Evaniidae) of New Caledonia

    Taxonomic descriptions are unparalleled sources of knowledge of life's phenotypic diversity. As natural language prose, these data sets are largely refractory to computation and integration with other sources of phenotypic data. By formalizing taxonomic descriptions using ontology-based semantic representation, we aim to increase the reusability and computability of taxonomists' primary data. Here, we present a revision of the ensign wasp (Hymenoptera: Evaniidae) fauna of New Caledonia using this new model for species description. Descriptive matrices, specimen data, and taxonomic nomenclature are gathered in a unified Web-based application, mx, then exported as both traditional taxonomic treatments and semantic statements using the OWL Web Ontology Language. Character:character-state combinations are then annotated following the entity–quality phenotype model, originally developed to represent mutant model organism phenotype data; concepts of anatomy are drawn from the Hymenoptera Anatomy Ontology and linked to phenotype descriptors from the Phenotypic Quality Ontology. The resulting set of semantic statements is provided in Resource Description Framework format. Applying the model to real data, that is, specimens, taxonomic names, diagnoses, descriptions, and redescriptions, provides us with a foundation to discuss limitations and potential benefits such as automated data integration and reasoner-driven queries. Four species of ensign wasp are now known to occur in New Caledonia: Szepligetella levipetiolata, Szepligetella deercreeki Deans and Mikó sp. nov., Szepligetella irwini Deans and Mikó sp. nov., and the nearly cosmopolitan Evania appendigaster. A fifth species, Szepligetella sericea, including Szepligetella impressa, syn. nov., has not yet been collected in New Caledonia but can be found on islands throughout the Pacific and so is included in the diagnostic key. 
    [Biodiversity informatics; Evaniidae; New Caledonia; new species; ontology; semantic phenotypes; semantic species description; taxonomy.]
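
    To illustrate the entity-quality annotation mentioned in the abstract above, here is a minimal sketch of how one character state might be held as a taxon, entity, and quality record and written out as RDF-style triples. It is a simplification for illustration and not the paper's actual OWL modelling: the `EQStatement` class, the `http://example.org/` namespace, the predicate names, and the term identifiers `ANATOMY_0000001` and `QUALITY_0000001` are all hypothetical placeholders, not real Hymenoptera Anatomy Ontology or Phenotypic Quality Ontology terms.

```python
from dataclasses import dataclass
from typing import List

BASE = "http://example.org/"  # placeholder namespace, not a real ontology IRI


@dataclass(frozen=True)
class EQStatement:
    """One entity-quality annotation attached to a taxon."""
    taxon: str    # taxon name
    entity: str   # anatomy term identifier (hypothetical placeholder)
    quality: str  # quality term identifier (hypothetical placeholder)

    def to_ntriples(self, phenotype_id: str) -> List[str]:
        """Serialize the annotation as three N-Triples-style lines."""
        subject = f"<{BASE}phenotype/{phenotype_id}>"
        taxon_iri = f"<{BASE}taxon/{self.taxon.replace(' ', '_')}>"
        return [
            f"{taxon_iri} <{BASE}has_phenotype> {subject} .",
            f"{subject} <{BASE}has_entity> <{BASE}{self.entity}> .",
            f"{subject} <{BASE}has_quality> <{BASE}{self.quality}> .",
        ]


statement = EQStatement("Evania appendigaster", "ANATOMY_0000001", "QUALITY_0000001")
triples = statement.to_ntriples("p1")
for line in triples:
    print(line)
```

    Keeping the entity and the quality as separate identifiers is what lets a reasoner group statements by anatomical structure or by kind of quality; a real export would use the ontologies' own IRIs and relations instead of these placeholders.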

    The flora phenotype ontology (FLOPO): tool for integrating morphological traits and phenotypes of vascular plants

    Background: The systematic analysis of a large number of comparable plant trait data can support investigations into phylogenetics and ecological adaptation, with broad applications in evolutionary biology, agriculture, conservation, and the functioning of ecosystems. Floras, i.e., books collecting the information on all known plant species found within a region, are a potentially rich source of such plant trait data. Floras describe plant traits with a focus on morphology and other traits relevant for species identification in addition to other characteristics of plant species, such as ecological affinities, distribution, economic value, health applications, traditional uses, and so on. However, a key limitation in systematically analyzing information in Floras is the lack of a standardized vocabulary for the described traits as well as the difficulties in extracting structured information from free text. Results: We have developed the Flora Phenotype Ontology (FLOPO), an ontology for describing traits of plant species found in Floras. We used the Plant Ontology (PO) and the Phenotype And Trait Ontology (PATO) to extract entity-quality relationships from digitized taxon descriptions in Floras, and used a formal ontological approach based on phenotype description patterns and automated reasoning to generate the FLOPO. The resulting ontology consists of 25,407 classes and is based on the PO and PATO. The classified ontology closely follows the structure of Plant Ontology in that the primary axis of classification is the observed plant anatomical structure, and more specific traits are then classified based on parthood and subclass relations between anatomical structures as well as subclass relations between phenotypic qualities. Conclusions: The FLOPO is primarily intended as a framework based on which plant traits can be integrated computationally across all species and higher taxa of flowering plants. Importantly, it is not intended to replace established vocabularies or ontologies, but rather serve as an overarching framework based on which different application- and domain-specific ontologies, thesauri and vocabularies of phenotypes observed in flowering plants can be integrated.