13 research outputs found

    Developing New Tools for the Old Tree of Life

    Millions of species reside in the Tree of Life, which makes resolving the evolutionary origin of many organisms difficult. Biologists draw on genetic and phenotypic information to sort the Tree of Life, but the work can be slow and complex. Phenomic data (such as cell shape, metabolism and ecology), particularly for microorganisms, are often found only in scientific publications and have little digital presence beyond being scanned into an online database. Access to these data has been aided by a new text-mining program, MicroPIE (Microbial Phenomics Information Extractor), which sifts through published descriptions and builds a matrix of the key phenomic characters they contain. MicroPIE uses multiple natural language processing tools to extract data, and draws on the knowledge of microbiologists to develop and verify those tools. One major challenge in building such a tool is the time it takes to collect and edit phenomic data for the tens of thousands of sentences needed to develop a functioning program. We have helped further the development of MicroPIE to identify new characteristics by providing sentences from published microbial descriptions. We are also creating a “Gold Standard” matrix (GSM) of phenomic information for 100 different bacteria, which can be compared with the MicroPIE output to test whether MicroPIE has correctly identified and extracted phenomic information. So far, MicroPIE has shown potential to aid in resolving the microbial Tree of Life.
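The comparison between an extracted character matrix and a gold-standard matrix can be illustrated with a minimal sketch. The abstract does not say how the comparison is scored, so cell-level precision and recall are an assumption here, and the function name `score_against_gold`, the `Matrix` layout, and the sample taxon and characters are all hypothetical.

```python
from typing import Dict, Set, Tuple

# Hypothetical layout: taxon -> character -> set of character values.
Matrix = Dict[str, Dict[str, Set[str]]]


def score_against_gold(extracted: Matrix, gold: Matrix) -> Tuple[float, float]:
    """Return (precision, recall) over (taxon, character, value) cells.

    Assumption: a cell counts as correct only on an exact string match.
    """
    def cells(matrix: Matrix) -> Set[Tuple[str, str, str]]:
        return {
            (taxon, character, value)
            for taxon, characters in matrix.items()
            for character, values in characters.items()
            for value in values
        }

    extracted_cells = cells(extracted)
    gold_cells = cells(gold)
    true_positives = len(extracted_cells & gold_cells)
    precision = true_positives / len(extracted_cells) if extracted_cells else 0.0
    recall = true_positives / len(gold_cells) if gold_cells else 0.0
    return precision, recall


# Made-up illustration data, not taken from the study.
gold = {"Taxon A": {"cell shape": {"rod"}, "gram stain": {"negative"}}}
extracted = {"Taxon A": {"cell shape": {"rod"}, "gram stain": {"positive"}}}
precision, recall = score_against_gold(extracted, gold)
```

In this made-up case one of the two extracted cells matches the gold standard, so both scores come out at one half. A real evaluation would probably need to normalize synonyms and units before matching.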

    Resolving Taxonomic Names using Evidence Extracted from Text

    Biological taxonomy is built on relationships among organisms, with scientific names as the primary identifiers; however, resolving the various taxonomic names remains one of the greatest challenges in taxonomy and in systematic biology overall. We propose an evidence-based approach that extracts trait (character) evidence from published literature to facilitate the comparison of taxonomic concepts. In this poster, we report an initial set of results from our first case study, which uses the plant genus Rubus. The case study tested the entire pipeline of the Explorer of Taxon Concepts toolkit we have developed and revealed challenging phenomena to be addressed in the near future.

    Nose to tail, roots to shoots: spatial descriptors for phenotypic diversity in the Biological Spatial Ontology

    Unification of multi-species vertebrate anatomy ontologies for comparative biology in Uberon

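    A semi-automatic approach to extracting biodiversity knowledge from textual descriptions of botanical species (original title: Un enfoque semiautomático de extracción de conocimiento sobre biodiversidad a partir de descripciones textuales de especies botánicas)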

    Final project report. Project code: 5402-1375-4301. This document describes the final state of the project. It first introduces the pressing need to access textual biodiversity information in a more structured and semantically meaningful way. It then reviews the main approaches that have been used to address this problem, with emphasis on those that structure morphological descriptions and geographic distributions, since these are the project's main areas of interest. Next, it presents in detail the organization of the project and its three main stages: collection and transformation of source documents, semantic structuring of text fragments of interest, and, finally, development of tools to exploit the structured information. It then presents the project's results: the results and evaluations obtained in the semantic structuring of morphological descriptions and geographic distributions, as well as the final state of the tools developed for preprocessing the original documents and for querying semantically structured text fragments. After presenting the results, it compares the objectives set by the project with the results obtained. Finally, it offers a series of recommendations so that future projects can build on the studies and tools produced by this project.

    Reasoning over Taxonomic Change: Exploring Alignments for the Perelleschus Use Case

    Classifications and phylogenetic inferences of organismal groups change in light of new insights. Over time these changes can result in an imperfect tracking of taxonomic perspectives through the re-/use of Code-compliant or informal names. To mitigate these limitations, we introduce a novel approach for aligning taxonomies through the interaction of human experts and logic reasoners. We explore the performance of this approach with the Perelleschus use case of Franz & Cardona-Duque (2013). The use case includes six taxonomies published from 1936 to 2013, 54 taxonomic concepts (i.e., circumscriptions of names individuated according to their respective source publications), and 75 expert-asserted Region Connection Calculus articulations (e.g., congruence, proper inclusion, overlap, or exclusion). An Open Source reasoning toolkit is used to analyze 13 paired Perelleschus taxonomy alignments under heterogeneous constraints and interpretations. The reasoning workflow optimizes the logical consistency and expressiveness of the input and infers the set of maximally informative relations among the entailed taxonomic concepts. The latter are then used to produce merge visualizations that represent all congruent and non-congruent taxonomic elements among the aligned input trees. In this small use case with 6-53 input concepts per alignment, the information gained through the reasoning process is on average one order of magnitude greater than in the input. The approach offers scalable solutions for tracking provenance among succeeding taxonomic perspectives that may have differential biases in naming conventions, phylogenetic resolution, ingroup and outgroup sampling, or ostensive (member-referencing) versus intensional (property-referencing) concepts and articulations.
    Comment: 30 pages, 16 figures.
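The Region Connection Calculus articulations named in the abstract above (congruence, proper inclusion, overlap, exclusion) can be illustrated with a small sketch. It assumes a purely ostensive (member-referencing) reading in which each taxonomic concept is a non-empty set of member names. It is not the reasoning toolkit described in the abstract, which infers relations by logic reasoning over expert-asserted articulations. The function name `rcc5` and the member labels are hypothetical.

```python
from typing import Set


def rcc5(a: Set[str], b: Set[str]) -> str:
    """Classify the RCC-5 relation between two non-empty member sets.

    Assumption: a concept is fully described by its listed members.
    """
    if not a or not b:
        raise ValueError("RCC-5 relations are defined for non-empty regions")
    if a == b:
        return "congruence"
    if a < b:  # a is a proper subset of b
        return "proper inclusion"
    if a > b:  # a is a proper superset of b
        return "inverse proper inclusion"
    if a & b:  # some shared members, neither contains the other
        return "overlap"
    return "exclusion"


# Made-up member labels standing in for species assigned to a concept.
concept_earlier = {"sp1", "sp2"}
concept_later = {"sp1", "sp2", "sp3"}
relation = rcc5(concept_earlier, concept_later)
```

Here `relation` is "proper inclusion", because every member of the earlier concept also belongs to the later one, which adds a further member. Exactly one of the five relations holds for any pair of non-empty sets, which is what makes them usable as articulations between concepts from different source publications.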