76,622 research outputs found

Making species checklists understandable to machines : a shift from relational databases to ontologies

Author: Hyvönen Eero
Laurenne Nina
Saarenmaa Hannu
Tuominen Jouni
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

Background: The scientific names of plants and animals play a major role in Life Sciences as information is indexed, integrated, and searched using scientific names. The main problem with names is their ambiguous nature, because more than one name may point to the same taxon and multiple taxa may share the same name. In addition, scientific names change over time, which makes them open to various interpretations. Applying machine-understandable semantics to these names enables efficient processing of biological content in information systems. The first step is to use unique persistent identifiers instead of name strings when referring to taxa. The most commonly used identifiers are Life Science Identifiers (LSID), which are traditionally used in relational databases, and more recently HTTP URIs, which are applied on the Semantic Web by Linked Data applications. Results: We introduce two models for expressing taxonomic information in the form of species checklists. First, we show how species checklists are presented in a relational database system using LSIDs. Then, in order to gain a more detailed representation of taxonomic information, we introduce the meta-ontology TaxMeOn to model the same content as Semantic Web ontologies where taxa are identified using HTTP URIs. We also explore how changes in scientific names can be managed over time. Conclusions: The use of HTTP URIs is preferable for presenting the taxonomic information of species checklists. An HTTP URI identifies a taxon and operates as a web address from which additional information about the taxon can be located, unlike an LSID. This enables the integration of biological data from different sources on the web using Linked Data principles and prevents the formation of information silos. The Linked Data approach allows a user to assemble information and evaluate the complexity of taxonomical data based on conflicting views of taxonomic classifications. Using HTTP URIs and Semantic Web technologies also facilitates the representation of the semantics of biological data and, in this way, the creation of more “intelligent” biological applications and services.

Springer - Publisher Connector

PubMed Central

Helsingin yliopiston digitaalinen arkisto

Fast, linked, and open – the future of taxonomic publishing for plants: launching the journal PhytoKeys

Author: Knapp Sandra
Kress W. John
Li De-Zhu
Penev Lyubomir
Renner Susanne S.
Publication date: 01/01/2010

The paper describes the focus, scope and the rationale of PhytoKeys, a newly established, peer-reviewed, open-access journal in plant systematics. PhytoKeys is launched to respond to four main challenges of our time: (1) Appearance of electronic publications as amendments or even alternatives to paper publications; (2) Open Access (OA) as a new publishing model; (3) Linkage of electronic registers, indices and aggregators that summarize information on biological species through taxonomic names or their persistent identifiers (Globally Unique Identifiers or GUIDs; currently Life Science Identifiers or LSIDs); (4) Web 2.0 technologies that permit the semantic markup of, and semantic enhancements to, published biological texts. The journal will pursue cutting-edge technologies in publication and dissemination of biodiversity information while strictly following the requirements of the current International Code of Botanical Nomenclature (ICBN)

Directory of Open Access Journals

Open Access LMU

PubMed Central

A Model to Represent Nomenclatural and Taxonomic Information as Linked Data. Application to the French Taxonomic Register, TAXREF

Author: Faron Zucker Catherine
Gargominy Olivier
Michel Franck
Tercerie Sandrine
Publication venue: HAL CCSD
Publication date: 22/10/2017

Taxonomic registers are key tools to help us comprehend the diversity of nature. Publishing such registers in the Web of Data, following the standards and best practices of Linked Open Data (LOD), is a way of integrating multiple data sources into a world-scale, biological knowledge base. In this paper, we present ongoing work aimed at the publication of TAXREF, the French national taxonomic register, on the Web of Data. Far beyond the mere translation of the TAXREF database into LOD standards, we show that the key point of this endeavor is the design of a model capable of capturing the two coexisting yet distinct realities underlying taxonomic registers, namely the nomenclature (the rules for naming biological entities) and the taxonomy (the description and characterization of these biological entities). We first analyze different modelling choices made to represent some international taxonomic registers as LOD, and we underline the issues that arise from these differences. Then, we propose a model aimed to tackle these issues. This model separates nomenclature from taxonomy, is flexible enough to accommodate the ever-changing scientific consensus on taxonomy, and adheres to the philosophy underpinning the Semantic Web standards. Finally, using the example of TAXREF, we show that the model enables interlinking with third-party LOD data sets, whether they represent nomenclatural or taxonomic information.

HAL-UNICE

INRIA a CCSD electronic archive server

Open Science principles for accelerating trait-based science across the Tree of Life.

Author: Adams Vanessa M
Alroy John
Andrew Samuel C
Ankenbrand Markus J
Balk Meghan A
Bland Lucie M
Boyle Brad L
Bravo-Avila Catherine H
Brennan Ian
Carthey Alexandra JR
Catullo Renee
Cavazos Brittany R
Chown Steven L
Conde Dalia A
Enquist Brian J
Fadrique Belen
Falster Daniel S
Feng Xiao
Gallagher Rachael V
Gibb Heloise
Halbritter Aud H
Hammock Jennifer
Hogan J Aaron
Holewa Hamish
Hope Michael
Iversen Colleen M
Jochum Malte
Kattge Jens
Kearney Michael
Keller Alexander
Mabee Paula
Madin Joshua S
Maitner Brian S
Manning Peter
McCormack Luke
Michaletz Sean T
Park Daniel S
Pearse William D
Penone Caterina
Perez Timothy M
Pineda-Munoz Silvia
Poelen Jorrit H
Ray Courtenay A
Rossetto Maurizio
Salguero-Gómez Roberto
Sauquet Hervé
Schneider Florian D
Sparrow Benjamin
Spasojevic Marko J
Telford Richard J
Tobias Joseph A
Vandvik Vigdis
Violle Cyrille
Walls Ramona
Weiss Katherine CB
Westoby Mark
Wright Ian J
Publication venue: eScholarship, University of California
Publication date: 01/01/2020

Synthesizing trait observations and knowledge across the Tree of Life remains a grand challenge for biodiversity science. Species traits are widely used in ecological and evolutionary science, and new data and methods have proliferated rapidly. Yet accessing and integrating disparate data sources remains a considerable challenge, slowing progress toward a global synthesis to integrate trait data across organisms. Trait science needs a vision for achieving global integration across all organisms. Here, we outline how the adoption of key Open Science principles (open data, open source and open methods) is transforming trait science, increasing transparency, democratizing access and accelerating global synthesis. To enhance widespread adoption of these principles, we introduce the Open Traits Network (OTN), a global, decentralized community welcoming all researchers and institutions pursuing the collaborative goal of standardizing and integrating trait data across organisms. We demonstrate how adherence to Open Science principles is key to the OTN community and outline five activities that can accelerate the synthesis of trait data across the Tree of Life, thereby facilitating rapid advances to address scientific inquiries and environmental issues. Lessons learned along the path to a global synthesis of trait data will provide a framework for addressing similarly complex data science and informatics challenges.

eScholarship - University of California

Oxford University Research Archive

Western Sydney ResearchDirect

Bern Open Repository and Information System (BORIS)

Monash University Research Portal

The Human Oral Microbiome Database: a web accessible resource for investigating oral microbe taxonomic and genomic information

Author: Baranova Oxana V.
Chen Tsute
Dewhirst Floyd E.
Izard Jacques
Lakshmanan Abirami
Yu Wen-Han
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 20/06/2010

The human oral microbiome is the most studied human microflora, but 53% of the species have not yet been validly named and 35% remain uncultivated. The uncultivated taxa are known primarily from 16S rRNA sequence information. Sequence information tied solely to obscure isolate or clone numbers, and usually lacking accurate phylogenetic placement, is a major impediment to working with human oral microbiome data. The goal of creating the Human Oral Microbiome Database (HOMD) is to provide the scientific community with a body site-specific comprehensive database for the more than 600 prokaryote species that are present in the human oral cavity based on a curated 16S rRNA gene-based provisional naming scheme. Currently, two primary types of information are provided in HOMD: taxonomic and genomic. Named oral species and taxa identified from 16S rRNA gene sequence analysis of oral isolates and cloning studies were placed into defined 16S rRNA phylotypes and each given a unique Human Oral Taxon (HOT) number. The HOT interlinks phenotypic, phylogenetic, genomic, clinical and bibliographic information for each taxon. A BLAST search tool is provided to match user 16S rRNA gene sequences to a curated, full-length, 16S rRNA gene reference data set. For genomic analysis, HOMD provides a comprehensive set of analysis tools and maintains frequently updated annotations for all the human oral microbial genomes that have been sequenced and publicly released. Oral bacterial genome sequences, determined as part of the Human Microbiome Project, are being added to the HOMD as they become available. We provide HOMD as a conceptual model for the presentation of microbiome data for other human body sites.

DigitalCommons@University of Nebraska

Liberating links between datasets using lightweight data publishing: an example using plant names and the taxonomic literature

Author: Page Roderic
Publication venue: Pensoft
Publication date: 01/01/2018

Constructing a biodiversity knowledge graph will require making millions of cross links between diversity entities in different datasets. Researchers trying to bootstrap the growth of the biodiversity knowledge graph by constructing databases of links between these entities lack obvious ways to publish these sets of links. One appealing and lightweight approach is to create a "datasette", a database that is wrapped together with a simple web server that enables users to query the data. Datasettes can be packaged into Docker containers and hosted online with minimal effort. This approach is illustrated using a dataset of links between globally unique identifiers for plant taxonomic names and identifiers for the taxonomic articles that published those names.

ZENODO

Directory of Open Access Journals

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Enlighten

ARPHA OAI-PMH Endpoint

ARPHA Preprints

mockrobiota: a Public Resource for Microbiome Bioinformatics Benchmarking.

Author: Arron Shiffer
Benjamin Wolfe
Corinne F. Maurice
J. Gregory Caporaso
Jai Ram Rideout
Josh D. Neufeld
Nicholas A. Bokulich
Peter J. Turnbaugh
Rachel J. Dutton
Rob Knight
William G. Mercurio
Publication venue: eScholarship, University of California
Publication date: 01/01/2016

Mock communities are an important tool for validating, optimizing, and comparing bioinformatics methods for microbial community analysis. We present mockrobiota, a public resource for sharing, validating, and documenting mock community data resources, available at http://caporaso-lab.github.io/mockrobiota/. The materials contained in mockrobiota include data set and sample metadata, expected composition data (taxonomy or gene annotations or reference sequences for mock community members), and links to raw data (e.g., raw sequence data) for each mock community data set. mockrobiota does not supply physical sample materials directly, but the data set metadata included for each mock community indicate whether physical sample materials are available. At the time of this writing, mockrobiota contains 11 mock community data sets with known species compositions, including bacterial, archaeal, and eukaryotic mock communities, analyzed by high-throughput marker gene sequencing. Importance: The availability of standard and public mock community data will facilitate ongoing method optimizations, comparisons across studies that share source data, and greater transparency and access and eliminate redundancy. These are also valuable resources for bioinformatics teaching and training. This dynamic resource is intended to expand and evolve to meet the changing needs of the omics community.

Repository for Publications and Research Data

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Patterns of contribution to citizen science biodiversity projects increase understanding of volunteers’ recording behaviour

Author: Boakes E. H.
Gliozzo G.
Haklay M.
Harvey M.
Roy D. B.
Seymour V.
Smith C.
Publication venue: Springer Science and Business Media LLC
Publication date: 13/09/2016

The often opportunistic nature of biological recording via citizen science leads to taxonomic, spatial and temporal biases which add uncertainty to biodiversity estimates. However, such biases may also give valuable insight into volunteers’ recording behaviour. Using Greater London as a case-study we examined the composition of three citizen science datasets (from Greenspace Information for Greater London CIC, iSpot and iRecord) with respect to recorder contribution and spatial and taxonomic biases, i.e. when, where and what volunteers record. We found most volunteers contributed few records and were active for just one day. Each dataset had its own taxonomic and spatial signature suggesting that volunteers’ personal recording preferences may attract them towards particular schemes. There were also patterns across datasets: species’ abundance and ease of identification were positively associated with number of records, as was plant height. We found clear hotspots of recording activity, the 10 most popular sites containing open water. We note that biases are accrued as part of the recording process (e.g. species’ detectability) as well as from volunteer preferences. An increased understanding of volunteer behaviour gained from analysing the composition of records could thus enhance the fit between volunteers’ interests and the needs of scientific projects.

City Research Online

UCL Discovery

PubMed Central

NERC Open Research Archive

EJT editorial standard for the semantic enhancement of specimen data in taxonomy literature

Author: Agosti Donat
Bénichou Laurence
Catapano Terry
Chester Chloë
Gérard Isabelle
Martens Koenraad
Sautter Guido
Publication venue: Museum National D'Histoire Naturelle
Publication date: 01/01/2019

This paper describes a set of guidelines for the citation of zoological and botanical specimens in the European Journal of Taxonomy. The guidelines stipulate controlled vocabularies and precise formats for presenting the specimens examined within a taxonomic publication, which allow for the rich data associated with the primary research material to be harvested, distributed and interlinked online via international biodiversity data aggregators. Herein we explain how the EJT editorial standard was defined and how this initiative fits into the journal's project to semantically enhance its publications using the Plazi TaxPub DTD extension. By establishing a standardised format for the citation of taxonomic specimens, the journal intends to widen the distribution of and improve accessibility to the data it publishes. Authors who conform to these guidelines will benefit from higher visibility and new ways of visualising their work. In a wider context, we hope that other taxonomy journals will adopt this approach to their publications, adapting their working methods to enable domain-specific text mining to take place. If specimen data can be efficiently cited, harvested and linked to wider resources, we propose that there is also the potential to develop alternative metrics for assessing impact and productivity within the natural sciences.

ZENODO

Ghent University Academic Bibliography

Directory of Open Access Journals

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Hochschulschriftenserver - Universität Frankfurt am Main

Simple identification tools in FishBase

Author: Atanacio Rachek
Bailly Nicolas
Froese Rainer
Reyes Jr. Rodolfo
Publication venue: EUT - Edizioni Università di Trieste
Publication date: 01/01/2010

Simple identification tools for fish species were included in the FishBase information system from its inception. Early tools made use of the relational model and characters like fin ray meristics. Soon pictures and drawings were added as a further help, similar to a field guide. Later came the computerization of existing dichotomous keys, again in combination with pictures and other information, and the ability to restrict possible species by country, area, or taxonomic group. Today, www.FishBase.org offers four different ways to identify species. This paper describes these tools with their advantages and disadvantages, and suggests various options for further development. It explores the possibility of a holistic and integrated computer-aided strategy.

OceanRep

OpenstarTs