A major challenge facing biodiversity informatics is integrating data stored in widely distributed databases. Initial efforts have relied on taxonomic names as the shared identifier linking records in different databases. However, taxonomic names have limitations as identifiers, being neither stable nor globally unique, and the pace of molecular taxonomic and phylogenetic research means that much of the information in public sequence databases is not linked to formal taxonomic names. This review explores the use of other identifiers, such as specimen codes and GenBank accession numbers, to link otherwise disconnected facts in different databases. The structure of these links can also be exploited using the PageRank algorithm to rank the results of searches on biodiversity databases. The key to rich integration is a commitment to deploy and reuse globally unique, shared identifiers, such as Digital Object Identifiers (DOIs) and Life Science Identifiers (LSIDs), and the implementation of services that link those identifiers.
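As a rough illustration of ranking by link structure, the sketch below runs a plain power-iteration PageRank over a tiny graph of linked records. It is a minimal sketch, not the method used in the review: the identifiers, the `pagerank` function, the damping factor of 0.85, and the iteration count are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict mapping node -> list of linked nodes."""
    # Collect every node, including ones that only appear as link targets.
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)

    # Start from a uniform distribution.
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives the same baseline "teleport" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Split this node's rank evenly across its outgoing links.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node with no outgoing links spreads its rank over all nodes.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical identifiers, invented for illustration only.
links = {
    "specimen:EXAMPLE-0001": ["sequence:XX000001", "publication:doi-example-1"],
    "sequence:XX000001": ["specimen:EXAMPLE-0001", "publication:doi-example-1"],
    "image:example-1": ["specimen:EXAMPLE-0001"],
    "publication:doi-example-1": [],
}

ranks = pagerank(links)
for identifier in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{identifier}\t{ranks[identifier]:.4f}")
```

In this toy graph the image record, which nothing links to, ends up with the lowest score, while records that other records point to score higher. A search service could use such scores to order its results.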

English

Roderic Page

Nature Precedings

Biodiversity informatics: the challenge of linking data and the role of shared identifiers

Background: Linking together the data of interest to biodiversity researchers (including specimen records, images, taxonomic names, and DNA sequences) requires services that can mint, resolve, and discover globally unique identifiers (including, but not limited to, DOIs,
