11 research outputs found

    A Taxonomic Search Engine: Federating taxonomic databases using web services

    Get PDF
    BACKGROUND: The taxonomic name of an organism is a key link between different databases that store information on that organism. However, in the absence of a single, comprehensive database of organism names, individual databases lack an easy means of checking the correctness of a name. Furthermore, the same organism may have more than one name, and the same name may apply to more than one organism. RESULTS: The Taxonomic Search Engine (TSE) is a web application written in PHP that queries multiple taxonomic databases (ITIS, Index Fungorum, IPNI, NCBI, and uBIO) and summarises the results in a consistent format. It supports "drill-down" queries to retrieve a specific record. The TSE can optionally suggest alternative spellings the user can try. It also acts as a Life Science Identifier (LSID) authority for the source taxonomic databases, providing globally unique identifiers (and associated metadata) for each name. CONCLUSION: The Taxonomic Search Engine is available at and provides a simple demonstration of the potential of the federated approach to providing access to taxonomic names

    An edit script for taxonomic classifications

    Get PDF
    BACKGROUND: The NCBI taxonomy provides one of the most powerful ways to navigate sequence data bases but currently users are forced to formulate queries according to a single taxonomic classification. Given that there is not universal agreement on the classification of organisms, providing a single classification places constraints on the questions biologists can ask. However, maintaining multiple classifications is burdensome in the face of a constantly growing NCBI classification. RESULTS: In this paper, we present a solution to the problem of generating modifications of the NCBI taxonomy, based on the computation of an edit script that summarises the differences between two classification trees. Our algorithms find the shortest possible edit script based on the identification of all shared subtrees, and only take time quasi linear in the size of the trees because classification trees have unique node labels. CONCLUSION: These algorithms have been recently implemented, and the software is freely available for download from

    Comparative tests of ectoparasite species richness in seabirds

    Get PDF
    Background: The diversity of parasites attacking a host varies substantially among different host species. Understanding the factors that explain these patterns of parasite diversity is critical to identifying the ecological principles underlying biodiversity. Seabirds (Charadriiformes, Pelecaniformes and Procellariiformes) and their ectoparasitic lice (Insecta: Phthiraptera) are ideal model groups in which to study correlates of parasite species richness. We evaluated the relative importance of morphological (body size, body weight, wingspan, bill length), life-history (longevity, clutch size), ecological (population size, geographical range) and behavioural (diving versus non-diving) variables as predictors of louse diversity on 413 seabird hosts species. Diversity was measured at the level of louse suborder, genus, and species, and uneven sampling of hosts was controlled for using literature citations as a proxy for sampling effort. Results: The only variable consistently correlated with louse diversity was host population size and to a lesser extent geographic range. Other variables such as clutch size, longevity, morphological and behavioural variables including body mass showed inconsistent patterns dependent on the method of analysis. Conclusion: The comparative analysis presented herein is (to our knowledge) the first to test correlates of parasite species richness in seabirds. We believe that the comparative data and phylogeny provide a valuable framework for testing future evolutionary hypotheses relating to the diversity and distribution of parasites on seabirds

    The shape of human gene family phylogenies

    Get PDF
    BACKGROUND: The shape of phylogenetic trees has been used to make inferences about the evolutionary process by comparing the shapes of actual phylogenies with those expected under simple models of the speciation process. Previous studies have focused on speciation events, but gene duplication is another lineage splitting event, analogous to speciation, and gene loss or deletion is analogous to extinction. Measures of the shape of gene family phylogenies can thus be used to investigate the processes of gene duplication and loss. We make the first systematic attempt to use tree shape to study gene duplication using human gene phylogenies. RESULTS: We find that gene duplication has produced gene family trees significantly less balanced than expected from a simple model of the process, and less balanced than species phylogenies: the opposite to what might be expected under the 2R hypothesis. CONCLUSION: While other explanations are plausible, we suggest that the greater imbalance of gene family trees than species trees is due to the prevalence of tandem duplications over regional duplications during the evolution of the human genome

    Extracting scientific articles from a large digital archive: BioStor and the Biodiversity Heritage Library

    Get PDF
    Background: The Biodiversity Heritage Library (BHL) is a large digital archive of legacy biological literature, comprising over 31 million pages scanned from books, monographs, and journals. During the digitisation process basic metadata about the scanned items is recorded, but not article-level metadata. Given that the article is the standard unit of citation, this makes it difficult to locate cited literature in BHL. Adding the ability to easily find articles in BHL would greatly enhance the value of the archive. Description: A service was developed to locate articles in BHL based on matching article metadata to BHL metadata using approximate string matching, regular expressions, and string alignment. This article locating service is exposed as a standard OpenURL resolver on the BioStor web site http://biostor.org/openurl/. This resolver can be used on the web, or called by bibliographic tools that support OpenURL. Conclusions: BioStor provides tools for extracting, annotating, and visualising articles from the Biodiversity Heritage Library. BioStor is available from http://biostor.org

    Cospeciation

    No full text

    LSID Tester, a tool for testing Life Science Identifier resolution services-0

    No full text
    15.<p><b>Copyright information:</b></p><p>Taken from "LSID Tester, a tool for testing Life Science Identifier resolution services"</p><p>http://www.scfbm.org/content/3/1/2</p><p>Source Code for Biology and Medicine 2008;3():2-2.</p><p>Published online 18 Feb 2008</p><p>PMCID:PMC2276318.</p><p></p

    LSID Tester, a tool for testing Life Science Identifier resolution services

    Get PDF
    BackgroundLife Science Identifiers (LSIDs) are persistent, globally unique identifiers for biological objects. The decentralised nature of LSIDs makes them attractive for identifying distributed resources. Data of interest to biodiversity researchers (including specimen records, images, taxonomic names, and DNA sequences) are distributed over many different providers, and this community has adopted LSIDs as the identifier of choice

    TBMap: a taxonomic perspective on the phylogenetic database TreeBASE

    Get PDF
    &lt;i&gt;Background&lt;/i&gt; TreeBASE is currently the only available large-scale database of published organismal phylogenies. Its utility is hampered by a lack of taxonomic consistency, both within the database, and with names of organisms in external genomic, specimen, and taxonomic databases. The extent to which the phylogenetic knowledge in TreeBASE becomes integrated with these other sources is limited by this lack of consistency. &lt;i&gt;Description&lt;/i&gt; Taxonomic names in TreeBASE were mapped onto names in the external taxonomic databases IPNI, ITIS, NCBI, and uBio, and graph G of these mappings was constructed. Additional edges representing taxonomic synonymies were added to G, then all components of G were extracted. These components correspond to "name clusters", and group together names in TreeBASE that are inferred to refer to the same taxon. The mapping to NCBI enables hierarchical queries to be performed, which can improve TreeBASE information retrieval by an order of magnitude. &lt;i&gt;Conclusion&lt;/i&gt; TBMap database provides a mapping of the bulk of the names in TreeBASE to names in external taxonomic databases, and a clustering of those mappings into sets of names that can be regarded as equivalent. This mapping enables queries and visualisations that cannot otherwise be constructed
    corecore