102 research outputs found

    NeXML: Rich, Extensible, and Verifiable Representation of Comparative Data and Metadata

    In scientific research, integration and synthesis require a common understanding of where data come from, how much they can be trusted, and what they may be used for. To make such an understanding computer-accessible requires standards for exchanging richly annotated data. The challenges of conveying reusable data are particularly acute in regard to evolutionary comparative analysis, which comprises an ever-expanding list of data types, methods, research aims, and subdisciplines. To facilitate interoperability in evolutionary comparative analysis, we present NeXML, an XML standard (inspired by the current standard, NEXUS) that supports exchange of richly annotated comparative data. NeXML defines syntax for operational taxonomic units, character-state matrices, and phylogenetic trees and networks. Documents can be validated unambiguously. Importantly, any data element can be annotated, to an arbitrary degree of richness, using a system that is both flexible and rigorous. We describe how the use of NeXML by the TreeBASE and Phenoscape projects satisfies user needs that cannot be satisfied with other available file formats. By relying on XML Schema Definition, the design of NeXML facilitates the development and deployment of software for processing, transforming, and querying documents. The adoption of NeXML for practical use is facilitated by the availability of (1) an online manual with code samples and a reference to all defined elements and attributes, (2) programming toolkits in most of the languages used commonly in evolutionary informatics, and (3) input–output support in several widely used software applications. An active, open, community-based development process enables future revision and expansion of NeXML.
    R.A.V. received support from the CIPRES project (NSF #EF-03314953 to W.P.M.), the FP7 Marie Curie Programme (Call FP7-PEOPLE-IEF-2008, Proposal No. 237046) and, for the NeXML implementation in TreeBASE, the pPOD project (NSF IIS 0629846); P.E.M. and J.S. received support from CIPRES (NSF #EF-0331495, #EF-0715370); M.T.H. was supported by NSF (DEB-ATOL-0732920); X.X. received support from NSERC (Canada) Discovery and RTI grants; W.P.M. received support from an NSERC (Canada) Discovery grant; J.C. received support from a Google Summer of Code 2007 grant; A.P. received support from a Google Summer of Code 2010 grant.
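As a rough illustration of the kind of structure the abstract describes (operational taxonomic units linked to the tips of a tree), the sketch below reads a small hand-written document with Python's standard `xml.etree.ElementTree`. The document is a simplified assumption: it borrows the `otus`, `otu`, `trees`, `tree`, `node`, and `edge` element names and the `http://www.nexml.org/2009` namespace, but omits attributes a schema-valid file would need, so it should not be read as a conforming NeXML file. The taxon labels and the helper names `otu_labels` and `tip_lengths` are illustrative only.

```python
import xml.etree.ElementTree as ET

NS = "http://www.nexml.org/2009"

# Hand-written, simplified example; NOT validated against the NeXML schema.
DOC = """<nex:nexml xmlns:nex="http://www.nexml.org/2009"
           xmlns="http://www.nexml.org/2009" version="0.9">
  <otus id="tax1">
    <otu id="t1" label="Mustela erminea"/>
    <otu id="t2" label="Lutra lutra"/>
  </otus>
  <trees id="trees1" otus="tax1">
    <tree id="tree1">
      <node id="n1" root="true"/>
      <node id="n2" otu="t1"/>
      <node id="n3" otu="t2"/>
      <edge id="e1" source="n1" target="n2" length="1.0"/>
      <edge id="e2" source="n1" target="n3" length="2.0"/>
    </tree>
  </trees>
</nex:nexml>"""


def q(tag):
    """Return the namespace-qualified tag name ElementTree expects."""
    return f"{{{NS}}}{tag}"


def otu_labels(root):
    """Map each OTU id to its human-readable label."""
    return {otu.get("id"): otu.get("label") for otu in root.iter(q("otu"))}


def tip_lengths(root):
    """Map each tip's OTU label to the length of the edge leading to it."""
    labels = otu_labels(root)
    result = {}
    for tree in root.iter(q("tree")):
        # Only nodes that reference an OTU are tips in this sketch.
        node_to_otu = {
            n.get("id"): n.get("otu")
            for n in tree.iter(q("node"))
            if n.get("otu")
        }
        for edge in tree.iter(q("edge")):
            otu_id = node_to_otu.get(edge.get("target"))
            if otu_id is not None:
                result[labels[otu_id]] = float(edge.get("length"))
    return result


root = ET.fromstring(DOC)
print(otu_labels(root))
print(tip_lengths(root))
```

Because tree nodes point at OTUs by id rather than repeating the taxon name, annotations attached to an OTU apply wherever that OTU is used, which is one reason an id-based XML layout is easier to process than free-form text blocks.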

    Converting Endangered Species Categories to Probabilities of Extinction for Phylogenetic Conservation Prioritization

    Categories of imperilment like the global IUCN Red List have been transformed to probabilities of extinction and used to rank species by the amount of imperiled evolutionary history they represent (e.g. by the EDGE of Existence programme). We investigate the stability of such lists when ranks are converted to probabilities of extinction under different scenarios. Using a simple example and computer simulation, we show that preserving the categories when converting such list designations to probabilities of extinction does not guarantee the stability of the resulting lists. Care must be taken when choosing a suitable transformation, especially if conservation dollars are allocated to species in a ranked fashion. We advocate routine sensitivity analyses.
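The instability described in this abstract can be reproduced with a toy calculation. The sketch below is not the authors' example: the two category-to-probability tables, the three species, and their distinctiveness values are all hypothetical, and the score (probability of extinction times evolutionary distinctiveness) is a simplified stand-in for an expected-loss metric. Both tables keep the Red List categories in the same order, yet they produce different priority lists.

```python
# Red List categories from least to most threatened.
CATEGORIES = ["LC", "NT", "VU", "EN", "CR"]

# Two hypothetical transformations; both increase with threat category.
STEEP = {"LC": 0.025, "NT": 0.05, "VU": 0.1, "EN": 0.2, "CR": 0.4}
FLAT = {"LC": 0.2, "NT": 0.3, "VU": 0.4, "EN": 0.5, "CR": 0.6}

# Hypothetical species: (category, evolutionary distinctiveness).
SPECIES = {
    "A": ("CR", 10.0),
    "B": ("VU", 30.0),
    "C": ("EN", 14.0),
}


def is_order_preserving(mapping):
    """True if a more threatened category always gets a higher probability."""
    return all(
        mapping[a] < mapping[b] for a, b in zip(CATEGORIES, CATEGORIES[1:])
    )


def rank(mapping):
    """Rank species by probability of extinction times distinctiveness."""
    scores = {
        name: mapping[cat] * ed for name, (cat, ed) in SPECIES.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


print(is_order_preserving(STEEP), is_order_preserving(FLAT))
print(rank(STEEP))
print(rank(FLAT))
```

Running the ranking under several plausible transformations and comparing the resulting lists, as done here with two, is a minimal form of the sensitivity analysis the abstract recommends.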

    Improving phylogeny reconstruction at the strain level using peptidome datasets

    Typical bacterial strain differentiation methods are often challenged by high genetic similarity between strains. To address this problem, we introduce a novel in silico peptide fingerprinting method based on conventional wet-lab protocols that enables the identification of potential strain-specific peptides. These can be further investigated using in vitro approaches, laying a foundation for the development of biomarker detection and application-specific methods. This novel method aims at reducing large amounts of comparative peptide data to binary matrices while maintaining a high phylogenetic resolution. The underlying case study concerns the Bacillus cereus group, namely the differentiation of Bacillus thuringiensis, Bacillus anthracis and Bacillus cereus strains. Results show that trees based on cytoplasmic and extracellular peptidomes are only marginally in conflict with those based on whole proteomes, as inferred by the established Genome-BLAST Distance Phylogeny (GBDP) method. Hence, these results indicate that the two approaches can most likely be used complementarily even in other organismal groups. The obtained results confirm previous reports about the misclassification of many strains within the B. cereus group. Moreover, our method was able to separate the B. anthracis strains with high resolution, similarly to the GBDP results as benchmarked via Bayesian inference and both Maximum Likelihood and Maximum Parsimony. In addition to the presented phylogenomic applications, whole-peptide fingerprinting might also become a valuable complementary technique to digital DNA-DNA hybridization, notably for bacterial classification at the species and subspecies level in the future.
    This research was funded by Grant AGL2013-44039-R from the Spanish “Plan Estatal de I+D+I”, and by Grant EM2014/046 from the “Plan Galego de investigación, innovación e crecemento 2011-2015”. BS was the recipient of a Ramón y Cajal postdoctoral contract from the Spanish Ministry of Economy and Competitiveness. This work was also partially funded by the [14VI05] Contract-Programme from the University of Vigo and the Agrupamento INBIOMED from DXPCTSUG-FEDER “unha maneira de facer Europa” (2012/273). The research leading to these results has also received funding from the European Union’s Seventh Framework Programme FP7/REGPOT-2012-2013.1 under grant agreement no. 316265, BIOCAPS. This document reflects only the authors’ views and the European Union is not liable for any use that may be made of the information contained herein. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
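The step of reducing comparative peptide data to a binary matrix can be sketched in a few lines. Everything concrete below is an assumption made for illustration: the strain names and peptide strings are invented, and the Jaccard distance is a generic choice for comparing presence/absence rows, not necessarily the measure used in the study.

```python
# Hypothetical in silico peptide sets per strain (invented sequences).
PEPTIDOMES = {
    "strain_1": {"AAK", "GLR", "MTK"},
    "strain_2": {"AAK", "GLR", "VEK"},
    "strain_3": {"AAK", "QPR"},
}


def binary_matrix(peptidomes):
    """Return (sorted peptide list, {strain: 0/1 presence row})."""
    peptides = sorted(set().union(*peptidomes.values()))
    rows = {
        strain: [1 if p in found else 0 for p in peptides]
        for strain, found in peptidomes.items()
    }
    return peptides, rows


def jaccard_distance(a, b):
    """1 - |shared| / |present in either| for two 0/1 rows."""
    both = sum(x & y for x, y in zip(a, b))
    either = sum(x | y for x, y in zip(a, b))
    return 1 - both / either if either else 0.0


def strain_specific(peptidomes, strain):
    """Peptides present in `strain` and in no other strain."""
    others = set().union(
        *(found for name, found in peptidomes.items() if name != strain)
    )
    return peptidomes[strain] - others


peptides, rows = binary_matrix(PEPTIDOMES)
print(peptides)
for strain, row in rows.items():
    print(strain, row)
print(jaccard_distance(rows["strain_1"], rows["strain_2"]))
print(strain_specific(PEPTIDOMES, "strain_3"))
```

A pairwise distance table built this way could be passed to a distance-based tree method, while the raw 0/1 rows are the kind of input character-based methods such as maximum parsimony operate on.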

    Updating the evolutionary history of Carnivora (Mammalia): a new species-level supertree complete with divergence time estimates

    Multigene phylogeny of the Mustelidae: Resolving relationships, tempo and biogeographic history of a mammalian adaptive radiation

    Background: Adaptive radiation, the evolution of ecological and phenotypic diversity from a common ancestor, is a central concept in evolutionary biology and characterizes the evolutionary histories of many groups of organisms. One such group is the Mustelidae, the most species-rich family within the mammalian order Carnivora, encompassing 59 species classified into 22 genera. Extant mustelids display extensive ecomorphological diversity, with different lineages having evolved into an array of adaptive zones, from fossorial badgers to semi-aquatic otters. Mustelids are also widely distributed, with multiple genera found on different continents. As with other groups that have undergone adaptive radiation, resolving the phylogenetic history of mustelids presents a number of challenges because ecomorphological convergence may potentially confound morphologically based phylogenetic inferences, and because adaptive radiations often include one or more periods of rapid cladogenesis that require a large amount of data to resolve.
    Results: We constructed a nearly complete generic-level phylogeny of the Mustelidae using a data matrix comprising 22 gene segments (~12,000 base pairs) analyzed with maximum parsimony, maximum likelihood and Bayesian inference methods. We show that mustelids are consistently resolved with high nodal support into four major clades and three monotypic lineages. Using Bayesian dating techniques, we provide evidence that mustelids underwent two bursts of diversification that coincide with major paleoenvironmental and biotic changes that occurred during the Neogene and correspond with similar bursts of cladogenesis in other vertebrate groups. Biogeographical analyses indicate that most of the extant diversity of mustelids originated in Eurasia and mustelids have colonized Africa, North America and South America on multiple occasions.
    Conclusion: Combined with information from the fossil record, our phylogenetic and dating analyses suggest that mustelid diversification may have been spurred by a combination of faunal turnover events and diversification at lower trophic levels, ultimately caused by climatically driven environmental changes. Our biogeographic analyses show Eurasia as the center of origin of mustelid diversity and that mustelids in Africa, North America and South America have been assembled over time largely via dispersal, which has important implications for understanding the ecology of mustelid communities.