    Repositories for Taxonomic Data: Where We Are and What is Missing

    Natural history collections are leading successful large-scale projects of specimen digitization (images, metadata, DNA barcodes), thereby transforming taxonomy into a big data science. Yet, little effort has been directed towards safeguarding and subsequently mobilizing the considerable amount of original data generated during the process of naming 15,000–20,000 species every year. From the perspective of alpha-taxonomists, we provide a review of the properties and diversity of taxonomic data, assess their volume and use, and establish criteria for optimizing data repositories. We surveyed 4113 alpha-taxonomic studies in representative journals for 2002, 2010, and 2018, and found an increasing yet comparatively limited use of molecular data in species diagnosis and description. In 2018, of the 2661 papers published in specialized taxonomic journals, molecular data were widely used in mycology (94%), regularly in vertebrates (53%), but rarely in botany (15%) and entomology (10%). Images play an important role in taxonomic research on all taxa, with photographs used in >80% and drawings in 58% of the surveyed papers. The use of omics (high-throughput) approaches or 3D documentation is still rare. Improved archiving strategies for metabarcoding consensus reads, genome and transcriptome assemblies, and chemical and metabolomic data could help to mobilize the wealth of high-throughput data for alpha-taxonomy. Because long-term (ideally perpetual) data storage is of particular importance for taxonomy, energy footprint reduction via less storage-demanding formats is a priority if their information content suffices for the purpose of taxonomic studies. Whereas taxonomic assignments are quasifacts for most biological disciplines, they remain hypotheses pertaining to evolutionary relatedness of individuals for alpha-taxonomy. For this reason, an improved reuse of taxonomic data, including machine-learning-based species identification and delimitation pipelines, requires a cyberspecimen approach: linking data via unique specimen identifiers, and thereby making them findable, accessible, interoperable, and reusable for taxonomic research. This poses both qualitative challenges to adapt the existing infrastructure of data centers to a specimen-centered concept and quantitative challenges to host and connect an estimated ≤2 million images produced per year by alpha-taxonomic studies, plus many millions of images from digitization campaigns. Of the 30,000–40,000 taxonomists globally, many are thought to be nonprofessionals, and capturing the data for online storage and reuse therefore requires low-complexity submission workflows and cost-free repository use. Expert taxonomists are the main stakeholders able to identify and formalize the needs of the discipline; their expertise is needed to implement the envisioned virtual collections of cyberspecimens. [Big data; cyberspecimen; new species; omics; repositories; specimen identifier; taxonomy; taxonomic data.]

    Completing a taxonomic puzzle: integrative review of geckos of the Paroedura bastardi species complex (Squamata, Gekkonidae)

    The Paroedura bastardi clade, a subgroup of the Madagascan gecko genus Paroedura, currently comprises four nominal species: P. bastardi, supposedly widely distributed in southern and western Madagascar; P. ibityensis, a montane endemic; and P. tanjaka and P. neglecta, both restricted to the central west region of the island. Previous work has shown that Paroedura bastardi is a species complex with several strongly divergent mitochondrial lineages. Based on one mitochondrial and two nuclear markers, plus detailed morphological data, we undertake an integrative revision of this species complex. Using a representative sampling for seven nuclear and five mitochondrial genes, we furthermore propose a phylogenetic hypothesis of relationships among the species in this clade. Our analyses reveal at least three distinct and independent evolutionary lineages currently referred to P. bastardi. Conclusive evidence for the species status of these lineages comes from multiple cases of syntopic occurrence without genetic admixture or morphological intermediates, suggesting reproductive isolation. We discuss the relevance of this line of evidence and the conditions under which concordant differentiation in unlinked loci under sympatry provides a powerful approach to species delimitation, and taxonomically implement our findings by (1) designating a lectotype for Paroedura bastardi, now restricted to the extreme South-East of Madagascar, (2) resurrecting the binomen Paroedura guibeae Dixon & Kroll, 1974, which is applied to the species predominantly distributed in the South-West, and (3) describing a third species, Paroedura rennerae sp. nov., which has the northernmost distribution within the species complex.

    Data storage and data re-use in taxonomy-the need for improved storage and accessibility of heterogeneous data

    Gemeinholzer B, Vences M, Beszteri B, et al. Data storage and data re-use in taxonomy-the need for improved storage and accessibility of heterogeneous data. Organisms Diversity & Evolution. 2020;20(1):1-8.
    The ability to rapidly generate and share molecular, visual, and acoustic data, and to compare them with existing information, and thereby to detect and name biological entities is fundamentally changing our understanding of evolutionary relationships among organisms and is also impacting taxonomy. Harnessing taxonomic data for rapid, automated species identification by machine learning tools or DNA metabarcoding techniques has great potential but will require their review, accessible storage, comprehensive comparison, and integration with prior knowledge and information. Currently, data production, management, and sharing in taxonomic studies are not keeping pace with these needs. Indeed, a survey of recent taxonomic publications provides evidence that few species descriptions in zoology and botany incorporate DNA sequence data. The use of modern high-throughput (-omics) data is so far the exception in alpha-taxonomy, although they are easily stored in GenBank and similar databases. By contrast, for the more routinely used image data, the problem is that they are rarely made available in openly accessible repositories. Improved sharing and re-using of both types of data requires institutions that maintain long-term data storage and capacity with workable, user-friendly but highly automated pipelines. Top priority should be given to standardization and pipeline development for the easy submission and storage of machine-readable data (e.g., images, audio files, videos, tables of measurements). The taxonomic community in Germany and the German Federation for Biological Data are researching options for a higher level of automation, improved linking among data submission and storage platforms, and for making existing taxonomic information more readily accessible.

    Rana guttulata Boulenger 1881

    Rana guttulata Boulenger, 1881. Lectotype. BMNH 1947.2.25.51, designated lectotype by Blommers-Schlösser and Blanc (1991), from the region of Betsileo (S.E. Betsileo), collected by Bartlett. Paralectotypes. Four specimens, BMNH 1947.2.25.48–50, BMNH 1947.2.25.52, with same collection locality and data as lectotype. Junior synonym. Rana pigra Mocquard, 1900. Holotype: MNHN 1899.410, from 'forêt d'Ikongo'. Referred material. For field numbers of additional specimens referred to M. guttulatus genetically, see Figure 1. For morphological measurements of types and five additional specimens in the ZSM collection, see Table 1. Remarks. Mantidactylus (M.) guttulatus is a large nocturnal stream-dwelling frog, distributed at elevations from 810 m a.s.l. (Vohidrazana) to ca. 1500 m a.s.l. (Antoetra). It is typically found in slow-moving parts of small streams in rainforest, and almost nothing is known about its natural history. Based on genetic data herein, confirmed localities are (from north to south) Fierenana, Andasibe, Maromizaha, Mangabe region, An'Ala and Vohidrazana in the Northern Central East, and Antoetra, Vohiparara, Ranomafana and Ivohibe in the Southern Central East of Madagascar (map in Figure 1). If Rana pigra is correctly assigned as a junior synonym to M. guttulatus, then a further locality would be Ikongo Forest. The definition of this species has had a very convoluted history, and many populations and specimens have intermittently been named M. guttulatus. A complete revision of all these uses in the literature is beyond the scope of this paper. Glaw and Vences (2007) defined populations from Tsaratanana as M. guttulatus, and calls and tadpoles of the lineage occurring at Tsaratanana were also described under this name (Vences et al. 2004; Schulze et al. 2016). However, this population corresponds to the new species M. radaka sp. nov. described below. The tadpole described by Altig and McDiarmid (2006) as M. guttulatus actually belongs to M. majori, a representative of the subgenus Hylobatrachus (Randrianiaina et al. 2011).
    Published as part of Rancilhac, Loïs, Bruy, Teddy, Scherz, Mark D., Pereira, Elvis Almeida, Preick, Michaela, Straube, Nicolas, Lyra, Mariana L., Ohler, Annemarie, Streicher, Jeffrey W., Andreone, Franco, Crottini, Angelica, Hutter, Carl R., Randrianantoandro, J. Christian, Rakotoarison, Andolalao, Glaw, Frank, Hofreiter, Michael & Vences, Miguel, 2020, Target-enriched DNA sequencing from historical type material enables a partial revision of the Madagascar giant stream frogs (genus Mantidactylus), pp. 87-118 in Journal of Natural History (J. Nat. Hist.) 54 (1-4) on page 106, DOI: 10.1080/00222933.2020.1748243, http://zenodo.org/record/502065

    Mantidactylus (Mantidactylus) grandidieri Mocquard 1895

    Mantidactylus (Mantidactylus) grandidieri Mocquard, 1895. Syntypes. Two specimens, MNHN 1883.580 and MNHN 1895.255, collected by Humblot and Grandidier, from 'Madagascar... côte Est'. Referred material. For field numbers of additional specimens referred to M. grandidieri genetically, see Figure 1. For morphological measurements of types and six additional specimens in the ZSM collection, see Table 1. Remarks. Mantidactylus (M.) grandidieri is a large nocturnal stream-dwelling frog, distributed at elevations from near sea level (Nosy Mangabe) to ca. 1500 m a.s.l. (Ambohitantely), but appears to be more common at low elevations <900 m a.s.l. It is found in slow-moving parts of streams in rainforest, often among rocks and boulders, and almost nothing is known about its natural history. Based on genetic data herein, confirmed localities are (from north to south) Marojejy, Sambava, Besariaka, Tsararano, Antalaha, Masoala, Ilampy, Ambodivoangy, Andranofotsy, Nosy Mangabe, Angozongahy (west slope of Makira Reserve), Antsahataloka, Ambatoroma, Ambohitantely, Ambatodisakoana, Fierenana, Vohidrazana, Moramanga and the Mangabe region. As with M. guttulatus, the definition of this species has changed multiple times in the past. A complete revision of all these uses in the literature is beyond the scope of this paper. Glaw and Vences (2007), partially following Blommers-Schlösser and Blanc (1991), used the name M. grandidieri primarily to refer to highland populations in the Northern Central East and Southern Central East, which, according to the present revision, are to be referred to as M. guttulatus. Instead, populations of M. grandidieri were named Mantidactylus sp. aff. grandidieri 'North' by Glaw and Vences (2007), Mantidactylus sp. 57 by Vieites et al. (2009) and Mantidactylus sp. Ca57 by Perl et al. (2014).
    Published as part of Rancilhac, Loïs, Bruy, Teddy, Scherz, Mark D., Pereira, Elvis Almeida, Preick, Michaela, Straube, Nicolas, Lyra, Mariana L., Ohler, Annemarie, Streicher, Jeffrey W., Andreone, Franco, Crottini, Angelica, Hutter, Carl R., Randrianantoandro, J. Christian, Rakotoarison, Andolalao, Glaw, Frank, Hofreiter, Michael & Vences, Miguel, 2020, Target-enriched DNA sequencing from historical type material enables a partial revision of the Madagascar giant stream frogs (genus Mantidactylus), pp. 87-118 in Journal of Natural History (J. Nat. Hist.) 54 (1-4) on page 107, DOI: 10.1080/00222933.2020.1748243, http://zenodo.org/record/502065