28 research outputs found

    Research applications of primary biodiversity databases in the digital age

    Our world is in the midst of unprecedented change: climate shifts and sustained, widespread habitat degradation have led to dramatic declines in biodiversity rivaling historical extinction events. At the same time, new approaches to publishing and integrating previously disconnected data resources promise to help provide the evidence needed for more efficient and effective conservation and management. Stakeholders have invested considerable resources to contribute to online databases of species occurrences. However, estimates suggest that only 10% of biocollections are available in digital form. The biocollections community must therefore continue to promote digitization efforts, which in part requires demonstrating compelling applications of the data. Our overarching goal is therefore to determine trends in use of mobilized species occurrence data since 2010, as online systems have grown and now provide over one billion records. To do this, we characterized 501 papers that use openly accessible biodiversity databases. Our standardized tagging protocol was based on key topics of interest, including: database(s) used, taxa addressed, general uses of data, other data types linked to species occurrence data, and data quality issues addressed. We found that the most common uses of online biodiversity databases have been to estimate species distribution and richness, to outline data compilation and publication, and to assist in developing species checklists or describing new species. Only 69% of papers in our dataset addressed one or more aspects of data quality, which is low considering common errors and biases known to exist in opportunistic datasets. Globally, we find that biodiversity databases are still in the initial stages of data compilation. Novel and integrative applications are restricted to certain taxonomic groups and regions with higher numbers of quality records. Continued data digitization, publication, enhancement, and quality control efforts are necessary to make biodiversity science more efficient and relevant in our fast-changing environment.

    Methods for broad-scale plant phenology assessments using citizen scientists’ photographs

    © 2020 Barve et al. Applications in Plant Sciences is published by Wiley Periodicals, Inc. on behalf of the Botanical Society of America. Premise: Citizen science platforms for sharing photographed digital vouchers, such as iNaturalist, are a promising source of phenology data, but methods and best practices for use have not been developed. Here we introduce methods using Yucca flowering phenology as a case study, because drivers of Yucca phenology are not well understood despite the need to synchronize flowering with obligate pollinators. There is also evidence of recent anomalous winter flowering events, but with unknown spatiotemporal extents. Methods: We collaboratively developed a rigorous, consensus-based approach for annotating and sharing whole plant and flower presence data from iNaturalist and applied it to Yucca records. We compared spatiotemporal flowering coverage from our annotations with other broad-scale monitoring networks (e.g., the National Phenology Network) in order to determine the unique value of photograph-based citizen science resources. Results: Annotations from iNaturalist were uniquely able to delineate extents of unusual flowering events in Yucca. These events, which occurred in two different regions of the Desert Southwest, did not appear to disrupt the typical-period flowering. Discussion: Our work demonstrates that best-practice approaches to scoring iNaturalist records provide fine-scale delimitation of phenological events. This approach can be applied to other plant groups to better understand how phenology responds to changing climate.

    Machine learning using digitized herbarium specimens to advance phenological research

    Machine learning (ML) has great potential to drive scientific discovery by harvesting data from images of herbarium specimens (preserved plant material curated in natural history collections), but ML techniques have only recently been applied to this rich resource. ML has particularly strong prospects for the study of plant phenological events such as growth and reproduction. As a major indicator of climate change, driver of ecological processes, and critical determinant of plant fitness, plant phenology is an important frontier for the application of ML techniques for science and society. In the present article, we describe a generalized, modular ML workflow for extracting phenological data from images of herbarium specimens, and we discuss the advantages, limitations, and potential future improvements of this workflow. Strategic research and investment in specimen-based ML methods, along with the aggregation of herbarium specimen data, may give rise to a better understanding of life on Earth.

    Developing a vocabulary and ontology for modeling insect natural history data: example data, use cases, and competency questions

    Insects are possibly the most taxonomically and ecologically diverse class of multicellular organisms on Earth. Consequently, they provide nearly unlimited opportunities to develop and test ecological and evolutionary hypotheses. Currently, however, large-scale studies of insect ecology, behavior, and trait evolution are impeded by the difficulty in obtaining and analyzing data derived from natural history observations of insects. These data are typically highly heterogeneous and widely scattered among many sources, which makes developing robust information systems to aggregate and disseminate them a significant challenge. As a step towards this goal, we report initial results of a new effort to develop a standardized vocabulary and ontology for insect natural history data. In particular, we describe a new database of representative insect natural history data derived from multiple sources (but focused on data from specimens in biological collections), an analysis of the abstract conceptual areas required for a comprehensive ontology of insect natural history data, and a database of use cases and competency questions to guide the development of data systems for insect natural history data. We also discuss data modeling and technology-related challenges that must be overcome to implement robust integration of insect natural history data.

    Test Phenology Annotations From Herbarium Specimen Dataset - Prunus serotina

    This is a test dataset of phenology annotations for black cherry (Prunus serotina), used in the submitted manuscript: Brenskelle, L., B. Stucky, J. Deck, R. Walls, and R. P. Guralnick [submitted]. Integrating herbarium specimen observations into global phenology data systems. Applications in Plant Sciences.

    Methods, New Software Tools, and Best Practices for Developing High-quality Training Data for Machine Learning-based Image Analysis in Biodiversity Research

    Recent progress in using deep learning techniques to automate the analysis of complex image data is opening up exciting new avenues for research in biodiversity science. However, potential applications of machine learning methods in biodiversity research are often limited by the relative scarcity of data suitable for training machine learning models. Developing high-quality training data sets can be a surprisingly challenging task that can easily consume hundreds of person-hours. In this talk, we present the results of our recent work implementing and comparing several different methods for generating annotated, biodiversity-oriented image data for training machine learning models, including collaborative expert scoring, local volunteer image annotators with on-site training, and distributed, remote image annotation via citizen science platforms. We discuss error rates, among-annotator variance, and depth of coverage required to ensure highly reliable image annotations. We also discuss time considerations and efficiency of the various methods. Finally, we present new software, called ImageAnt (currently under development), that supports efficient, highly flexible image annotation workflows. ImageAnt was created primarily in response to the challenges we discovered in our own efforts to generate image-based training data for machine learning models. ImageAnt features a simple user interface and can be used to implement sophisticated, adaptive scripting of image annotation tasks.

    Extending Darwin Core to incorporate data about material condition and absolute deep time

    As part of efforts to mobilize zooarchaeological collections data, there is a strong need for new terms that can extend the Darwin Core standard in order to describe material condition, preparation history, and chronology. These data are important for understanding the full context of specimens from an array of natural and cultural heritage disciplines, especially those involving deep time, such as paleontology and zooarchaeology. These disciplines offer pre- and early Anthropocene biodiversity baselines to recognize and understand the deep history of human-environment interactions and use this information for research and conservation. They also provide an invaluable perspective about climate change in the past, thus providing insight into future climate change and impacts on ecology and biodiversity. We propose two new extensions: one for chronology, and one for material condition. While Darwin Core does currently accommodate data about lithostratigraphy, the chronology extension will allow for sharing of absolute dates and dating protocols. Additionally, the material condition extension will provide proper means for sharing data that, thus far, have been lumped under the Darwin Core term 'preparations', which limits their discoverability to users. This includes data about skeletal elements, taphonomy, and preparation history. Here we present these two extensions to Darwin Core and open the discussion about improving the proposed terms and definitions in the extensions we developed.

    Published examples using the new Chronometric extension to Darwin Core

    The temporality of specimens is an often overlooked but quintessential part of using aggregated biodiversity occurrences for research, especially when millions of these occurrences exist in deep time. Presently in Darwin Core, there are terms for describing the geological context of specimens, which is needed for paleontological specimens. However, information about the contextual absolute date associated with a specimen, and how that date was generated, is not supported in Darwin Core, although it would strongly enhance usability for research. Providers do occasionally try to provide this information, but it is currently hidden in a few different Darwin Core fields, making it hard to discover and nearly impossible to search for in biodiversity portals. Here we provide an overview of where absolute date content for paleontological and archaeological specimens is currently found in published specimen records. We will then introduce a working Darwin Core extension that focuses on chronometric content, and demonstrate the use of this extension with published datasets from the zooarchaeological and paleontological communities. This new advancement will allow providers to make these crucial data available, and will allow researchers to easily find the temporal range associated with an occurrence, evaluate how this range was determined, and compile occurrences based on their shared ages to help streamline the research process.