Approaches to estimating the universe of natural history collections data
This contribution explores the problem of recognizing and measuring the universe of specimen-level data existing in Natural History Collections around the world, in the absence of a complete, world-wide census or register. Estimates of size seem necessary to plan resource allocation for digitization or data capture, and may help represent how much vouchered primary biodiversity data (in terms of collections, specimens or curatorial units) might remain to be mobilized.
Three general approaches are proposed for further development, and initial estimates are given. Probabilistic models involve crossing data from a set of biodiversity datasets, finding commonalities and estimating the likelihood of totally obscure data from the fraction of known data missing from specific datasets in the set. Distribution models aim to find the underlying distribution of collections’ compositions and to infer the hidden (occult) sector of those distributions. Finally, case studies seek to compare digitized data from collections known to the world to the amount of data known to exist in the collection but not generally available or not digitized.
Preliminary estimates range from 1.2 to 2.1 gigaunits, of which a mere 3% at most is currently web-accessible through GBIF’s mobilization efforts. However, further data and analyses, along with other approaches relying more heavily on surveys, might change the picture and possibly help narrow the estimate. In particular, unknown collections that have not emerged through the literature are the major source of uncertainty.
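The probabilistic approach described in this abstract resembles capture-recapture estimation across overlapping lists. As a rough illustration only, the sketch below applies the textbook two-list Lincoln-Petersen estimator to two hypothetical registries of collection codes. This is a swapped-in standard technique, not necessarily the model the authors used, and the registry contents are invented for the example.

```python
def lincoln_petersen(list_a, list_b):
    """Estimate the total population size from two overlapping lists.

    Assumes both lists are independent samples of the same population,
    which real collection registries are unlikely to satisfy exactly.
    """
    a = set(list_a)
    b = set(list_b)
    overlap = len(a & b)
    if overlap == 0:
        raise ValueError("no overlap between the lists; estimate is undefined")
    return len(a) * len(b) / overlap


def estimate_unseen(list_a, list_b):
    """Estimated number of items that appear in neither list."""
    total = lincoln_petersen(list_a, list_b)
    known = len(set(list_a) | set(list_b))
    return total - known


# Hypothetical collection codes, for illustration only.
registry_a = ["C1", "C2", "C3", "C4", "C5", "C6"]
registry_b = ["C4", "C5", "C6", "C7", "C8"]

print(lincoln_petersen(registry_a, registry_b))  # 6 * 5 / 3 = 10.0
print(estimate_unseen(registry_a, registry_b))   # 10.0 - 8 known = 2.0
```

With more than two datasets, log-linear or similar multi-list models would be the usual extension; the two-list case is shown only because it makes the reasoning visible.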
Bridging the biodiversity data gaps: Recommendations to meet users’ data needs
A strong case has been made for freely available, high quality data on species occurrence, in order to track changes in biodiversity. However, one of the main issues surrounding the provision of such data is that sources vary in quality, scope, and accuracy. Publishers of such data therefore face the challenge of maximizing quality, utility and breadth of data coverage, in order to make such data useful to users. Here, we report a number of recommendations that stem from a content needs assessment survey conducted by the Global Biodiversity Information Facility (GBIF). Through this survey, we aimed to distil the main user needs regarding biodiversity data. We find a broad range of recommendations from the survey respondents, principally concerning issues such as data quality, bias, and coverage, and extending ease of access. We recommend a candidate set of actions for GBIF that fall into three classes: 1) addressing data gaps, data volume, and data quality, 2) aggregating new kinds of data for new applications, and 3) promoting ease-of-use and providing incentives for wider use. Addressing the challenge of providing high quality primary biodiversity data can potentially serve the needs of many international biodiversity initiatives, including the new 2020 biodiversity targets of the Convention on Biological Diversity, the emerging global biodiversity observation network (GEO BON), and the new Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES).
Occurrence cubes: a new paradigm for aggregating species occurrence data
In this paper we describe a method of aggregating species occurrence data into what we have termed “occurrence cubes”. The aggregated data can be perceived as a cube with three dimensions (taxonomic, temporal and geographic) that takes into account the spatial uncertainty of each occurrence. The aggregation level of each of the three dimensions can be adapted to the scope. Built on Open Science principles, the method is easily automated and reproducible, and can be used for species trend indicators, maps and distribution models. We are using the method to aggregate species occurrence data for Europe per taxon, year and 1 km² European reference grid cell, to feed indicators and risk mapping/modelling for the Tracking Invasive Alien Species (TrIAS) project.
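The aggregation described in this abstract can be pictured as a group-by over taxon, year and grid cell. The following is a minimal sketch, assuming coordinates are already projected to a metric grid system and using hypothetical field names (`taxon`, `year`, `x_m`, `y_m`, `uncertainty_m`). Spatial uncertainty is handled here only by recording the smallest value per cell, which is a simplification and may differ from the published method.

```python
import math
from collections import defaultdict


def grid_cell(x_m, y_m, cell_size_m=1000):
    """Integer (column, row) index of the grid cell containing a projected point."""
    return (math.floor(x_m / cell_size_m), math.floor(y_m / cell_size_m))


def build_cube(occurrences, cell_size_m=1000):
    """Aggregate occurrences into counts keyed by (taxon, year, grid cell)."""
    cube = defaultdict(lambda: {"n": 0, "min_uncertainty_m": None})
    for occ in occurrences:
        key = (occ["taxon"], occ["year"],
               grid_cell(occ["x_m"], occ["y_m"], cell_size_m))
        cell = cube[key]
        cell["n"] += 1
        u = occ["uncertainty_m"]
        # Keep the smallest coordinate uncertainty seen in this cell.
        if cell["min_uncertainty_m"] is None or u < cell["min_uncertainty_m"]:
            cell["min_uncertainty_m"] = u
    return dict(cube)


# Invented records, for illustration only.
records = [
    {"taxon": "Ambrosia artemisiifolia", "year": 2015,
     "x_m": 3924500.0, "y_m": 3098200.0, "uncertainty_m": 30.0},
    {"taxon": "Ambrosia artemisiifolia", "year": 2015,
     "x_m": 3924900.0, "y_m": 3098700.0, "uncertainty_m": 100.0},
    {"taxon": "Ambrosia artemisiifolia", "year": 2016,
     "x_m": 3925100.0, "y_m": 3098200.0, "uncertainty_m": 250.0},
]

cube = build_cube(records)
for key, value in sorted(cube.items()):
    print(key, value)
```

Changing `cell_size_m`, or replacing `year` with a coarser period, adjusts the aggregation level of the geographic and temporal dimensions respectively.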
DNA barcoding and taxonomy: dark taxa and dark texts
Both classical taxonomy and DNA barcoding are engaged in the task of digitizing the living world. Much of the taxonomic literature remains undigitized. The rise of open access publishing this century and the freeing of older literature from the shackles of copyright have greatly increased the online availability of taxonomic descriptions, but much of the literature of the mid- to late-twentieth century remains offline (‘dark texts’). DNA barcoding is generating a wealth of computable data that in many ways are much easier to work with than classical taxonomic descriptions, but many of the sequences are not identified to species level. These ‘dark taxa’ hamper the classical method of integrating biodiversity data, using shared taxonomic names. Voucher specimens are a potential common currency of both the taxonomic literature and sequence databases, and could be used to help link names, literature and sequences. An obstacle to this approach is the lack of stable, resolvable specimen identifiers. The paper concludes with an appeal for a global ‘digital dashboard’ to assess the extent to which biodiversity data are available online.
This article is part of the themed issue ‘From DNA barcodes to biomes’
Range expansion of Ambrosia artemisiifolia in Europe is promoted by climate change
Ambrosia artemisiifolia L., native to North America, is a problematic invasive species because of its highly allergenic pollen. The species is expected to expand its range due to climate change. By means of ecological niche modelling (ENM), we predict habitat suitability for A. artemisiifolia in Europe under current and future climatic conditions. Overall, we compared the performance and results of 16 algorithms commonly applied in ENM. As occurrence records of invasive species may be dominated by sampling bias, we also used data from the native range. To assess the quality of the modelling approaches we assembled a new map of current occurrences of A. artemisiifolia in Europe. Our results show that ENM yields a good estimation of the potential range of A. artemisiifolia in Europe only when using the North American data. A strong sampling bias in the European Global Biodiversity Information Facility (GBIF) data for A. artemisiifolia causes unrealistic results. Models using the North American data reflect the realized European distribution very well. All models predict an enlargement and a northwards shift of the potential range in Central and Northern Europe during the next decades. Climate warming will lead to an increase and northwards shift of A. artemisiifolia in Europe.
Insights on the role of forest cover and on the changes in forest cover on thirty-five endangered mammal species distributions
Changes in forest cover can determine the survival of endangered terrestrial mammal species in the wild. This study assessed the impacts of forest cover changes on the distributions of 35 endangered small- and large-bodied terrestrial mammal species at the global scale. To test the ‘natural and resource conditions’ hypothesis, three data sources were used: forest data obtained from time-series analyses of Landsat images between 2000 and 2014, species occurrence records collected by observation between 2000 and 2015 from the Global Biodiversity Information Facility, and species range data for the year 2015 from the International Union for Conservation of Nature (IUCN). The ‘natural and resource conditions’ hypothesis produced models with a prediction accuracy above 70 percent for 88 percent of the 35 species models. Changes in forest cover explained species occurrences in 10 percent of all species models. On average, 59 percent of species occurrence records overlapped with species range data. Fifty-one percent of all species had no occurrence records between 2000 and 2015. Species and forest data collection, as well as transnational cooperation for the conservation of species roaming in the wild in upland forested areas and in cross-border areas, may be critical for endangered mammal species conservation.
Community next steps for making globally unique identifiers work for biocollections data
Biodiversity data is being digitized and made available online at a rapidly increasing rate, but current practices typically do not preserve linkages between these data, which impedes interoperation, provenance tracking, and assembly of larger datasets. For data associated with biocollections, the biodiversity community has long recognized that an essential part of establishing and preserving linkages is to apply globally unique identifiers at the point when data are generated in the field and to persist these identifiers downstream, but this is seldom implemented in practice. There has neither been coalescence towards one single identifier solution (as in some other domains), nor even a set of recommended best practices and standards to support multiple identifier schemes sharing consistent responses. In order to further progress towards a broader community consensus, a group of biocollections and informatics experts assembled in Stockholm in October 2014 to discuss community next steps to overcome current roadblocks. The workshop participants divided into four groups focusing on: identifier practice in current field biocollections; identifier application for legacy biocollections; identifiers as applied to biodiversity data records as they are published and made available in semantically marked-up publications; and cross-cutting identifier solutions that bridge across these domains. The main outcome was consensus on key issues, including recognition of differences between legacy and new biocollections processes, the need for identifier metadata profiles that can report information on identifier persistence missions, and the unambiguous indication of the type of object associated with the identifier. Current identifier characteristics are also summarized, and an overview of available schemes and practices is provided.
Connecting species’ geographical distributions to environmental variables: range maps versus observed points of occurrence
Connecting the geographical occurrence of a species with underlying environmental variables is fundamental for many analyses of life history evolution and for modeling species distributions for both basic and practical ends. However, raw distributional information comes principally in two forms: points of occurrence (specific geographical coordinates where a species has been observed), and expert-prepared range maps. Each form has potential shortcomings: range maps tend to overestimate the true occurrence of a species, whereas occurrence points (because of their frequent non-random spatial distribution) tend to underestimate it. Whereas previous comparisons of the two forms have focused on how they may differ when estimating species richness, less attention has been paid to the extent to which the two forms actually differ in their representation of a species’ environmental associations. We assess such differences using the globally distributed avian order Galliformes (294 species). For each species we overlaid range maps obtained from IUCN and point-of-occurrence data obtained from GBIF on global maps of four climate variables and elevation. Over all species, the median difference in distribution centroids was 234 km, and median values of all five environmental variables were highly correlated, although there were a few species outliers for each variable. We also acquired species’ elevational distribution mid-points (mid-point between minimum and maximum elevational extent) from the literature; median elevations from point occurrences and ranges were consistently lower (median −420 m) than mid-points. We concluded that in most cases occurrence points were likely to produce better estimates of underlying environmental variables than range maps, although differences were often slight. We also concluded that elevational range mid-points were biased high, and that elevation distributions based on either points or range maps provided better estimates.
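The centroid comparison reported in this abstract can be illustrated with a short calculation. The sketch below is an assumption-laden example, not the authors' procedure: it computes a spherical centroid as the mean of 3D unit vectors and measures the great-circle (haversine) distance between two centroids. The range map is represented by a set of sample points rather than a true polygon, and all coordinates are invented.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def centroid(points):
    """Spherical centroid of (lat, lon) pairs in degrees, via the mean 3D unit vector."""
    x = y = z = 0.0
    for lat, lon in points:
        phi, lam = math.radians(lat), math.radians(lon)
        x += math.cos(phi) * math.cos(lam)
        y += math.cos(phi) * math.sin(lam)
        z += math.sin(phi)
    lat_c = math.atan2(z, math.hypot(x, y))
    lon_c = math.atan2(y, x)
    return math.degrees(lat_c), math.degrees(lon_c)


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    phi1, lam1 = map(math.radians, p)
    phi2, lam2 = map(math.radians, q)
    a = (math.sin((phi2 - phi1) / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin((lam2 - lam1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Invented coordinates, for illustration only.
occurrence_points = [(0.0, -1.0), (0.0, 1.0)]   # stands in for point records
range_samples = [(0.0, 0.0), (0.0, 2.0)]        # stands in for points sampled from a range map

c_points = centroid(occurrence_points)
c_range = centroid(range_samples)
print(round(haversine_km(c_points, c_range), 2))  # about 111.19 (one degree of longitude at the equator)
```

For real range maps an area-weighted polygon centroid would be more appropriate than averaging sample points; the point-based version is used here only to keep the example self-contained.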
