4,299 research outputs found

A model of provenance applied to biodiversity datasets

Authors: Amanqui Flor K; De Nies Tom; Dimou Anastasia; Mannens Erik; Moreira Dilvan; Van de Walle Rik; Verborgh Ruben
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2016

Nowadays, the Web has become one of the main sources of biodiversity information. An increasing number of biodiversity research institutions add new specimens and their related information to their biological collections and make this information available on the Web. However, the mechanisms currently available provide insufficient provenance of biodiversity information. In this paper, we propose a new biodiversity provenance model extending the W3C PROV Data Model. Biodiversity data is mapped to terms from relevant ontologies, such as Dublin Core and GeoSPARQL, stored in triple stores and queried using SPARQL endpoints. Additionally, we provide a use case using our provenance model to enrich collection data.

Crossref

Ghent University Academic Bibliography

Universidade de São Paulo

Searching Data: A Review of Observational Data Retrieval Practices in Selected Disciplines

Authors: Aloia N.; Beran B.; Borgman C.L.; Carlson J.; Fielding N.G.; Honor L.B.; Ingwersen P.; Maier D.; Meyer E.T.; Pasquetto I.V.; Zimmerman A.S.
Publication venue: Wiley
Publication date: 03/04/2019

A cross-disciplinary examination of the user behaviours involved in seeking and evaluating data is surprisingly absent from the research data discussion. This review explores the data retrieval literature to identify commonalities in how users search for and evaluate observational research data. Two analytical frameworks rooted in information retrieval and science technology studies are used to identify key similarities in practices as a first step toward developing a model describing data retrieval.

arXiv.org e-Print Archive

Maastricht University Research Portal

Crossref

International Migration, Integration and Social Cohesion online publications

UvA-DARE

Theory and Practice of Data Citation

Author: Silvello Gianmaria
Publication venue: Wiley
Publication date: 24/06/2017

Citations are the cornerstone of knowledge propagation and the primary means of assessing the quality of research, as well as directing investments in science. Science is increasingly becoming "data-intensive", where large volumes of data are collected and analyzed to discover complex patterns through simulations and experiments, and most scientific reference works have been replaced by online curated datasets. Yet, given a dataset, there is no quantitative, consistent and established way of knowing how it has been used over time, who contributed to its curation, what results have been yielded or what value it has. The development of a theory and practice of data citation is fundamental for considering data as first-class research objects with the same relevance and centrality of traditional scientific products. Many works in recent years have discussed data citation from different viewpoints: illustrating why data citation is needed, defining the principles and outlining recommendations for data citation systems, and providing computational methods for addressing specific issues of data citation. The current panorama is many-faceted and an overall view that brings together diverse aspects of this topic is still missing. Therefore, this paper aims to describe the lay of the land for data citation, both from the theoretical (the why and what) and the practical (the how) angle.

Comment: 24 pages, 2 tables, pre-print accepted in Journal of the Association for Information Science and Technology (JASIST), 201

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Università di Padova

Community next steps for making globally unique identifiers work for biocollections data

Authors: Agosti Donat; Catapano Terry; Cellinese Nico; Deck John; Guralnick Robert P.; Hagedorn Gregor; Kunze John; Page Roderic D.; Penev Lyubomir; Pyle Richard L.; Walls Ramona; Wieczorek John
Publication venue: Pensoft Publishers
Publication date: 01/01/2015

Biodiversity data is being digitized and made available online at a rapidly increasing rate but current practices typically do not preserve linkages between these data, which impedes interoperation, provenance tracking, and assembly of larger datasets. For data associated with biocollections, the biodiversity community has long recognized that an essential part of establishing and preserving linkages is to apply globally unique identifiers at the point when data are generated in the field and to persist these identifiers downstream, but this is seldom implemented in practice. There has neither been coalescence towards one single identifier solution (as in some other domains), nor even a set of recommended best practices and standards to support multiple identifier schemes sharing consistent responses. In order to further progress towards a broader community consensus, a group of biocollections and informatics experts assembled in Stockholm in October 2014 to discuss community next steps to overcome current roadblocks. The workshop participants divided into four groups focusing on: identifier practice in current field biocollections; identifier application for legacy biocollections; identifiers as applied to biodiversity data records as they are published and made available in semantically marked-up publications; and cross-cutting identifier solutions that bridge across these domains. The main outcome was consensus on key issues, including recognition of differences between legacy and new biocollections processes, the need for identifier metadata profiles that can report information on identifier persistence missions, and the unambiguous indication of the type of object associated with the identifier. Current identifier characteristics are also summarized, and an overview of available schemes and practices is provided.

Crossref

Biodiversity Heritage Library OAI Repository

ZENODO

Directory of Open Access Journals

PubMed Central

Enlighten

ARPHA OAI-PMH Endpoint

ARPHA Preprints

Enhancing Workflow with a Semantic Description of Scientific Intent

Authors: Edwards Peter; Gotts Nick; Pignotti Edoardo; Polhill Gary
Publication venue: Elsevier BV
Publication date: 10/05/2011

Peer reviewed; Preprint

Aberdeen University Research

Crossref

Predicting provenance of forensic soil samples: linking soil to ecological habitats by metabarcoding and supervised classification

Authors: Brunbjerg Ane Kirstine; Bruun Hans Henrik; Ejrnæs Rasmus; Fløjgaard Camilla; Frøslev Tobias Guldberg; Hansen Anders Johannes; Moeslund Jesper
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2019

Environmental DNA (eDNA) is increasingly applied in ecological studies, including studies with the primary purpose of criminal investigation, in which eDNA from soil can be used to pair samples or reveal sample provenance. We collected soil eDNA samples as part of a large national biodiversity research project across 130 sites in Denmark. We investigated the potential for soil eDNA metabarcoding in predicting provenance in terms of environmental conditions, habitat type and geographic regions. We used linear regression for predicting environmental gradients of light, soil moisture, pH and nutrient status (represented by Ellenberg Indicator Values, EIVs) and Quadratic Discriminant Analysis (QDA) to predict habitat type and geographic region. eDNA data performed relatively well as a predictor of environmental gradients (R² > 0.81). Its ability to discriminate between habitat types was variable, with high accuracy for certain forest types and low accuracy for heathland, which was poorly predicted. Geographic region was also less accurately predicted by eDNA. We demonstrated the application of provenance prediction in forensic science by evaluating and discussing two mock crime scenes. Here, we listed the plant species from annotated sequences, which can further aid in identifying the likely habitat or, in case of rare species, a geographic region. Predictions of environmental gradients and habitat types together give an overall accurate description of a crime scene, but care should be taken when interpreting annotated sequences, e.g. due to erroneous assignments in GenBank. Our approach demonstrates that important habitat properties can be derived from soil eDNA, and exemplifies a range of potential applications of eDNA in forensic ecology.
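
As a reading aid for the coefficient of determination quoted in this abstract, the following is a minimal sketch of how a linear fit and its coefficient of determination are typically computed. It is not the authors' pipeline: the single-predictor simplification, the function names `fit_line` and `r_squared`, and the sample values are all assumptions made purely for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def r_squared(xs, ys, a, b):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    my = mean(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up illustration values (hypothetical, NOT data from the study):
# one eDNA-derived predictor and an observed indicator value per site.
predictor = [0.1, 0.4, 0.5, 0.9, 1.2]
observed = [3.0, 4.1, 4.4, 6.2, 7.0]

a, b = fit_line(predictor, observed)
r2 = r_squared(predictor, observed, a, b)
print(f"intercept={a:.3f} slope={b:.3f} R^2={r2:.3f}")
```

An R² close to 1 means the fitted line accounts for most of the variance in the observed values. The study itself would have involved many predictors derived from sequence data, so this single-variable version only illustrates the metric.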

Directory of Open Access Journals

Copenhagen University Research Information System