On the persistence of supplementary resources in biomedical publications

Author: C Santos
GA Petsko
JD Wren
Nicholas R Anderson
Peter Tarczy-Hornoch
Roger E Bumgarner
SCFGEPDFGNFKKAGL Lawrence
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Providing for long-term and consistent public access to scientific data is a growing concern in biomedical research. One aspect of this problem can be demonstrated by evaluating the persistence of supplementary data associated with published biomedical papers. METHODS: We manually evaluated 655 supplementary data links extracted from PubMed abstracts published 1998–2005 (Method 1) as well as a further focused subset of 162 full-text manuscripts published within three representative high-impact biomedical journals between September and December 2004 (Method 2). RESULTS: For Method 1 we found that since 2001, only 71–92% of supplementary data were still accessible via the links provided, with 93% of the inaccessible links occurring where supplementary data were not stored with the publishing journal. Of the manuscripts evaluated in Method 2, we found that only 83% of the links were available approximately a year after publication, with 55% of the inaccessible links at locations outside the journal of publication. CONCLUSION: We conclude that if supplemental data are required to support the publication, journal policies must take on the responsibility to accept and store such data or require that they be maintained with a credible independent institution or under the terms of a strategic data storage plan specified by the authors. We further recommend that publishers provide automated systems to ensure that supplementary links remain persistent, and that granting bodies such as the NIH develop policies and funding mechanisms to maintain long-term persistent access to these data.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

The eDAL Suite: Tools and Concepts for Primary Data Citation

Author: Christian Colmsee
Jinbo Chen
Matthias Klapperstück
Matthias Lange
Steffen Flemming
Uwe Scholz
Publication date: 01/01/2010

Retrieval and citation of primary data are an important factor in the approaching age of “data science”. Digital data are easily shared, and just as easily wiped or lost. The problem of keeping online data accessible and retrievable is especially difficult for SMEs such as plant breeders and plant biotech companies, as well as for research projects in this domain. The intention of eDAL is to provide an information retrieval and data citation infrastructure that meets the requirements of the “data science” age and implements a reusable platform for data retrieval, data citation, and data publication. Like a shopping cart, the idea is to combine a search engine and a data cart, which retrieves, ranks, and collects query-relevant data from crop plant data centers.

Crossref

Nature Precedings

Biodiversity informatics: the challenge of linking data and the role of shared identifiers

Author: Altschul
Dellavalle
Martin
Moreau
Ouellette
Page
Patterson
R. D. M. Page
Saux
Smith
Stein
Zamors'ky
Publication date: 01/01/2008

A major challenge facing biodiversity informatics is integrating data stored in widely distributed databases. Initial efforts have relied on taxonomic names as the shared identifier linking records in different databases. However, taxonomic names have limitations as identifiers, being neither stable nor globally unique, and the pace of molecular taxonomic and phylogenetic research means that a lot of information in public sequence databases is not linked to formal taxonomic names. This review explores the use of other identifiers, such as specimen codes and GenBank accession numbers, to link otherwise disconnected facts in different databases. The structure of these links can also be exploited using the PageRank algorithm to rank the results of searches on biodiversity databases. The key to rich integration is a commitment to deploy and reuse globally unique, shared identifiers (such as DOIs and LSIDs), and the implementation of services that link those identifiers.

Crossref

Enlighten

Nature Precedings

Theory and Practice of Data Citation

Author: Silvello Gianmaria
Publication venue: Wiley
Publication date: 24/06/2017

Citations are the cornerstone of knowledge propagation and the primary means of assessing the quality of research, as well as directing investments in science. Science is increasingly becoming "data-intensive", where large volumes of data are collected and analyzed to discover complex patterns through simulations and experiments, and most scientific reference works have been replaced by online curated datasets. Yet, given a dataset, there is no quantitative, consistent and established way of knowing how it has been used over time, who contributed to its curation, what results have been yielded or what value it has. The development of a theory and practice of data citation is fundamental for considering data as first-class research objects with the same relevance and centrality of traditional scientific products. Many works in recent years have discussed data citation from different viewpoints: illustrating why data citation is needed, defining the principles and outlining recommendations for data citation systems, and providing computational methods for addressing specific issues of data citation. The current panorama is many-faceted and an overall view that brings together diverse aspects of this topic is still missing. Therefore, this paper aims to describe the lay of the land for data citation, both from the theoretical (the why and what) and the practical (the how) angle. Comment: 24 pages, 2 tables, pre-print accepted in Journal of the Association for Information Science and Technology (JASIST), 201

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Università di Padova

Horizontal Integration of Warfighter Intelligence Data: A Shared Semantic Resource for the Intelligence Community

Author: Fu Chia
Malyuta Tatiana
Mandrick William S.
Parent Kesny
Patel Milan
Smith Barry
Publication date: 01/01/2012

We describe a strategy that is being used for the horizontal integration of warfighter intelligence data within the framework of the US Army’s Distributed Common Ground System Standard Cloud (DSC) initiative. The strategy rests on the development of a set of ontologies that are being incrementally applied to bring about what we call the ‘semantic enhancement’ of data models used within each intelligence discipline. We show how the strategy can help to overcome familiar tendencies to stovepiping of intelligence data, and describe how it can be applied in an agile fashion to new data resources in ways that address immediate needs of intelligence analysts.

PhilPapers

CiteSeerX

Practices, Policies, and Persistence: A Study of Supplementary Materials in Crop Science Journals

Author: Williams Sarah C.
Publication venue: Informa UK Limited
Publication date: 01/01/2016

This study compared practices and policies of 24 crop science journals to selected NISO/NFAIS recommendations for online supplemental journal article materials. The studied recommendations include the display of supplementary materials, DOIs for supplementary materials, and clear preservation statements regarding supplementary materials. This study also investigated missing supplementary materials on 18 of the journal websites. The findings reveal some potential roles for librarians and libraries, especially those with institutional repositories, which could better facilitate long-term access, data citation, and data reuse.

Illinois Digital Environment for Access to Learning and Scholarship Repository

FigShare

Open science and modified funding lotteries can impede the natural selection of bad science.

Author: Contreras Kallens Pablo A
Smaldino Paul E
Turner Matthew A
Publication venue: eScholarship, University of California
Publication date: 01/07/2019

Assessing scientists using exploitable metrics can lead to the degradation of research methods even without any strategic behaviour on the part of individuals, via 'the natural selection of bad science.' Institutional incentives to maximize metrics like publication quantity and impact drive this dynamic. Removing these incentives is necessary, but institutional change is slow. However, recent developments suggest possible solutions with more rapid onsets. These include what we call open science improvements, which can reduce publication bias and improve the efficacy of peer review. In addition, there have been increasing calls for funders to move away from prestige- or innovation-based approaches in favour of lotteries. We investigated whether such changes are likely to improve the reproducibility of science even in the presence of persistent incentives for publication quantity through computational modelling. We found that modified lotteries, which allocate funding randomly among proposals that pass a threshold for methodological rigour, effectively reduce the rate of false discoveries, particularly when paired with open science improvements that increase the publication of negative results and improve the quality of peer review. In the absence of funding that targets rigour, open science improvements can still reduce false discoveries in the published literature but are less likely to improve the overall culture of research practices that underlie those publications.

eScholarship - University of California

Community next steps for making globally unique identifiers work for biocollections data

Author: Agosti Donat
Catapano Terry
Cellinese Nico
Deck John
Guralnick Robert P.
Hagedorn Gregor
Kunze John
Page Roderic D.
Penev Lyubomir
Pyle Richard L.
Walls Ramona
Wieczorek John
Publication venue: Pensoft Publishers
Publication date: 01/01/2015

Biodiversity data is being digitized and made available online at a rapidly increasing rate, but current practices typically do not preserve linkages between these data, which impedes interoperation, provenance tracking, and assembly of larger datasets. For data associated with biocollections, the biodiversity community has long recognized that an essential part of establishing and preserving linkages is to apply globally unique identifiers at the point when data are generated in the field and to persist these identifiers downstream, but this is seldom implemented in practice. There has neither been coalescence towards one single identifier solution (as in some other domains), nor even a set of recommended best practices and standards to support multiple identifier schemes sharing consistent responses. In order to further progress towards a broader community consensus, a group of biocollections and informatics experts assembled in Stockholm in October 2014 to discuss community next steps to overcome current roadblocks. The workshop participants divided into four groups focusing on: identifier practice in current field biocollections; identifier application for legacy biocollections; identifiers as applied to biodiversity data records as they are published and made available in semantically marked-up publications; and cross-cutting identifier solutions that bridge across these domains. The main outcome was consensus on key issues, including recognition of differences between legacy and new biocollections processes, the need for identifier metadata profiles that can report information on identifier persistence missions, and the unambiguous indication of the type of object associated with the identifier. Current identifier characteristics are also summarized, and an overview of available schemes and practices is provided.

Crossref

Biodiversity Heritage Library OAI Repository

ZENODO

Directory of Open Access Journals

PubMed Central

Enlighten

ARPHA OAI-PMH Endpoint

ARPHA Preprints