
    Big Opportunities in Access to Small Science Data

    A distributed infrastructure that would enable those who wish to do so to contribute their scientific or technical data to a universal digital commons could allow such data to be more readily preserved and accessible among disciplinary domains. Five critical issues that must be addressed in developing an efficient and effective data commons infrastructure are described. We conclude that creation of a distributed infrastructure meeting the critical criteria and deployable throughout the networked university library community is practically achievable.

    The data paper: a mechanism to incentivize data publishing in biodiversity science

    Background: Free and open access to primary biodiversity data is essential for informed decision-making to achieve conservation of biodiversity and sustainable development. However, primary biodiversity data are neither easily accessible nor discoverable. Among several impediments, one is a lack of incentives to data publishers for publishing of their data resources. One such mechanism currently lacking is recognition through conventional scholarly publication of enriched metadata, which should ensure rapid discovery of 'fit-for-use' biodiversity data resources. Discussion: We review the state of the art of data discovery options and the mechanisms in place for incentivizing data publishers' efforts towards easy, efficient and enhanced publishing, dissemination, sharing and re-use of biodiversity data. We propose the establishment of the 'biodiversity data paper' as one possible mechanism to offer scholarly recognition for efforts and investment by data publishers in authoring rich metadata and publishing them as citable academic papers. While detailing the benefits to data publishers, we describe the objectives, work flow and outcomes of the pilot project commissioned by the Global Biodiversity Information Facility in collaboration with scholarly publishers and pioneered by Pensoft Publishers through its journals Zookeys, PhytoKeys, MycoKeys, BioRisk, NeoBiota, Nature Conservation and the forthcoming Biodiversity Data Journal. We then debate further enhancements of the data paper beyond the pilot project and attempt to forecast the future uptake of data papers as an incentivization mechanism by the stakeholder communities. Conclusions: We believe that in addition to recognition for those involved in the data publishing enterprise, data papers will also expedite publishing of fit-for-use biodiversity data resources. However, uptake and establishment of the data paper as a potential mechanism of scholarly recognition requires a high degree of commitment and investment by the cross-sectional stakeholder communities.

    Towards a data publishing framework for primary biodiversity data: challenges and potentials for the biodiversity informatics community

    Background: Currently primary scientific data, especially that dealing with biodiversity, is neither easily discoverable nor accessible. Amongst several impediments, one is a lack of professional recognition of scientific data publishing efforts. A possible solution is establishment of a ‘Data Publishing Framework’ which would encourage and recognise investments and efforts by institutions and individuals towards management, and publishing of primary scientific data potentially on a par with recognitions received for scholarly publications. Discussion: This paper reviews the state-of-the-art of primary biodiversity data publishing, and conceptualises a ‘Data Publishing Framework’ that would help incentivise efforts and investments by institutions and individuals in facilitating free and open access to biodiversity data. It further postulates the institutionalisation of a ‘Data Usage Index (DUI)’, that would attribute due recognition to multiple players in the data collection/creation, management and publishing cycle. Conclusion: We believe that institutionalisation of such a ‘Data Publishing Framework’ that offers a socio-cultural, legal, technical, economic and policy environment conducive to data publishing will facilitate expedited discovery and mobilisation of an exponential increase in quantity of ‘fit-for-use’ primary biodiversity data, much of which is currently invisible.

    On the Communication of Scientific Results: The Full-Metadata Format

    In this paper, we introduce a scientific format for text-based data files, which facilitates storing and communicating tabular data sets. The so-called Full-Metadata Format builds on the widely used INI-standard and is based on four principles: readable self-documentation, flexible structure, fail-safe compatibility, and searchability. As a consequence, all metadata required to interpret the tabular data are stored in the same file, allowing for the automated generation of publication-ready tables and graphs and the semantic searchability of data file collections. The Full-Metadata Format is introduced on the basis of three comprehensive examples. The complete format and syntax is given in the appendix.

    Publishing and pushing: Mixing models for communicating research data

    We present a case study of data integration and reuse involving 12 researchers who published datasets in Open Context, an online data publishing platform, as part of collaborative archaeological research on early domesticated animals in Anatolia. Our discussion reports on how different editorial and collaborative review processes improved data documentation and quality, and created ontology annotations needed for comparative analyses by domain specialists. To prepare data for shared analysis, this project adapted editor-supervised review and revision processes familiar to conventional publishing, as well as more novel models of revision adapted from open source software development of public version control. Preparing the datasets for publication and analysis required significant investment of effort and expertise, including archaeological domain knowledge and familiarity with key ontologies. To organize this work effectively, we emphasized these different models of collaboration at various stages of this data publication and analysis project. Collaboration first centered on data editors working with data contributors, then widened to include other researchers who provided additional peer-review feedback, and finally the widest research community, whose collaboration is facilitated by GitHub’s version control system. We demonstrate that the “publish” and “push” models of data dissemination need not be mutually exclusive; on the contrary, they can play complementary roles in sharing high quality data in support of research. This work highlights the value of combining multiple models in different stages of data dissemination.

    Publishing and Pushing: Mixing Models for Communicating Research Data in Archaeology

    We present a case study of data integration and reuse involving 12 researchers who published datasets in Open Context, an online data publishing platform, as part of collaborative archaeological research on early domesticated animals in Anatolia. Our discussion reports on how different editorial and collaborative review processes improved data documentation and quality, and created ontology annotations needed for comparative analyses by domain specialists. To prepare data for shared analysis, this project adapted editor-supervised review and revision processes familiar to conventional publishing, as well as more novel models of revision adapted from open source software development of public version control. Preparing the datasets for publication and analysis required significant investment of effort and expertise, including archaeological domain knowledge and familiarity with key ontologies. To organize this work effectively, we emphasized these different models of collaboration at various stages of this data publication and analysis project. Collaboration first centered on data editors working with data contributors, then widened to include other researchers who provided additional peer-review feedback, and finally the widest research community, whose collaboration is facilitated by GitHub’s version control system. We demonstrate that the “publish” and “push” models of data dissemination need not be mutually exclusive; on the contrary, they can play complementary roles in sharing high quality data in support of research. This work highlights the value of combining multiple models in different stages of data dissemination.

    Beyond Big or Little Science: Understanding Data Lifecycles in Astronomy and the Deep Subseafloor Biosphere

    For decades, the big science and little science dichotomy has served as a starting point for many analyses of scientific research and data practices, including studies used to inform the construction of scientific knowledge infrastructures. We challenge this dichotomy by presenting findings from longitudinal, qualitative case studies of data lifecycles in two scientific domains, each centered around a large, distributed scientific collaboration. One is astronomy and the Sloan Digital Sky Survey (SDSS). The other is the deep subseafloor biosphere and the Center for Dark Energy Biosphere Investigations (C-DEBI). We show that some critical stages of the data lifecycle in each domain unfold in big science contexts while other critical stages occur in little science contexts. Furthermore, these big and little science contexts shape each other dynamically. This challenging of the big and little science dichotomy has implications for the building of scientific knowledge infrastructures, including those supporting data management.

    Toward a Commons of Geographic Data

    Making scientific data openly accessible and available for re-use is desirable to encourage validation of research results, and/or economic development. A significant body of spatially-referenced, locally-produced data produced by individual researchers, non-profit groups, private associations, small companies, universities, and non-governmental organizations across the United States is not online and therefore not generally available to professional scientists and to the general public. If there were an online environment, a Commons of Geographic Data, where that data could be deposited or registered, and where users could access and re-use it, what infrastructure characteristics might potential contributors find desirable in order for them to be willing to contribute their data without monetary compensation; and what infrastructure characteristics might potential users find desirable in order for them to be willing to access, investigate, and use such contributed data? Based on data preservation literature, this study hypothesized three such potential characteristics as desirable. Using a combination of qualitative and quantitative methods, this study examined the desirability of these infrastructure capabilities in a non-statistical sample of potential contributors and potential users. The results of both the qualitative and quantitative research support the hypothesis. The results can provide guidance for those who may wish to design such a commons environment for locally-generated, spatially-referenced data in the future, and may also be of use to those that operate repositories of other types of data.

    Scientific Digital Data Repositories: Needs and Challenges for Cancer Researchers

    The purpose of this study is to understand the varied data needs of molecular level cancer researchers who use light, fluorescent, and electron microscopy to obtain knowledge about cancer on a molecular level. It explores what data tools a sample of researchers are currently using to preserve their data for future access, and the needs of these researchers for depositing their digital research data into digital repositories. Data from the researchers suggest that they understand the need to preserve their raw and compiled data in places outside their laboratory, but they have not fully embraced the idea of depositing it in a repository. This seems most likely due to them not fully understanding what repositories are and what they provide. To increase the use of repositories by this research community, repositories need to promote themselves better and to offer additional services that are specific for the needs of this community. Master of Science in Library Science