104 research outputs found

    The Scientific Drilling Database (SDDB) - Data from Deep Earth Monitoring and Sounding

    Projects in the International Continental Scientific Drilling Program (ICDP) produce large amounts of data. Since the start of ICDP, data sharing has played an important part in ICDP projects, and the ICDP Operational Support Group, which provides the data-capturing infrastructure for many ICDP projects, has facilitated the dissemination of data within project groups. However, unless published in journal papers or books, the data themselves were in most cases not available outside of the respective projects (see Conze et al. 2007, p. 32, this issue). With the online Scientific Drilling Database (SDDB; http://www.scientificdrilling.org), ICDP and the GeoForschungsZentrum Potsdam (GFZ), Germany, have created a platform for the public dissemination of drilling data.

    From Ions to Bits – Managing Data in a National Research Centre

    Managing data in active research projects is a challenging task. The innovative nature of research requires a flexible data infrastructure that is able to adapt to ad hoc changes. How can this be reconciled with the necessity to streamline infrastructure services in order to keep costs at a sustainable level? What must data management services look like to integrate well into the everyday work of a researcher? In the past, the focus of attention has been on large-volume research data. However, most research data are small and complex, and already highly enriched with contextual information. Managing this “long tail” of research data is labour-intensive and requires new strategies and technological solutions to allow sustainable operation. Eventually, the results of a project are published in the literature and should be accompanied by data publications. The data, now part of the record of science, have to be citeable and have to be curated for a long period of time. Data publication and long-term preservation call for new services and for cooperation between infrastructure providers (computing centre) and memory institutions (library). This talk investigates the challenges and solutions for managing research data, taking research at GFZ as an example.

    Assembly and concept of a web-based GIS within the paleolimnological project CONTINENT (Lake Baikal, Russia)

    Web-based Geographical Information Systems (GIS) are excellent tools within interdisciplinary and multi-national geoscience projects for exchanging and visualizing project data. The web-based GIS presented in this paper was designed for the paleolimnological project 'High-resolution CONTINENTal paleoclimate record in Lake Baikal' (CONTINENT) (Lake Baikal, Siberia, Russia) to allow the interactive handling of spatial data. The GIS database combines project data (core positions, sample positions, thematic maps) with auxiliary spatial data sets downloaded from freely available sources on the World Wide Web. The reliability of the external data was evaluated, and suitable new spatial datasets were processed according to the scientific questions of the project. GIS analysis of the data was used to assist studies on sediment provenance in Lake Baikal and to help answer questions such as whether the visualization of present-day vegetation distribution and pollen distribution supports the conclusions derived from palynological analyses. The refined geodata are returned to the scientific community through online data publication portals. Data were made citeable by assigning persistent identifiers (DOIs) and were published through the German National Library of Science and Technology (TIB Hannover, Hannover, Germany).

    Langzeitarchivierung von Forschungsdaten: eine Bestandsaufnahme [Long-Term Preservation of Research Data: A Survey]

    The relevance of research data today and for the future is well documented and discussed, in Germany as well as internationally. Ensuring that research data are accessible, sharable, and re-usable over time is increasingly becoming an essential task for researchers and research infrastructure institutions. Some reasons for this development include the following:
    - research data are documented and could therefore be validated
    - research data could be the basis for new research questions
    - research data could be re-analyzed using innovative digital methods
    - research data could be used by other disciplines
    Therefore, it is essential that research data are curated, which means they are kept accessible and interpretable over time. In Germany, a baseline study analyzing the situation in eleven research disciplines was undertaken in 2012. The results were then published in a German-language edition. To address an international audience, the German-language edition of the study has been translated and abridged.

    Updating the Data Curation Continuum

    The Data Curation Continuum was developed as a way of thinking about data repository infrastructure. Since its original development over a decade ago, a number of things have changed in the data infrastructure domain. This paper revisits the thinking behind the original Data Curation Continuum and updates it to respond to changes in research objects, storage models, and the repository landscape in general.

    Versioning data is about more than revisions: A conceptual framework and proposed principles

    A dataset, small or big, is often changed to correct errors, apply new algorithms, or add new data (e.g., as part of a time series). In addition, datasets might be bundled into collections, distributed in different encodings, or mirrored onto different platforms. All these differences between versions of datasets need to be understood by researchers who want to cite the exact version of the dataset that was used to underpin their research. Failing to do so reduces the reproducibility of research results. Ambiguous identification of datasets also impacts researchers and data centres, who are unable to gain recognition and credit for their contributions to the collection, creation, curation, and publication of individual datasets. Although the means to identify datasets using persistent identifiers have been in place for more than a decade, systematic data versioning practices are currently not available. In this work, we analysed 39 use cases and current practices of data versioning across 33 organisations. We noticed that the term ‘version’ was used in a very general sense, extending beyond the more common understanding of ‘version’ as referring primarily to revisions and replacements. Using concepts developed in software versioning and the Functional Requirements for Bibliographic Records (FRBR) as a conceptual framework, we developed six foundational principles for versioning of datasets: Revision, Release, Granularity, Manifestation, Provenance, and Citation. These six principles provide a high-level framework for guiding the consistent practice of data versioning and can also serve as guidance for data centres or data providers when setting up their own data revision and version protocols and procedures.
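
    Purely as an illustration (not drawn from the paper itself), the six principles could be captured in a minimal version-metadata record; the class, field names, and the example identifier below are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetVersion:
    """Hypothetical record reflecting the six versioning principles."""
    dataset_id: str              # persistent identifier of the dataset as a whole
    release: str                 # Release: the public label, e.g. "1.0"
    revision: int                # Revision: internal change counter within a release
    granularity: str             # Granularity: "collection", "dataset", or "file"
    manifestation: str           # Manifestation: encoding or platform of this copy
    derived_from: Optional[str]  # Provenance: identifier of the predecessor version, if any
    citation: str                # Citation: recommended citation for this exact version


# Illustrative use only; the identifier and citation text are invented.
v1 = DatasetVersion(
    dataset_id="doi:10.0000/example-dataset",
    release="1.0",
    revision=3,
    granularity="dataset",
    manifestation="CSV, UTF-8",
    derived_from=None,
    citation="Example Team (2020): Example dataset, version 1.0.",
)
```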

    Distributed Persistent Identifiers System Design

    The need to identify both digital and physical objects is ubiquitous in our society. Past and present persistent identifier (PID) systems, of which there is a great variety in terms of technical and social implementation, have evolved with the advent of the Internet, which has allowed for globally unique and globally resolvable identifiers. PID systems have, by and large, catered for identifier uniqueness, integrity, and persistence, regardless of the identifier's application domain. The trustworthiness of these systems has been measured by the criteria first defined by Bütikofer (2009) and further elaborated by Golodoniuc et al. (2016) and Car et al. (2017). Since many PID systems have been largely conceived and developed by a single organisation, they have faced challenges for widespread adoption and, most importantly, for the ability to survive changes of technology. We believe that one cause of once-successful PID systems fading away is the centralisation of their support infrastructure, both organisational and in computing and data storage systems. In this paper, we propose a PID system design that implements the pillars of a trustworthy system: ensuring identifiers' independence of any particular technology or organisation, implementation of core PID system functions, separation from data delivery, and enabling the system to adapt to future change. We propose decentralisation at all levels (persistent identifier and information object registration, resolution, and data delivery) using Distributed Hash Tables and traditional peer-to-peer networks with information replication and caching mechanisms, thus eliminating the need for a central PID data store. This increases overall system fault tolerance, thereby ensuring its trustworthiness. We also discuss important aspects of the distributed system's governance, such as the notion of the authoritative source and data integrity.
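
    The following is a minimal sketch of the decentralised resolution idea described above, assuming a DHT-like key-value overlay with replication; the class, node names, and identifier are illustrative stand-ins, not the authors' implementation.

```python
import hashlib
from typing import Dict, List


class DistributedPidResolver:
    """Toy model of decentralised PID registration and resolution.

    Each node holds a slice of the identifier space; every PID record is
    replicated to several nodes, so no single store is required for
    resolution (a stand-in for a real DHT overlay).
    """

    def __init__(self, node_ids: List[str], replicas: int = 3):
        self.replicas = replicas
        self.nodes: Dict[str, Dict[str, str]] = {n: {} for n in node_ids}

    def _closest_nodes(self, pid: str) -> List[str]:
        # Order nodes by XOR distance between hashed PID and hashed node id.
        key = int(hashlib.sha256(pid.encode()).hexdigest(), 16)
        return sorted(
            self.nodes,
            key=lambda n: key ^ int(hashlib.sha256(n.encode()).hexdigest(), 16),
        )

    def register(self, pid: str, resolution_url: str) -> None:
        # Store the PID record on the k closest nodes (replication).
        for node in self._closest_nodes(pid)[: self.replicas]:
            self.nodes[node][pid] = resolution_url

    def resolve(self, pid: str) -> str:
        # Any surviving replica can answer, so the loss of a node is tolerated.
        for node in self._closest_nodes(pid):
            if pid in self.nodes[node]:
                return self.nodes[node][pid]
        raise KeyError(pid)


# Illustrative usage with invented node names and identifier.
resolver = DistributedPidResolver(["node-a", "node-b", "node-c", "node-d"])
resolver.register("ex:10.0000/sample", "https://example.org/landing/sample")
print(resolver.resolve("ex:10.0000/sample"))
```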

    Making Research Data Repositories Visible: The re3data.org Registry

    Researchers require infrastructures that ensure a maximum of accessibility, stability, and reliability to facilitate working with and sharing research data. Such infrastructures are increasingly summarized under the term Research Data Repositories (RDR). The project re3data.org (Registry of Research Data Repositories) began indexing research data repositories in 2012 and offers researchers, funding organizations, libraries, and publishers an overview of the heterogeneous research data repository landscape. As of July 2013, re3data.org lists 400 research data repositories, and counting; 288 of these are described in detail using the re3data.org vocabulary. Information icons help researchers to easily identify an adequate repository for the storage and reuse of their data. This article describes the heterogeneous RDR landscape and presents a typology of institutional, disciplinary, multidisciplinary, and project-specific RDR. Furthermore, the article outlines the features of re3data.org and shows how this registry helps to identify appropriate repositories for the storage and search of research data.
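
    As a hypothetical illustration of how such a typology might be used programmatically (this is not the re3data.org service or its vocabulary), a locally held repository list could be filtered by repository type and subject; all names and entries below are invented.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Repository:
    name: str
    repo_type: str       # "institutional", "disciplinary", "multidisciplinary", or "project-specific"
    subjects: List[str]  # broad subject areas covered by the repository


# Invented sample entries; real metadata would come from the registry itself.
repositories = [
    Repository("Example Earth-Science Archive", "disciplinary", ["geosciences"]),
    Repository("Example University Repository", "institutional", ["multidisciplinary"]),
]


def find_repositories(repos: List[Repository], repo_type: str, subject: str) -> List[Repository]:
    """Return repositories matching the requested type and subject."""
    return [r for r in repos if r.repo_type == repo_type and subject in r.subjects]


print([r.name for r in find_repositories(repositories, "disciplinary", "geosciences")])
```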
