2,252 research outputs found

    Chemical information matters: an e-Research perspective on information and data sharing in the chemical sciences

    No full text
    Recently, a number of organisations have called for open access to scientific information and especially to the data obtained from publicly funded research, among which the Royal Society report and the European Commission press release are particularly notable. It has long been accepted that building research on the foundations laid by other scientists is both effective and efficient. Regrettably, some disciplines, chemistry being one, have been slow to recognise the value of sharing and have thus been reluctant to curate their data and information in preparation for exchanging it. The very significant increases in both the volume and the complexity of the datasets produced has encouraged the expansion of e-Research, and stimulated the development of methodologies for managing, organising, and analysing "big data". We review the evolution of cheminformatics, the amalgam of chemistry, computer science, and information technology, and assess the wider e-Science and e-Research perspective. Chemical information does matter, as do matters of communicating data and collaborating with data. For chemistry, unique identifiers, structure representations, and property descriptors are essential to the activities of sharing and exchange. Open science entails the sharing of more than mere facts: for example, the publication of negative outcomes can facilitate better understanding of which synthetic routes to choose, an aspiration of the Dial-a-Molecule Grand Challenge. The protagonists of open notebook science go even further and exchange their thoughts and plans. We consider the concepts of preservation, curation, provenance, discovery, and access in the context of the research lifecycle, and then focus on the role of metadata, particularly the ontologies on which the emerging chemical Semantic Web will depend. Among our conclusions, we present our choice of the "grand challenges" for the preservation and sharing of chemical information

    Variables As Currency: Linking Meta-Analysis Research and Data Paths in Sciences

    Get PDF
    Meta-analyses are studies that bring together data or results from multiple independent studies to produce new and over-arching findings. Current data curation systems only partially support meta-analytic research. Some important meta-analytic tasks, such as the selection of relevant studies for review and the integration of research datasets or findings, are not well supported in current data curation systems. To design tools and services that more fully support meta-analyses, we need a better understanding of meta-analytic research. This includes an understanding of both the practices of researchers who perform the analyses and the characteristics of the individual studies that are brought together. In this study, we make an initial contribution to filling this gap by developing a conceptual framework linking meta-analyses with data paths represented in published articles selected for the analysis. The framework focuses on key variables that represent primary/secondary datasets or derived socio-ecological data, contexts of use, and the data transformations that are applied. We introduce the notion of using variables and their relevant information (e.g., metadata and variable relationships) as a type of currency to facilitate synthesis of findings across individual studies and leverage larger bodies of relevant source data produced in small science research. Handling variables in this manner provides an equalizing factor between data from otherwise disparate data-producing communities. We conclude with implications for exploring data integration and synthesis issues as well as system development

    Research Data Management Practices And Impacts on Long-term Data Sustainability: An Institutional Exploration

    Get PDF
    With the \u27data deluge\u27 leading to an institutionalized research environment for data management, U.S. academic faculty have increasingly faced pressure to deposit research data into open online data repositories, which, in turn, is engendering a new set of practices to adapt formal mandates to local circumstances. When these practices involve reorganizing workflows to align the goals of local and institutional stakeholders, we might call them \u27data articulations.\u27 This dissertation uses interviews to establish a grounded understanding of the data articulations behind deposit in 3 studies: (1) a phenomenological study of genomics faculty data management practices; (2) a grounded theory study developing a theory of data deposit as articulation work in genomics; and (3) a comparative case study of genomics and social science researchers to identify factors associated with the institutionalization of research data management (RDM). The findings of this research offer an in-depth understanding of the data management and deposit practices of academic research faculty, and surfaced institutional factors associated with data deposit. Additionally, the studies led to a theoretical framework of data deposit to open research data repositories. The empirical insights into the impacts of institutionalization of RDM and data deposit on long-term data sustainability update our knowledge of the impacts of increasing guidelines for RDM. The work also contributes to the body of data management literature through the development of the data articulation framework which can be applied and further validated by future work. In terms of practice, the studies offer recommendations for data policymakers, data repositories, and researchers on defining strategies and initiatives to leverage data reuse and employ computational approaches to support data management and deposit

    6th Data Science Symposium Abstracts

    Get PDF
    The Data Science Symposium at Haus der Wissenschaft on 8/9 November 2021 in Bremen was the 6th Symposium in this series since 2017

    Scaling by Optimising: Modularisation of Data Curation Services in Growing Organisations

    Get PDF
    After a century of theorising and applying management practices, we are in the middle of entering a new stage in management science: digital management. The management of digital data submerges in traditional functions of management and, at the same time, continues to recreate viable solutions and conceptualisations in its established fields, e.g. research data management. Yet, one can observe bilateral synergies and mutual enrichment of traditional and data management practices in all fields. The paper at hand addresses a case in point, in which new and old management practices amalgamate to meet a steadily, in part characterised by leaps and bounds, increasing demand of data curation services in academic institutions. The idea of modularisation, as known from software engineering, is applied to data curation workflows so that economies of scale and scope can be used. While scaling refers to both management science and data science, optimising is understood in the traditional managerial sense, that is, with respect to the cost function. By means of a situation analysis describing how data curation services were applied from one department to the entire institution and an analysis of the factors of influence, a method of modularisation is outlined that converges to an optimal state of curation workflows

    PREDON Scientific Data Preservation 2014

    Get PDF
    LPSC14037Scientific data collected with modern sensors or dedicated detectors exceed very often the perimeter of the initial scientific design. These data are obtained more and more frequently with large material and human efforts. A large class of scientific experiments are in fact unique because of their large scale, with very small chances to be repeated and to superseded by new experiments in the same domain: for instance high energy physics and astrophysics experiments involve multi-annual developments and a simple duplication of efforts in order to reproduce old data is simply not affordable. Other scientific experiments are in fact unique by nature: earth science, medical sciences etc. since the collected data is "time-stamped" and thereby non-reproducible by new experiments or observations. In addition, scientific data collection increased dramatically in the recent years, participating to the so-called "data deluge" and inviting for common reflection in the context of "big data" investigations. The new knowledge obtained using these data should be preserved long term such that the access and the re-use are made possible and lead to an enhancement of the initial investment. Data observatories, based on open access policies and coupled with multi-disciplinary techniques for indexing and mining may lead to truly new paradigms in science. It is therefore of outmost importance to pursue a coherent and vigorous approach to preserve the scientific data at long term. The preservation remains nevertheless a challenge due to the complexity of the data structure, the fragility of the custom-made software environments as well as the lack of rigorous approaches in workflows and algorithms. To address this challenge, the PREDON project has been initiated in France in 2012 within the MASTODONS program: a Big Data scientific challenge, initiated and supported by the Interdisciplinary Mission of the National Centre for Scientific Research (CNRS). PREDON is a study group formed by researchers from different disciplines and institutes. Several meetings and workshops lead to a rich exchange in ideas, paradigms and methods. The present document includes contributions of the participants to the PREDON Study Group, as well as invited papers, related to the scientific case, methodology and technology. This document should be read as a "facts finding" resource pointing to a concrete and significant scientific interest for long term research data preservation, as well as to cutting edge methods and technologies to achieve this goal. A sustained, coherent and long term action in the area of scientific data preservation would be highly beneficial
    • …
    corecore