23 research outputs found

    Committing to Data Quality Review

    Amid the pressure and enthusiasm for researchers to share data, a rapidly growing number of tools and services have emerged. What do we know about the quality of these data? Why does quality matter? And who should be responsible for data quality? We believe an essential measure of data quality is the ability to engage in informed reuse, which requires that data are independently understandable. In practice, this means that data must undergo quality review, a process whereby data and associated files are assessed and required actions are taken to ensure files are independently understandable for informed reuse. This paper explains what we mean by data quality review, what measures can be applied to it, and how it is practiced in three domain-specific archives. We explore a selection of other data repositories in the research data ecosystem, as well as the roles of researchers, academic libraries, and scholarly journals in regard to their application of data quality measures in practice. We end with thoughts about the need to commit to data quality and who might be able to take on those tasks.
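
    As a rough illustration of the kind of file-level checks a quality review step might automate, the sketch below verifies that a deposited package contains documentation and that its tabular files have usable headers. The directory layout, required file names, and checks are hypothetical assumptions, not the actual review criteria of the archives discussed in the paper.

```python
from pathlib import Path
import csv

# Hypothetical review checklist; real archives apply richer, domain-specific criteria.
REQUIRED_DOCS = {"README.txt", "codebook.pdf"}

def review_package(package_dir: str) -> list[str]:
    """Return human-readable issues found in a deposited data package."""
    root = Path(package_dir)
    if not root.is_dir():
        return [f"package directory not found: {package_dir}"]
    issues = []

    # Documentation is what makes the data independently understandable.
    present = {p.name for p in root.iterdir() if p.is_file()}
    for doc in sorted(REQUIRED_DOCS - present):
        issues.append(f"missing documentation file: {doc}")

    # Every CSV file should parse and have non-empty, unique column headers.
    for csv_path in sorted(root.glob("*.csv")):
        with csv_path.open(newline="", encoding="utf-8") as fh:
            header = next(csv.reader(fh), [])
        if not header:
            issues.append(f"{csv_path.name}: no header row")
        elif len(set(header)) != len(header) or any(not h.strip() for h in header):
            issues.append(f"{csv_path.name}: empty or duplicate column names")

    return issues

if __name__ == "__main__":
    for problem in review_package("deposit_0421"):  # hypothetical package directory
        print("ACTION REQUIRED:", problem)
```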

    Preparing to Share Social Science Data: An Open Source, DDI-based Curation System

    Objective: This poster will describe the development of a curatorial system to support a repository for research data from randomized controlled trials in the social sciences. Description: The Institution for Social and Policy Studies (ISPS) at Yale University and Innovations for Poverty Action (IPA) are partnering with Colectica to develop a software platform that structures the curation workflow, including checking data for confidentiality and completeness, creating preservation formats, and reviewing and verifying code. The software leverages DDI Lifecycle – the standard for data documentation – and will enable a seamless framework for collecting, processing, archiving, and publishing data. This data curation software system combines several off-the-shelf components with a new, open source, Web application that integrates the existing components to create a flexible data pipeline. The software will help automate parts of the data pipeline and will unify the workflow for staff, and potentially for researchers. Default components include Fedora Commons, Colectica Repository, and Drupal, but the software is developed so each of these can be swapped for alternatives. Results: The software is designed to integrate into any repository workflow, and can also be incorporated earlier in the research workflow, ensuring eventual data and code deposits are of the highest quality. Conclusions: This poster will describe the requirements for the new curatorial workflow tool, the components of the system, how tasks are launched and tracked, and the benefits of building an integrated curatorial system for data, documentation, and code.
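
    One way to picture "off-the-shelf components ... can be swapped for alternatives" is a pipeline written against small interfaces. The sketch below is a hypothetical Python outline of that idea only; the actual system is a DDI-based Web application built around Colectica Repository, Fedora Commons, and Drupal, and none of the class or method names here are taken from it.

```python
from typing import Protocol

class Repository(Protocol):
    """Preservation store; Fedora Commons is one possible back end."""
    def store(self, name: str, content: bytes) -> str: ...

class Catalog(Protocol):
    """Public catalogue; Drupal is one possible front end."""
    def publish(self, record_id: str, metadata: dict) -> None: ...

class InMemoryRepository:
    """Trivial stand-in used only for this demonstration."""
    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def store(self, name: str, content: bytes) -> str:
        self._objects[name] = content
        return f"urn:demo:{name}"

class PrintCatalog:
    def publish(self, record_id: str, metadata: dict) -> None:
        print("published", record_id, metadata)

class CurationPipeline:
    """Runs the curation steps in order, independent of the chosen back ends."""
    def __init__(self, repository: Repository, catalog: Catalog) -> None:
        self.repository = repository
        self.catalog = catalog

    def run(self, name: str, content: bytes, metadata: dict) -> None:
        # Placeholder standing in for the confidentiality/completeness review step.
        if b"ssn" in content.lower():
            raise ValueError("possible direct identifier found; withhold from publication")
        record_id = self.repository.store(name, content)
        self.catalog.publish(record_id, metadata)

pipeline = CurationPipeline(InMemoryRepository(), PrintCatalog())
pipeline.run("trial_data.csv", b"id,arm,outcome\n1,treatment,0.4\n", {"title": "Demo deposit"})
```

    Because the pipeline depends only on the two small interfaces, either component can be replaced without touching the curation logic, which is the design property the poster describes.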

    Red flags in data: Learning from failed data reuse experiences

    This study examined data reusers' failed or unsuccessful experiences to understand what constitutes failure in data reuse. Learning from failed experiences is necessary to understand why the failure occurred and to prevent it, or to convert failure into success. This study offers an alternative view of data reuse practices and provides insights for facilitating data reuse processes by eliminating the core components of failure. Based on interviews with 23 quantitative social science data reusers who had failed data reuse experiences, the findings suggest: (a) ease of reuse, particularly access and interoperability, is an important initial condition for a successful data reuse experience; (b) understanding data through documentation is less likely to cause failure, at least for experienced researchers, although the process can still be challenging; and (c) the major component of a failed experience is a lack of support in reusing data, which underscores the need to develop a support system for data reusers.

    YARD: A Tool for Curating Research Outputs

    Repositories increasingly accept research outputs and associated artifacts that underlie reported findings, leading to potential changes in the demand for data curation and repository services. This paper describes a curation tool that responds to this challenge by economizing and optimizing curation efforts. The curation tool is implemented at Yale University’s Institution for Social and Policy Studies (ISPS) as YARD. By standardizing the curation workflow, YARD helps create high quality data packages that are findable, accessible, interoperable, and reusable (FAIR) and promotes research transparency by connecting the activities of researchers, curators, and publishers through a single pipeline.
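
    The "single pipeline" connecting researchers, curators, and publishers can be pictured as a deposit that moves through explicit workflow states, with nothing released before curation. The sketch below is a minimal, hypothetical state machine; YARD's actual stages, names, and transition rules are not taken from the paper.

```python
from enum import Enum, auto

class Stage(Enum):
    """Hypothetical deposit stages; the real workflow is likely finer-grained."""
    SUBMITTED = auto()
    UNDER_CURATION = auto()
    RELEASED = auto()

# Allowed transitions: a deposit cannot be released without passing through curation.
TRANSITIONS = {
    Stage.SUBMITTED: {Stage.UNDER_CURATION},
    Stage.UNDER_CURATION: {Stage.RELEASED, Stage.SUBMITTED},  # may be returned to the researcher
    Stage.RELEASED: set(),
}

class Deposit:
    def __init__(self, title: str) -> None:
        self.title = title
        self.stage = Stage.SUBMITTED

    def advance(self, target: Stage) -> None:
        if target not in TRANSITIONS[self.stage]:
            raise ValueError(f"cannot move from {self.stage.name} to {target.name}")
        self.stage = target

deposit = Deposit("Replication files for a hypothetical study")
deposit.advance(Stage.UNDER_CURATION)
deposit.advance(Stage.RELEASED)
print(deposit.title, "is now", deposit.stage.name)
```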

    Factors of trust in data reuse

    Purpose The purpose of this paper is to quantitatively examine factors of trust in data reuse from the reusers’ perspectives. Design/methodology/approach This study utilized a survey method to test the proposed hypotheses and to empirically evaluate the research model, which was developed to examine the relationship each factor of trust has with reusers’ actual trust during data reuse. Findings This study found that the data producer (H1) and data quality (H3) were significant, as predicted, while scholarly community (H2) and data intermediary (H4) were not significantly related to reusers’ trust in data. Research limitations/implications Further discipline-specific examinations should be conducted to complement and fully generalize the study findings. Practical implications The study findings present the need to engage data producers in the process of data curation, preferably beginning in the early stages, and to encourage them to work with curation professionals to ensure data management quality. The study findings also suggest the need to redefine the boundaries of current curation work or to collaborate with other professionals who can perform data quality assessment related to scientific and methodological rigor. Originality/value By analyzing theoretical concepts in empirical research and validating the factors of trust, this study fills a gap in the data reuse literature.

    Data Quality Assessment Methods as a Recommendation for the National Scientific Repository System

    High-quality data and efficient data quality assessment are needed for data standardization in research data repositories. The three most commonly used attributes, namely completeness, accuracy, and timeliness, are the dimensions applied in data quality assessment. The purpose of this research is to increase knowledge of, and discuss in depth, the relevant prior work. To support the research, we use a traditional review method on the Scopus database to identify relevant studies. The literature review is limited to the following document types: articles, books, proceedings, and reviews. The search results are filtered using the keywords data quality, data quality assessment, data quality dimensions, quality assessment, data accuracy, and data completeness. The documents found are analyzed for relevance and then compared to identify differences in the concepts and methods used in data quality metrics. The results of the analysis can be used as a recommendation for implementing data quality assessment in the National Scientific Repository.
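
    A minimal sketch of the three dimensions named above, computed over a toy record set: completeness as the share of filled fields, accuracy as conformance to a simple domain rule, and timeliness as the share of recently updated records. The field names, rules, and thresholds are illustrative assumptions, not the metrics the paper recommends.

```python
from datetime import date

# Toy metadata records; field names and values are purely illustrative.
records = [
    {"id": 1, "title": "Survey A", "year": 2021, "updated": date(2023, 5, 1)},
    {"id": 2, "title": "", "year": 2035, "updated": date(2018, 1, 15)},
]

def completeness(recs: list[dict]) -> float:
    """Share of fields that are filled in across all records."""
    cells = [v not in ("", None) for r in recs for v in r.values()]
    return sum(cells) / len(cells)

def accuracy(recs: list[dict]) -> float:
    """Share of records whose 'year' satisfies a simple plausibility rule."""
    return sum(1900 <= r["year"] <= date.today().year for r in recs) / len(recs)

def timeliness(recs: list[dict], max_age_days: int = 730) -> float:
    """Share of records updated within the allowed window (here, two years)."""
    return sum((date.today() - r["updated"]).days <= max_age_days for r in recs) / len(recs)

print(f"completeness={completeness(records):.2f}  "
      f"accuracy={accuracy(records):.2f}  "
      f"timeliness={timeliness(records):.2f}")
```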

    An Open Source, DDI-Based Data Curation System For Social Science Data

    The Institution for Social and Policy Studies (ISPS) at Yale University and Innovations for Poverty Action (IPA) are partnering to develop a repository for research data from randomized controlled trials in the social sciences. The repository will be an expansion – and major upgrade – of the existing ISPS Data Archive. Together with Colectica, the partners are developing a software platform that leverages DDI Lifecycle, the standard for data documentation. The software structures the curation workflow, which also includes checking data for confidentiality and completeness, creating preservation formats, and reviewing and verifying code. The software will enable a seamless framework for collecting, processing, archiving, and publishing data. This data curation software system combines several off-the-shelf components with a new, open source, Web application that integrates the existing components to create a flexible data pipeline. The software helps automate parts of the data pipeline and unifies the workflow for staff. Default components include Fedora Commons, Colectica Repository, and Drupal, but the software is developed so each of these can be swapped for alternatives. This session will include a live demonstration of the data curation software.
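
    To make the role of structured, variable-level documentation concrete, the sketch below emits a simplified metadata record as XML. The element names are invented for illustration and are not schema-valid DDI Lifecycle; in the system described above, Colectica produces the real DDI metadata.

```python
import xml.etree.ElementTree as ET

# Simplified, NOT schema-valid DDI: element and attribute names are illustrative only.
def variable_record(name: str, label: str, representation: str) -> ET.Element:
    var = ET.Element("Variable", attrib={"name": name})
    ET.SubElement(var, "Label").text = label
    ET.SubElement(var, "Representation").text = representation
    return var

study = ET.Element("StudyUnit", attrib={"title": "Demo RCT deposit"})
for name, label, representation in [
    ("treat", "Assigned to treatment arm", "code"),
    ("outcome", "Primary outcome at endline", "numeric"),
]:
    study.append(variable_record(name, label, representation))

print(ET.tostring(study, encoding="unicode"))
```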

    Connecting Researchers to Repositories IMLS Project Report

    Abstract: Despite a general consensus that making research data available is beneficial to many stakeholders, data sharing and curation are still not performed as an integrated step in most research lifecycles, nor are they common practice in the academic setting. Given the many efforts over the last several years, why aren’t repositories used more by researchers? This question was explored in two workshops meant to consider the next steps in developing the Data Curation Profiles (DCP) Toolkit. The report identifies a unique approach, framed from an entrepreneurial perspective, to help increase data deposits in research data repositories.

    Operationalizing the Replication Standard

    In response to widespread concerns about the integrity of research published in scholarly journals, several initiatives have emerged that are promoting research transparency through access to data underlying published scientific findings. Journal editors, in particular, have made a commitment to research transparency by issuing data policies that require authors to submit their data, code, and documentation to data repositories to allow for public access to the data. In the case of the American Journal of Political Science (AJPS) Data Replication Policy, the data must also undergo an independent verification process in which materials are reviewed for quality as a condition of final manuscript acceptance and publication. Aware of the specialized expertise of the data archives, AJPS called upon the Odum Institute Data Archive to provide a data review service that performs data curation and verification of replication datasets. This article presents a case study of the collaboration between AJPS and the Odum Institute Data Archive to develop a workflow that bridges the manuscript publication and data review processes. The case study describes the challenges and the successes of the workflow integration, and offers lessons learned that may be applied by other data archives that are considering expanding their services to include data curation and verification services to support reproducible research.
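
    Verification of replication materials generally means re-running the authors' code and comparing the regenerated outputs against what was submitted. The sketch below shows that pattern in its simplest form; the file names and the hash-based comparison are assumptions made for illustration, not the Odum Institute's actual protocol.

```python
import hashlib
import subprocess
from pathlib import Path

# Hypothetical layout: the authors' script regenerates a results table that was also submitted.
SCRIPT = "replication/analysis.py"
REGENERATED = Path("replication/results/table1.csv")
SUBMITTED = Path("submitted/table1.csv")

def sha256(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def verify() -> bool:
    """Re-run the analysis script and check the regenerated table matches the submitted one."""
    subprocess.run(["python", SCRIPT], check=True)
    return sha256(REGENERATED) == sha256(SUBMITTED)

if __name__ == "__main__":
    print("verification passed" if verify() else "verification FAILED: outputs differ")
```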