
    Use of Optional Data Curation Features by Users of Harvard Dataverse Repository

    Objective: Investigate how different groups of depositors vary in their use of optional data curation features that support FAIR research data in the Harvard Dataverse repository. Methods: A numerical score, based on the presence or absence of characteristics associated with the use of optional features, was assigned to each of the 29,295 datasets deposited in Harvard Dataverse between 2007 and 2019. Statistical analyses were performed to investigate patterns of optional feature use among different groups of depositors and their relationship to other dataset characteristics. Results: Members of groups make greater use of Harvard Dataverse's optional features than individual researchers do. Datasets that undergo a data curation review before submission to Harvard Dataverse, are associated with a publication, or contain restricted files also show greater use of optional features. Conclusions: Individual researchers might benefit from increased outreach and improved documentation about the benefits and use of optional features, which could raise their datasets' level of curation beyond the FAIR-informed support that the Harvard Dataverse repository provides by default. Platform designers, developers, and managers may also use the numerical scoring approach to explore how different user groups use optional application features.
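    The scoring approach described in this abstract can be sketched as a simple presence/absence tally. This is an illustrative reconstruction, not the study's actual rubric: the feature names below are invented for the example.

```python
# Hypothetical sketch of the numerical-scoring idea: each dataset earns
# one point per optional curation feature it uses. The feature list is
# illustrative, not the study's actual set of scored characteristics.

OPTIONAL_FEATURES = [
    "has_description",
    "has_keywords",
    "has_related_publication",
    "has_terms_of_use",
    "has_file_level_metadata",
]

def curation_score(dataset: dict) -> int:
    """Count how many optional curation features a dataset record uses."""
    return sum(1 for feature in OPTIONAL_FEATURES if dataset.get(feature))

example = {"has_description": True, "has_keywords": True, "has_terms_of_use": False}
print(curation_score(example))  # 2
```

    Scores like this can then be compared across depositor groups (e.g. individuals versus members of groups) with standard statistical tests.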

    Towards a domain ontology for data assemblages

    Critical data studies (CDS) is an interdisciplinary research area concerned with the critical, systematic investigation of sociotechnical infrastructures involving data, called data assemblages. CDS scholars have expressed a desire for more empirical studies that compare data assemblages, trace their change over time, and offer insights to inform their design. This poster describes the development of research infrastructure to support such studies: a prototype, extensible domain ontology and glossary based upon Manuel DeLanda's neoassemblage theory (NAT). These knowledge representation tools are intended to consolidate and codify shared knowledge about assemblage theory and, ultimately, to enable researchers to describe, model, and compare assemblages and their topologies. The prototype NAT ontology and glossary were developed using a lightweight version of the Unified Process for Ontology building (UPONLite). Future work will involve extending the NAT ontology to support data assemblage concepts and relationships using a method similar to Grounded Ontology (GO). Scholars may also use these two tools to support work involving other types of assemblages, or use the ontology construction method to develop an ontological model of their preferred social theory.

    Mitigating Distributed Denial of Service Attacks with Dynamic Resource Pricing

    Distributed Denial of Service (DDoS) attacks exploit the acute imbalance between client and server workloads to overwhelm service providers. We propose a distributed gateway architecture and a payment protocol that impose dynamically changing prices on network, server, and information resources in order to push some of the cost of initiating service requests (in terms of monetary payments and/or computational burdens) back onto the requesting clients. By employing different price and purchase functions, the architecture can provide service quality differentiation and, furthermore, select for good client behavior while discriminating against adversarial behavior. If confirmed by additional experiments, judicious partitioning of resources using different pricing functions can improve overall service survivability.
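    A minimal sketch of the dynamic-pricing idea, assuming a price that grows super-linearly with resource utilization; the specific functions and constants below are invented for illustration and are not the paper's protocol.

```python
# Illustrative sketch (not the paper's actual protocol): a gateway raises
# the price of issuing a service request as current load approaches
# capacity, pushing cost back onto clients during surges.

def price(load: float, capacity: float, base_price: float = 1.0, k: float = 4.0) -> float:
    """Price grows quadratically with utilization, clamped at full capacity."""
    utilization = min(load / capacity, 1.0)
    return base_price * (1.0 + k * utilization ** 2)

def admit(client_budget: float, load: float, capacity: float) -> bool:
    """A request is admitted only if the client can pay the current price."""
    return client_budget >= price(load, capacity)

# At low load requests are cheap; near capacity, only clients willing to
# pay (or compute) more are admitted, which starves flooding attackers.
print(price(10, 100))
print(price(95, 100))
```

    Using different price curves for different resource pools is one way to realize the service-quality differentiation the abstract describes.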

    Repository Approaches to Improving the Quality of Shared Data and Code

    Sharing data and code for reuse has become increasingly important in scientific work over the past decade. However, in practice, shared data and code may be unusable, or published results obtained from them may be irreproducible. Data repository features and services contribute significantly to the quality, longevity, and reusability of datasets. This paper presents a combination of original and secondary data analysis studies focusing on computational reproducibility, data curation, and gamified design elements that can be employed to indicate and improve the quality of shared data and code. The findings of these studies are organized into three approaches that can be valuable to data repositories, archives, and other research dissemination platforms.
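    One common form of gamified quality indicator is a badge tier derived from a quality score. The thresholds and tier names below are hypothetical, invented purely to illustrate the pattern; the paper does not specify this scheme.

```python
# Hypothetical illustration of a gamified quality indicator: map a
# dataset or code quality score onto a badge tier. Thresholds and tier
# names are invented for this sketch, not taken from the paper.

def badge(score: int) -> str:
    """Map a 0-10 quality score onto a badge tier."""
    if score >= 8:
        return "gold"
    if score >= 5:
        return "silver"
    if score >= 2:
        return "bronze"
    return "none"

print(badge(9))  # gold
```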