International Journal of Digital Curation
533 research outputs found
Data Showcases: the Data Journal in a Multimodal World
As an experiment, the Research Data Journal for the Humanities and Social Sciences (RDJ) has temporarily extended the usual format of the online journal with so-called ‘showcases’: separate web pages containing a quick introduction to a dataset, embedded multimedia, interactive components, and facilities to directly preview and explore the dataset described. The aim was to create a coherent hyper document with content communicated via different media (multimodality) and provide space for new forms of scientific publication such as executable papers (e.g. Jupyter notebooks). This paper discusses the objectives, technical implementations, and the need for innovation in data publishing considering the advanced possibilities of today's digital modes of communication. The data showcases experiment proved to be a useful starting point for an exploration of related developments within and outside the humanities and social sciences. It turns out that small-scale experiments are relatively easy to perform thanks to the easy availability of digital technology. However, real innovation in publishing affects organization and infrastructure and requires the joint effort of publishers, editors, data repositories, and authors. It implies a thorough update of the concept of publication and adaptation of the production process. This paper also addresses these obstacles to taking new paths.
Fostering the Adoption of DMP in Small Research Projects through a Collaborative Approach
To promote sound management of research data, the European Commission, under the Horizon 2020 framework program, is encouraging the adoption of a Data Management Plan (DMP) in research projects. Despite the value of a DMP in making data findable, accessible, interoperable and reusable (FAIR) through time, the development and implementation of DMPs is not yet a common practice in health research. Raising the awareness of researchers in small projects to the benefits of early adoption of a DMP is, therefore, a motivator for others to follow suit. In this paper we describe an approach to engage researchers in the writing of a DMP, in an ongoing project, FrailSurvey, in which researchers are collecting data through a mobile application for self-assessment of fragility. The case study is supported by interviews, a metadata creation session, as well as the validation of recommendations by researchers. Alongside the outline of our process, we also describe the tools and services that supported the development of the DMP in this small project, particularly since there were no institutional services available to researchers.
Data Curation Strategies to Support Responsible Big Social Research and Big Social Data Reuse
Big social research repurposes existing data from online sources such as social media, blogs, or online forums, with a goal of advancing knowledge of human behavior and social phenomena. Big social research also presents an array of challenges that can prevent data sharing and reuse.
This brief report presents an overview of a larger study that aims to understand the data curation implications of big social research to support use and reuse of big social data. The study, which is based in the United States, identifies six key issues relating to big social research and big social data curation through a review of the literature. It then further investigates perceptions and practices relating to these six key issues through semi-structured interviews with big social researchers and data curators.
This report concludes with implications for data curation practice: metadata and documentation, connecting with researchers throughout the research process, data repository services, and advocating for community standards. Supporting responsible practices for using big social data can help scale up social science research, thus enhancing our understanding of human behavior and social phenomena.
Curating for Accessibility
Accessibility of research data to disabled users has received scant attention in literature and practice. In this paper we briefly survey the current state of accessibility for research data and suggest some first steps that repositories should take to make their holdings more accessible. We then describe in depth how those steps were implemented at the Qualitative Data Repository (QDR), a domain repository for qualitative social-science data. The paper discusses accessibility testing and improvements on the repository and its underlying software, changes to the curation process to improve accessibility, as well as efforts to retroactively improve the accessibility of existing collections. We conclude by describing key lessons learned during this process as well as next steps.
Towards Environmentally Sustainable Long-term Digital Preservation
ARCHIVER and Pre-Commercial Procurement funding has enabled small to medium enterprises (SMEs) to innovate and deliver new services for EOSC. Within the framework of the ARCHIVER pre-commercial procurement tender, between December 2020 and August 2021, three commercial consortia competed to deliver innovative, prototype solutions for long-term data preservation. Two of them were selected to continue with the pilot phase and deliver research-ready solutions for long-term data preservation of research data, therefore filling a gap in the current European Open Science panorama.
Digital preservation relies on technological infrastructure (information and communication technology, ICT) that can have environmental impacts. While altering technology usage can reduce the impact of digital preservation practices, this alone is not a strategy for sustainable practice. Moving toward environmentally sustainable digital preservation requires critically examining the motivations and assumptions that shape current practice. The use of scalable cloud infrastructures can reduce the environmental impacts of long-term data preservation solutions.
Putting the R into PlatfoRms
This paper looks at the question of how and why to bring about greater reusability of Research Platforms (variously called Virtual Laboratories, Virtual Research Environments, or Science Gateways). It begins with some context for the Australian Research Data Commons, where the authors are based. It then examines the infrastructure concerns that are driving the need for platforms to be created and remain sustainable, and the connection from this to reusability. The paper then proceeds to discuss the ways in which FAIR is being extended to a range of research objects and infrastructure elements, before reviewing the work of the FAIR4VREs WG. The core of the paper is an examination, with examples or case studies, of four different paradigms for platform reusability: accessing, adopting, adapting, and abstracting. The paper concludes by examining actions undertaken by the ARDC to increase the likelihood of reusability.
 
Reusable, FAIR Humanities Data
While stakeholders including funding agencies and academic publishers implement more stringent data sharing policies, challenges remain for researchers in the humanities who are increasingly prompted to share their research data. This paper outlines some key challenges of research data sharing in the humanities, and identifies existing work which has been undertaken to explore these challenges. It describes the current landscape regarding publishers’ research data sharing policies, and the impact which strong data policies can have, regardless of discipline.
Using Routledge Open Research as a case study, the development of a set of humanities-inclusive Open Data publisher data guidelines is then described. These include practical guidance in relation to data sharing for humanities authors, and a close alignment with the FAIR Data Principles.
From Siloed to Reusable
In the past twenty-five years, cross-institutional communities have come together in the creation and use of open source software and open data standards to build digital collections (Madden, 2012). These librarians, developers, archivists, artists, and researchers recognize that the custom-built architectures and bespoke data structures of earlier digital collections development are unsustainable. Their collaborations have produced now-standard technologies such as Samvera, Fedora, GeoBlacklight, and Islandora 8, as well as RDF and JSON-LD, among other open schemas. A core principle animating these efforts is reusability: data, schemas, and technologies in the open era must be coherent and flexible enough to be reused across multiple digital contexts. The authors of this paper show how reuse guided the migration of the Hopkins Digital Library from an outdated isolated system to a sustainable interconnected environment in GeoBlacklight and Islandora, with metadata based in Linked Open Data. Three areas of reuse focus this paper: the creation of robust interoperable metadata; the expansion of IIIF functionality to integrate the needs of the Hopkins Geoportal’s users; and the development of a broadly re/usable data migration module focused on expanding a diverse community of invested users. In focusing on reusability as an organising principle of digital collections development, this case study shows how one digital curation team produced a platform that meets the changing and specific needs of an individual institution, on the one hand, and participated in and furthered the creative coherence of the open communities supporting the team’s work, on the other.
First Line Research Data Management for Life Sciences: a Case Study
Modern life sciences studies depend on the collection, management and analysis of comprehensive datasets in what has become data-intensive research. Life science research is also characterised by having relatively small groups of researchers. This combination of data-intensive research performed by a few people has led to an increasing bottleneck in research data management (RDM). Parallel to this, there has been an urgent call by initiatives like FAIR and Open Science to openly publish research data, which has put additional pressure on improving the quality of RDM. Here, we reflect on the lessons learnt by DataHub Maastricht, an RDM support group of the Maastricht University Medical Centre (MUMC+) in Maastricht, the Netherlands, in providing first-line RDM support for life sciences. DataHub Maastricht operates with a small core team, and is complemented with disciplinary data stewards, many of whom have joint positions with DataHub and a research group. This organisational model helps create shared knowledge between DataHub and the data stewards, including insights into how to focus support on the most reusable datasets. This model has proven to be very beneficial given limited time and personnel. We found that co-hosting tailored platforms for specific domains, reducing storage costs by implementing tiered storage, and promoting cross-institutional collaboration through federated authentication were all effective features to stimulate researchers to initiate RDM. Overall, utilising the expertise and communication channel of the embedded data stewards was also instrumental in our RDM success. Looking into the future, we foresee the need to further embed the role of data stewards into the lifeblood of the research organisation, along with policies on how to finance long-term storage of research data. The latter, to remain feasible, needs to be combined with a further formalising of appraisal and reappraisal of archived research data.
DBRepo: a Semantic Digital Repository for Relational Databases
Data curation is a complex, multi-faceted task. While dedicated data stewards are starting to take care of these activities in close collaboration with researchers for many types of (usually file-based) data in many institutions, this is rarely yet the case for data held in relational databases. Beyond large-scale infrastructures hosting e.g. climate or genome data, researchers usually have to create, build and maintain their database, care about security patches, and feed data into it in order to use it in their research. Data curation, if at all, usually happens after a project is finished, when data may be exported for digital preservation into file repository systems.
We present DBRepo, a semantic digital repository for relational databases in a private cloud setting designed to (1) host research data stored in relational databases right from the beginning of a research project, (2) provide separation of concerns, allowing the researchers to focus on the domain aspects of the data and their work while bringing in experts to handle classic data management tasks, (3) improve findability, accessibility and reusability by offering semantic mapping of metadata attributes, and (4) focus on reproducibility in dynamically evolving data by supporting versioning and precise identification/cite-ability for arbitrary subsets of data.