International Journal of Digital Curation
Understanding the Data Management Plan as a Boundary Object through a Multi-Stakeholder Perspective
A three-phase Delphi study was used to investigate an emerging research data management community in Norway and its understanding and application of data management plans (DMPs). The findings reveal visions of what the DMP should be as well as different practice approaches, yet the stakeholders share common goals. This paper discusses the different perspectives on the DMP by applying Star and Griesemer’s theory of boundary objects (Star & Griesemer, 1989). The debate on what the DMP is and the findings presented are relevant to all research communities currently implementing DMP procedures and requirements. Current discussions about DMPs tend to be distant from active researchers and limited to the needs of funders and institutions rather than to usefulness for researchers. By analysing the DMP as a boundary object, plastic and adaptable yet with a robust identity (Star & Griesemer, 1989), which translates between the worlds in which collaboration on data sharing can take place, we expand the perspectives and include all stakeholders. Understanding the DMP as a boundary object can shift the focus from shaping a DMP that fulfils funders’ requirements to enabling collaboration on data management and sharing across domains using standardised forms.
Metajelo: A Metadata Package for Journals to Support External Linked Objects
We propose a metadata package that is intended to provide academic journals with a lightweight means of registering, at the time of publication, the existence and disposition of supplementary materials. Information about the supplementary materials is, in most cases, critical for the reproducibility and replicability of scholarly results. In many instances, these materials are curated by a third party, which may or may not follow developing standards for the identification and description of those materials. As such, the vocabulary described here complements existing initiatives that specify vocabularies to describe the supplementary materials or the repositories and archives in which they have been deposited. Where possible, it reuses elements of other relevant vocabularies, facilitating coexistence with them. Furthermore, it provides an “at publication” record of the reproducibility characteristics of a particular article that has been selected for publication. The proposed metadata package documents the key characteristics that journals care about in the case of supplementary materials held by third parties: existence, accessibility, and permanence. It does so in a robust, time-invariant fashion at the time of publication, when the editorial decisions are made. It also allows for better documentation of less accessible (non-public) data by treating it symmetrically from the point of view of the journal, thereby increasing the transparency of what has until now been very opaque.
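As an illustration of the kind of record such a package might capture, the following minimal Python sketch models the three characteristics the abstract names: existence, accessibility, and permanence. The class and field names are invented for illustration and are not the actual metajelo vocabulary.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class SupplementaryMaterialRecord:
    """Illustrative record for one externally held supplementary object."""
    object_identifier: str      # e.g. a DOI or ARK assigned by the third-party repository
    repository_name: str        # institution curating the object
    exists: bool                # existence: the object was verified at publication time
    access_level: str           # accessibility: e.g. "public", "restricted", "closed"
    permanence_statement: str   # permanence: the repository's retention commitment
    recorded_on: str            # ISO 8601 date of the editorial check

@dataclass
class MetajeloLikePackage:
    """Illustrative 'at publication' package bundling all records for an article."""
    article_doi: str
    records: List[SupplementaryMaterialRecord] = field(default_factory=list)

# A restricted (non-public) dataset is documented symmetrically with a public one:
pkg = MetajeloLikePackage(
    article_doi="10.1234/example.2021.1",  # hypothetical identifiers throughout
    records=[
        SupplementaryMaterialRecord(
            object_identifier="10.5678/data.42",
            repository_name="Example Data Archive",
            exists=True,
            access_level="restricted",
            permanence_statement="retained for 10 years per archive policy",
            recorded_on="2021-03-01",
        )
    ],
)
```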
Doctoral Students' Educational Needs in Research Data Management: Perceived Importance and Current Competencies
Sound research data management (RDM) competencies are elementary tools used by researchers to ensure integrated, reliable, and re-usable data, and to produce high-quality research results. In this study, 35 doctoral students and faculty members were asked to rate or self-rate doctoral students’ current RDM competencies and to rate the importance of these competencies. Structured interviews were conducted, using closed-ended and open-ended questions, covering research data lifecycle phases such as collection, storing, organization, documentation, processing, analysis, preservation, and data sharing. The quantitative analysis of the respondents’ answers indicated a wide gap between doctoral students’ rated or self-rated current competencies and the rated importance of these competencies. In conclusion, two major educational needs were identified in the qualitative analysis of the interviews: to improve and standardize data management planning, including awareness of the intellectual property and agreement issues affecting data processing and sharing; and to improve and standardize data documentation and description, not only for the researchers themselves but especially for data preservation, sharing, and re-use. The study thus informs the development of RDM education for doctoral students.
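As a minimal illustration of the gap analysis the abstract describes, the sketch below compares mean rated importance against mean rated/self-rated competency per lifecycle phase. All numbers and phase values are invented; the study’s actual data and scales are not reproduced here.

```python
# Hypothetical ratings on a 1-5 scale; values invented for illustration only.
ratings = {
    # phase: (mean rated importance, mean rated/self-rated current competency)
    "collection":    (4.6, 3.1),
    "documentation": (4.4, 2.3),
    "preservation":  (4.2, 2.0),
    "sharing":       (4.0, 2.2),
}

# The "gap" is simply importance minus competency; large positive gaps
# flag the phases where education is most needed.
gaps = {phase: imp - comp for phase, (imp, comp) in ratings.items()}
for phase, gap in sorted(gaps.items(), key=lambda kv: -kv[1]):
    print(f"{phase:13s} gap = {gap:.1f}")
```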
Futureproofing Visual Effects
Digital visual effects (VFX), including computer animation, have become a commonplace feature of contemporary episodic and film production projects. Using various commercial applications and bespoke tools, VFX artists craft digital objects (known as “assets”) to create visual elements such as characters and environments, which are composited together and output as shots.
While the shots that make up the finished film or television (TV) episode are maintained and preserved within purpose-built digital asset management systems and repositories by the studios commissioning the projects, the wider VFX network currently has no consistent guidelines or requirements around the digital curation of VFX digital assets and records. This includes a lack of guidance about how to effectively futureproof digital VFX and preserve it for the long term.
In this paper I provide a case study – a single shot from a 3D animation short film – to illustrate the complexities of digital VFX assets and records and the pipeline environments in which they are generated. I also draw on data collected from interviews with over 20 professional VFX practitioners from award-winning VFX companies, and I undertake a socio-technical analysis of VFX using actor-network theory. I explain how high volumes of digital information, rapid technology progression, and dependencies on software pose significant preservation challenges.
In addition, I outline how, by conducting holistic appraisal, selection, and disposal activities across their entire digital collections, and by continuing to develop and adopt open formats, the VFX industry can improve its capability to preserve first-hand evidence of its work in years to come.
Towards a Semantic Interoperable Flemish Research Information Space: Development and Implementation of a Flemish Application Profile for Research Datasets
In Flanders, Research Performing Organizations (RPOs) are required to provide information on publicly financed research to the Flemish Research Information Space (FRIS), a current research information system and research discovery platform hosted by the Flemish Department of Economics, Science and Innovation. FRIS currently discloses information on researchers, research institutions, publications, and projects. Flemish decrees on Special and Industrial research funding, and the Flemish Open Science policy, require RPOs to also provide metadata on research datasets to FRIS. To ensure accurate and uniform delivery of information on research datasets to FRIS across all information-providing institutions, it is necessary to develop a common application profile for research datasets. This article outlines the development of the Flemish application profile for research datasets by the Flemish Open Science Board (FOSB) Working Group Metadata and Standardization. The main challenge was to achieve interoperability among stakeholders, some of which already had existing metadata schemes and research information infrastructures in place, while others were still in the early stages of development.
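As an illustrative aside, a delivering institution might screen dataset records against a common application profile before submission. The sketch below is a generic required-field check in Python; the field names are placeholders, not the actual Flemish application profile.

```python
# Generic placeholder fields; the actual Flemish application profile is not
# specified in the abstract, so this list is purely illustrative.
REQUIRED_FIELDS = {"identifier", "title", "description", "publisher", "license"}

def check_profile_conformance(record: dict) -> list:
    """Return a list of problems; an empty list means the record conforms."""
    problems = [f"missing required field: {f}"
                for f in sorted(REQUIRED_FIELDS - record.keys())]
    problems += [f"empty value for field: {f}"
                 for f in sorted(REQUIRED_FIELDS & record.keys()) if not record[f]]
    return problems

record = {"identifier": "hdl:1234/abc", "title": "Example dataset", "publisher": ""}
for problem in check_profile_conformance(record):
    print(problem)
```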
Assessment, Usability, and Sociocultural Impacts of DataONE
DataONE, funded from 2009 to 2019 by the U.S. National Science Foundation, is an early example of a large-scale project that built both a cyberinfrastructure and a culture of data discovery, sharing, and reuse. DataONE used a Working Group model, in which diverse groups of participants collaborated on targeted research and development activities to achieve broader project goals. This article summarizes the work carried out by two of DataONE’s working groups: Usability & Assessment (2009-2019) and Sociocultural Issues (2009-2014). The activities of these working groups provide a unique longitudinal look at how scientists, librarians, and other key stakeholders engaged in convergence research to identify and analyze practices around research data management through the development of boundary objects, an iterative assessment program, and reflection. Members of the working groups disseminated their findings widely in papers, presentations, and datasets, reaching international audiences through publications in 25 different journals and presentations to over 5,000 people at interdisciplinary venues. The working groups helped inform the DataONE cyberinfrastructure and influenced the evolving data management landscape. By studying the working groups over time, the paper also presents lessons learned about the working group model for global large-scale projects that bring together participants from multiple disciplines and communities in convergence research.
Where There's a Will, There's a Way: In-House Digitization of an Oral History Collection in a Lone-Arranger Situation
Analog audio materials present unique preservation and access challenges for even the largest libraries. These challenges are magnified for smaller institutions, where budgets, staffing, and equipment limit what can be achieved. Because in-house migration of analog audio to digital is often out of reach for smaller institutions, the choice is between finding room in the budget to outsource a project or sitting by and watching important materials decay. Cost is the most significant barrier to audio migration: audio preservation labs can charge hundreds or even thousands of dollars to migrate analog to digital, and top-tier audio preservation equipment is equally expensive. When faced with the decomposition of an oral history collection recorded on cassette tape, one library decided that where there was a will, there was a way. The College of Education One-Room Schoolhouse Oral History Collection consisted of 247 audio cassettes containing interviews with one-room schoolhouse teachers from 68 counties in Kansas. The cassette tapes in this collection were between 20 and 40 years old and generally inaccessible for research, for fear that the tapes could be damaged during playback. This case study looks at how a single Digital Curation Librarian with no audio digitization experience migrated nearly 200 hours of audio to digital using a $40 audio converter from Amazon and a campus subscription to Adobe Audition. The case study covers the decision to digitize the collection, the digitization process (including audio clean-up), metadata collection and creation, presentation of the collection in CONTENTdm, and final preservation of the audio files. The project took 20 months to complete and resulted in significant lessons learned that have informed decisions regarding future audio conversion projects.
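The final preservation step described above typically involves recording fixity information for the digitized masters. The following Python sketch, using only the standard library, shows one generic way to build a checksum manifest for a folder of WAV files; the paths are hypothetical, and this is a generic technique rather than the workflow the library actually used.

```python
import csv
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so large audio masters fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def write_manifest(audio_dir: str, manifest_path: str) -> None:
    """Record a fixity checksum for every WAV master in audio_dir."""
    with open(manifest_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["filename", "sha256"])
        for wav in sorted(Path(audio_dir).glob("*.wav")):
            writer.writerow([wav.name, sha256_of(wav)])

# write_manifest("masters/", "manifest.csv")  # hypothetical paths
```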
Improving the Usability of Organizational Data Systems
For research data repositories, web interfaces are usually the primary, if not the only, method that data users have to interact with repository systems. Data users often search, discover, understand, access, and sometimes use data directly through repository web interfaces. Given that sub-par user interfaces can reduce the ability of users to locate, obtain, and use data, it is important to consider how repositories’ web interfaces can be evaluated and improved in order to ensure useful and successful user interactions. This paper discusses how usability assessment techniques are being applied to improve the functioning of data repository interfaces at the National Center for Atmospheric Research (NCAR). At NCAR, a new suite of data system tools is being developed, collectively called the NCAR Digital Asset Services Hub (DASH). Usability evaluation techniques have been used throughout the NCAR DASH design and implementation cycles to ensure that the systems work well together for the intended user base. By applying user studies, paper prototyping, competitive analysis, journey mapping, and heuristic evaluation, the NCAR DASH Search and Repository experiences provide examples of how data systems can benefit from usability principles and techniques. Integrating usability principles and techniques into repository system design and implementation workflows helps to optimize the systems’ overall user experience.
How Long Can We Build It? Ensuring Usability of a Scientific Code Base
Software, and in particular source code, has become an important component of scientific publications and is henceforth subject to research data management. Maintaining source code such that it remains a usable and valuable scientific contribution is a huge task. Not all code contributions can be actively maintained forever, so eventually there will be a significant backlog of legacy source code. In this article we analyse the requirements for applying the concept of long-term reusability to source code. We use a simple case study to identify gaps, and we provide a technical infrastructure, based on emulators, to support automated builds of historic software from source code.
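As a hedged illustration of emulator-supported automated builds, the sketch below wraps a hypothetical command that runs a historic build inside an emulated guest and records whether it still succeeds. The wrapper name, guest image, and build commands are placeholders; the paper’s actual infrastructure is not specified in the abstract.

```python
import subprocess
from datetime import datetime, timezone

# Hypothetical command that boots a preserved guest image and runs the
# project's historic build script inside it; the real emulator invocation
# used by the paper's infrastructure is not described in the abstract.
EMULATED_BUILD_CMD = [
    "run-in-emulated-guest",          # placeholder wrapper around e.g. QEMU
    "--image", "period-accurate.img", # hypothetical guest disk image
    "--", "sh", "-c", "cd /src && ./configure && make",
]

def attempt_historic_build(log_path: str = "build-attempt.log") -> bool:
    """Run the build once and append a timestamped pass/fail record."""
    result = subprocess.run(EMULATED_BUILD_CMD, capture_output=True, text=True)
    stamp = datetime.now(timezone.utc).isoformat()
    with open(log_path, "a") as log:
        log.write(f"{stamp} exit={result.returncode}\n")
        log.write(result.stdout)
        log.write(result.stderr)
    return result.returncode == 0
```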
Selecting Efficient and Reliable Preservation Strategies
This article addresses the problem of formulating efficient and reliable operational preservation policies that ensure bit-level information integrity over long periods, and in the presence of a diverse range of real-world technical, legal, organizational, and economic threats. We develop a systematic, quantitative prediction framework that combines formal modeling, discrete-event simulation, and hierarchical modeling, and then uses empirically calibrated sensitivity analysis to identify effective strategies.
Specifically, the framework formally defines an objective function for preservation that maps a set of preservation policies and a risk profile to a set of preservation costs and an expected collection loss distribution. In this framework, a curator’s objective is to select optimal policies that minimize expected loss subject to budget constraints. To estimate preservation loss under different policy conditions, we develop a statistical hierarchical risk model that includes four sources of risk: the storage hardware; the physical environment; the curating institution; and the global environment. We then employ a general discrete-event simulation framework to evaluate the expected loss and the cost of employing varying preservation strategies under specific parameterizations of risk.
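As a minimal sketch of this kind of objective, the Python simulation below uses a single per-replica annual failure probability (a drastic simplification of the four-level hierarchical risk model described above) to estimate expected collection loss for candidate replication policies, then selects the cheapest policy that meets a loss target within a budget. All parameter values are invented for illustration.

```python
import random

def simulate_loss(n_replicas: int, annual_failure_p: float,
                  years: int = 50, trials: int = 10_000) -> float:
    """Estimate the probability that an object is permanently lost.

    A deliberately simplified model: replicas fail independently each year,
    failed replicas are repaired from a survivor at year end, and the object
    is lost only if all replicas fail within the same year.
    """
    losses = 0
    for _ in range(trials):
        for _ in range(years):
            failures = sum(random.random() < annual_failure_p
                           for _ in range(n_replicas))
            if failures == n_replicas:
                losses += 1
                break
    return losses / trials

# Candidate policies: replica count -> annual storage cost per object (invented units).
policies = {1: 1.0, 2: 2.1, 3: 3.3, 4: 4.6}
budget, target_loss = 4.0, 1e-2

best = None
for n, cost in sorted(policies.items()):
    loss = simulate_loss(n, annual_failure_p=0.05)
    print(f"replicas={n} cost={cost:.1f} expected loss={loss:.4f}")
    if cost <= budget and loss <= target_loss and best is None:
        best = n
print("selected policy:", best)
```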
The framework offers flexibility for the modeling of a wide range of preservation policies and threats. Since this framework is open source and easily deployed in a cloud computing environment, it can be used to produce analyses based on independent estimates of scenario-specific costs, reliability, and risks.
We present results summarizing hundreds of thousands of simulations using this framework. This exploratory analysis points to a number of robust and broadly applicable preservation strategies, offers novel insights into specific preservation tactics, and provides evidence that challenges received wisdom.