Journal of Digital Information (Texas Digital Library - TDL E-Journals)
252 research outputs found
Kindura: Repository services for researchers based on hybrid clouds
The paper describes the investigations and outcomes of the JISC-funded Kindura project, which is piloting the use of hybrid cloud infrastructure to provide repository-focused services to researchers. The hybrid cloud services integrate external commercial cloud services with internal IT infrastructure that has been adapted to provide cloud-like interfaces. The system provides services to manage and process research outputs, with a primary focus on research data. These services include both repository services, based on the Fedora Commons repository, and common services such as preservation operations provided by cloud compute services. Kindura is piloting the use of DuraCloud, open source software developed by DuraSpace, to provide a common interface for interacting with cloud storage and compute providers. A storage broker integrates with DuraCloud to optimise the use of available resources, taking into account factors such as cost, reliability, security and performance. The development is focused on the requirements of target groups of researchers.
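The storage broker's placement decision weighs cost, reliability, security and performance. As a minimal illustrative sketch, and not the Kindura project's actual broker, such a decision can be modelled as a weighted scoring of candidate providers; the provider names, metrics and weights below are invented for the example:

    from dataclasses import dataclass

    @dataclass
    class Provider:
        """A candidate storage provider with metrics normalised to the range [0, 1]."""
        name: str
        cost: float         # lower is better
        reliability: float  # higher is better
        security: float     # higher is better
        performance: float  # higher is better

    def score(provider: Provider, weights: dict) -> float:
        """Weighted score of a provider; cost is inverted so cheaper providers score higher."""
        return (weights["cost"] * (1.0 - provider.cost)
                + weights["reliability"] * provider.reliability
                + weights["security"] * provider.security
                + weights["performance"] * provider.performance)

    def choose_provider(providers: list, weights: dict) -> Provider:
        """Pick the provider with the highest weighted score."""
        return max(providers, key=lambda p: score(p, weights))

    # Hypothetical providers and policy weights, invented for this sketch.
    providers = [
        Provider("internal-storage", cost=0.2, reliability=0.80, security=0.90, performance=0.60),
        Provider("commercial-cloud", cost=0.6, reliability=0.95, security=0.70, performance=0.85),
    ]
    weights = {"cost": 0.4, "reliability": 0.3, "security": 0.2, "performance": 0.1}
    print(choose_provider(providers, weights).name)

In a real deployment the metrics would come from provider agreements and monitoring rather than constants, with the common storage interface handling the actual transfer to the chosen provider.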
Chempound - a Web 2.0-inspired repository for physical science data
Chempound is a new-generation repository architecture based on RDF, semantic dictionaries and linked data. It has been developed to hold any type of chemical object expressible in CML (Chemical Markup Language) and is exemplified by crystallographic experiments and computational chemistry calculations. In both examples, the repository can hold more than 50,000 entries, which can be searched through SPARQL endpoints and pre-indexed key fields. The Chempound architecture is general and adaptable to other fields of data-rich science.
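Because the entries are exposed as RDF and searchable via SPARQL endpoints, they can be queried with standard SPARQL tooling. A minimal sketch in Python using the SPARQLWrapper library follows; the endpoint URL and the Dublin Core title predicate are assumptions for the example, since the abstract does not give Chempound's actual endpoint layout or vocabulary:

    from SPARQLWrapper import SPARQLWrapper, JSON

    # Hypothetical endpoint URL and predicate, not taken from the Chempound paper.
    endpoint = SPARQLWrapper("http://localhost:8080/chempound/sparql")
    endpoint.setQuery("""
        PREFIX dcterms: <http://purl.org/dc/terms/>
        SELECT ?entry ?title
        WHERE { ?entry dcterms:title ?title . }
        LIMIT 10
    """)
    endpoint.setReturnFormat(JSON)

    results = endpoint.query().convert()
    for binding in results["results"]["bindings"]:
        print(binding["entry"]["value"], binding["title"]["value"])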
CLIF: Moving repositories upstream in the content lifecycle
The UK JISC-funded Content Lifecycle Integration Framework (CLIF) project has explored the management of digital content throughout its lifecycle, from creation through to preservation or disposal. Whilst many individual systems can carry out lifecycle stages to varying degrees, CLIF recognised that only by facilitating the movement of content between systems could the full lifecycle take advantage of systems specifically geared towards its different stages. The project has placed the digital repository at the heart of this movement and has explored this through integrations between Fedora and Sakai, and between Fedora and SharePoint. This article describes these integrations in the context of lifecycle management and highlights the issues discovered in enabling the smooth movement of content as required.
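To make the lifecycle framing concrete, the stages and the hand-off between systems can be sketched in Python. This is an illustrative model only, echoing the systems named in the abstract; it is not CLIF's actual design or code:

    from enum import Enum, auto

    class Stage(Enum):
        """Lifecycle stages named above: creation through to preservation or disposal."""
        CREATION = auto()
        MANAGEMENT = auto()
        PRESERVATION = auto()
        DISPOSAL = auto()

    # Hypothetical mapping of stages to the kind of system suited to each;
    # the names echo the abstract, but this is not CLIF's actual architecture.
    STAGE_TO_SYSTEM = {
        Stage.CREATION: "authoring/collaboration system (e.g. Sakai or SharePoint)",
        Stage.MANAGEMENT: "digital repository (e.g. Fedora)",
        Stage.PRESERVATION: "preservation store",
        Stage.DISPOSAL: "records disposal process",
    }

    def next_stage(stage: Stage, retain: bool = True) -> Stage:
        """Advance content one lifecycle stage; content that is not retained goes to disposal."""
        if not retain:
            return Stage.DISPOSAL
        transitions = {
            Stage.CREATION: Stage.MANAGEMENT,
            Stage.MANAGEMENT: Stage.PRESERVATION,
            Stage.PRESERVATION: Stage.PRESERVATION,
            Stage.DISPOSAL: Stage.DISPOSAL,
        }
        return transitions[stage]

    # Moving content between stages then means exporting it from the current system
    # and ingesting it into the system responsible for the next stage.
    print(STAGE_TO_SYSTEM[next_stage(Stage.CREATION)])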
DAR: A Modern Institutional Repository with a Scalability Twist
The Digital Assets Repository (DAR) is an institutional repository developed at the Bibliotheca Alexandrina to manage the full lifecycle of a digital asset: its creation and ingestion, its metadata management, its storage and archiving, and the mechanisms necessary for publishing and dissemination. DAR was designed with a focus on integration with different sources of digital objects and metadata, as well as with applications built on top of the repository. As a modern repository, the system architecture demonstrates a modular design relying on best-of-breed components and a flexible content model for digital objects that is based on current standards and relies heavily on RDF triples to define relations. In this paper we present the building blocks of DAR as an example of a modern repository, discussing how the system addresses the challenges an institution faces in consolidating its assets, with a focus on solving scalability issues.
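Since DAR's content model leans heavily on RDF triples to express relations between digital objects, a relation such as a page belonging to a digitised book can be stated in a few triples. A minimal sketch using the Python rdflib library, with an invented namespace, class and identifiers (the abstract does not specify DAR's actual vocabulary):

    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import DCTERMS, RDF

    # Hypothetical namespace, class and identifiers, not taken from DAR itself.
    REPO = Namespace("http://example.org/dar/")

    g = Graph()
    book = REPO["object/book-001"]
    page = REPO["object/page-001-0001"]

    g.add((book, RDF.type, REPO.DigitalObject))
    g.add((book, DCTERMS.title, Literal("Sample digitised book")))
    g.add((page, RDF.type, REPO.DigitalObject))
    g.add((page, DCTERMS.isPartOf, book))   # structural relation expressed as an RDF triple

    print(g.serialize(format="turtle"))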
Building a Community of Curatorial Practice at Penn State: A Case Study
The Penn State University Libraries and Information Technology Services (ITS) collaborated on the development of Curation Architecture Prototype Services (CAPS), a web application for ingest and management of digital objects. CAPS is built atop a prototype service platform providing atomistic curation functions in order to address the current and emerging requirements in the Libraries and ITS for digital curation, defined as “... maintaining and adding value to a trusted body of digital information for future and current use; specifically, the active management and appraisal of data over the entire life cycle” (Pennock, 2006)[7]. Additional key goals for CAPS were application of an agile-style methodology to the development process and an assessment of the resulting tool and stakeholders’ experience in the project. This article focuses in particular on the community-building aspects of CAPS, which emerged from a combination of agile-style approaches and our commitment to engage stakeholders actively throughout the process, from the construction of use cases, to decisions on metadata standards, to ingest and management functionalities of the tool. The ensuing community of curatorial practice effectively set the stage for the next iteration of CAPS, which will be devoted to planning and executing the development of a production-ready, enterprise-quality infrastructure to support publishing and curation services at Penn State.
REDDNET and Digital Preservation in the Open Cloud: Research at Texas Tech University Libraries on Long-Term Archival Storage
In the realm of digital data, vendor-supplied cloud systems still leave the user with responsibility for curation. Some of the very tasks users thought they were delegating to the cloud vendor may remain the users' responsibility after all. For example, cloud vendors most often require that users maintain archival copies. Beyond the better-known vendor cloud model, we examine curation in two other models: in-house clouds, and what we call "open" clouds, which are neither in-house nor vendor clouds. In open clouds, users come aboard as participants or partners, for example by invitation. In open cloud systems users can develop their own software and data management, control access, and purchase their own hardware while running securely in the cloud environment. Doing so still requires working within the rules of the cloud system, but in some open cloud systems those restrictions and limitations can be worked around easily, with surprisingly little loss of freedom. It is in this context that REDDnet (Research and Education Data Depot network) is presented as the place where the Texas Tech University (TTU) Libraries have been conducting research on long-term digital archival storage. The REDDnet network will, by year's end, be at 1.2 petabytes (PB), with an additional 1.4 PB for a related project (Compact Muon Solenoid Heavy Ion [CMS-HI]); additionally there are over 200 TB of tape storage. These numbers exclude any disk space which TTU will be purchasing during the year. National Science Foundation (NSF) funding covering REDDnet and CMS-HI was in excess of $850,000 earmarked toward REDDnet. In the terminology we used above, REDDnet is an open cloud system that invited the TTU Libraries to participate. This means that we run software which fits the REDDnet structure. We are completing the final design of our system and moving into the first stages of construction, and we have decided to move forward and purchase one-half petabyte of disk storage in the initial phase. The concerns, deliberations and testing are presented here, along with our initial approach.
Cloud as Infrastructure at the Texas Digital Library
In this paper, we describe our recent work in using cloud computing to provision digital library services. We consider our original and current motivations, the technical details of our implementation, the path we took, and our future work and lessons learned. We also compare our work with other digital library cloud efforts.
Institutional Repositories, Long Term Preservation and the changing nature of Scholarly Publications
In Europe, over 2.5 million publications from universities and research institutions are stored in institutional repositories. Although institutional repositories make these publications accessible over time, a repository is not tasked with preserving the content for the long term. Some countries have developed an infrastructure dedicated to sustainability; the Netherlands is one of them. The Dutch situation can be regarded as a successful example of how long-term preservation of scholarly publications is organised through an open access environment. This article explains how this infrastructure is structured and discusses some of the preservation issues related to it.
This contribution is based on the long-term preservation studies of Enhanced Publications performed in the FP7 project DRIVER II (2007-2009). The overall conclusion of the DRIVER studies on long-term preservation is that the issues are organisational rather than technical in nature.
The nature of publications in scholarly communication is changing. Enhanced Publications and Collaborative Research Environments are new phenomena in scholarly communication that use the wide range of possibilities of the digital environment in which researchers and their audience act. This rapidly changing digital environment also affects long-term preservation archives. Raising awareness of long-term preservation in the research community is important because researchers are responsible for the public dissemination of their research output and need to understand their role in the life cycle of the digital object. Researchers should be aware that constant curation and preservation actions must be undertaken to keep research results fit for verification, reuse, learning and historical study over time.
Archival description in OAI-ORE
This paper proposes using OAI-ORE as the basis for a new method to represent and manage the description of archival collections. This strategy adapts traditional archival description methods to the contemporary reality of digital collections and takes advantage of the power of OAI-ORE to express a multitude of non-linear relationships, providing richer and more powerful access and description. A schema for representing finding aids in OAI-ORE would facilitate more sophisticated methods for modeling archival collection descriptions.
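As an illustration of the idea, a finding aid could be expressed as an ORE resource map describing an aggregation of archival series and items. The sketch below uses the Python rdflib library and the standard ORE vocabulary; the URIs and the collection structure are invented for the example and are not taken from the paper:

    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import DCTERMS, RDF

    ORE = Namespace("http://www.openarchives.org/ore/terms/")
    # Hypothetical URIs and collection structure, invented for this example.
    EX = Namespace("http://example.org/archives/collection-42/")

    g = Graph()
    rem = EX["resource-map"]      # the resource map that describes the aggregation
    agg = EX["aggregation"]       # the archival collection as an ORE aggregation
    series = EX["series-1"]
    item = EX["series-1/item-3"]

    g.add((rem, RDF.type, ORE.ResourceMap))
    g.add((rem, ORE.describes, agg))
    g.add((agg, RDF.type, ORE.Aggregation))
    g.add((agg, DCTERMS.title, Literal("Papers of an example creator")))
    g.add((agg, ORE.aggregates, series))
    g.add((agg, ORE.aggregates, item))
    g.add((series, DCTERMS.hasPart, item))  # non-linear, cross-cutting relations are equally possible

    print(g.serialize(format="turtle"))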
Diversity and Interoperability of Repositories in a Grid Curation Environment
IT-based research environments with an integrated repository component are increasingly important in research. While grid technologies and their relatives used to draw most of the attention, the e-Infrastructure community is now often looking to the repository and preservation communities to learn from their experiences. After all, trustworthy data management and concepts that foster the agenda for data-intensive research are among the key requirements of researchers from a great variety of disciplines.
The WissGrid project aims to provide cross-disciplinary data curation tools for a grid environment by adapting repository concepts and technologies to the existing D-Grid e-Infrastructure. To achieve this, it combines existing systems including Fedora, iRODS, dCache, JHOVE and others. WissGrid respects the diversity of these systems and aims to improve the interoperability of the interfaces between them.
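Improving interoperability between such heterogeneous systems typically means agreeing on a common interface that each backend implements. The following is a hypothetical Python sketch of such an interface, illustrating the idea only; it does not reflect WissGrid's actual APIs or code:

    from abc import ABC, abstractmethod

    class CurationStore(ABC):
        """Hypothetical common interface over heterogeneous backends (e.g. Fedora, iRODS, dCache)."""

        @abstractmethod
        def put(self, identifier: str, data: bytes, metadata: dict) -> None: ...

        @abstractmethod
        def get(self, identifier: str) -> bytes: ...

        @abstractmethod
        def characterise(self, identifier: str) -> dict:
            """Return format/validity information, e.g. from a characterisation tool such as JHOVE."""

    class InMemoryStore(CurationStore):
        """Toy backend used only to exercise the interface in this sketch."""

        def __init__(self):
            self.objects = {}

        def put(self, identifier, data, metadata):
            self.objects[identifier] = (data, metadata)

        def get(self, identifier):
            return self.objects[identifier][0]

        def characterise(self, identifier):
            return {"size_bytes": len(self.objects[identifier][0])}

    def replicate(identifier: str, source: CurationStore, targets: list) -> None:
        """Copy an object from one backend to several others, a typical curation operation."""
        data = source.get(identifier)
        for target in targets:
            target.put(identifier, data, metadata={"replicated_from": type(source).__name__})

    src, dst = InMemoryStore(), InMemoryStore()
    src.put("obj-1", b"payload", {"format": "text/plain"})
    replicate("obj-1", src, [dst])
    print(dst.characterise("obj-1"))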