10,113 research outputs found
LIBER's involvement in supporting digital preservation in member libraries
Digital curation and preservation represent new challenges for universities. LIBER
has invested considerable effort to engage with the new agendas of digital preservation
and digital curation. Through two successful phases of the LIFE project, LIBER
is breaking new ground in identifying innovative models for costing digital curation
and preservation. Through LIFE’s input into the US-UK Blue Ribbon Task Force on
Sustainable Digital Preservation and Access, LIBER is aligned with major international
work in the economics of digital preservation. In its emerging new strategy and
structures, LIBER will continue to make substantial contributions in this area, mindful
of the needs of European research libraries.
Repository of NSF Funded Publications and Data Sets: "Back of Envelope" 15 year Cost Estimate
In this back-of-envelope study we calculate the 15-year fixed and variable costs of setting up and running a data repository (or database) to store and serve the publications and datasets derived from research funded by the National Science Foundation (NSF). Costs are computed on a yearly basis using a fixed estimate of the number of papers published each year that list NSF as their funding agency. We assume each paper has one dataset and estimate the size of that dataset based on experience. By our estimates, the number of papers generated each year is 64,340. The average dataset size over all seven NSF directorates is 32 gigabytes (GB). The total amount of data added to the repository is two petabytes (PB) per year, or 30 PB over 15 years.
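As a quick check on these figures, here is a minimal sketch of the volume arithmetic (Python; the decimal unit convention of 1 PB = 1,000,000 GB is our assumption):

```python
# Volume arithmetic from the abstract's stated inputs.
PAPERS_PER_YEAR = 64_340      # NSF-funded papers published per year
DATASET_SIZE_GB = 32          # average dataset size, one dataset per paper
YEARS = 15

gb_per_year = PAPERS_PER_YEAR * DATASET_SIZE_GB            # 2,058,880 GB
pb_per_year = gb_per_year / 1_000_000                      # 1 PB = 1e6 GB assumed
print(f"Ingest: {pb_per_year:.2f} PB/year")                # ~2.06 PB/year
print(f"Total after {YEARS} years: {pb_per_year * YEARS:.1f} PB")  # ~30.9 PB
```

Both results round to the abstract's quoted ~2 PB per year and ~30 PB over 15 years.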
The architecture of the data/paper repository is based on a hierarchical storage model that uses a combination of fast disk for rapid access and tape for high-reliability, cost-efficient long-term storage. Data are ingested through workflows of the kind used in university institutional repositories, which add metadata and ensure data integrity. Average fixed costs are approximately $150,000,000 and total costs approximately $167,000,000 over 15 years of operation, curating close to one million datasets and one million papers. After 15 years and 30 PB of data accumulated and curated, we estimate the cost per gigabyte at $4.87. The $167 million cost is a direct cost in that it does not include federally allowable indirect cost return (ICR).
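The disk/tape tiering idea can be illustrated with a toy placement rule; this is a purely hypothetical sketch (the 90-day retention threshold is an invented parameter, not a figure from the paper):

```python
from datetime import datetime, timedelta

# Toy two-tier placement rule: recently accessed data stays on fast disk,
# cold data migrates to tape. The threshold below is hypothetical.
DISK_RETENTION = timedelta(days=90)

def storage_tier(last_accessed: datetime, now: datetime) -> str:
    """Return the tier a dataset should occupy under this toy policy."""
    return "disk" if now - last_accessed <= DISK_RETENTION else "tape"

# Example: a dataset untouched for a year would sit on tape.
print(storage_tier(datetime(2024, 1, 1), datetime(2025, 1, 1)))  # "tape"
```

Real hierarchical storage management systems apply policies of this kind automatically, which is what keeps long-term storage cost-efficient.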
After 15 years, it is reasonable to assume that some datasets will be compressed and rarely accessed. Others may be deemed no longer valuable, e.g., because they have been superseded by more accurate results. Therefore, at some point the data growth in the repository will need to be managed through strategic preservation decisions.
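The per-gigabyte figure can be checked directly from the totals above; a minimal sketch, assuming decimal units and no deletion or compression over the 15 years (the fixed/total cost split is as reconstructed from the abstract):

```python
# Cost-per-gigabyte check against the abstract's figures.
TOTAL_GB = 64_340 * 32 * 15          # ~30.9 million GB, i.e. ~30 PB
FIXED_COST_USD = 150_000_000         # approximate fixed costs (from abstract)
TOTAL_COST_USD = 167_000_000         # total direct cost, excluding ICR

print(f"Fixed cost per GB: ${FIXED_COST_USD / TOTAL_GB:.2f}")   # ~$4.86
print(f"Total cost per GB: ${TOTAL_COST_USD / TOTAL_GB:.2f}")   # ~$5.41
```

The fixed-cost figure of ~$4.86/GB matches the abstract's quoted $4.87 up to rounding of the total volume.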
Digital Image Access & Retrieval
The 33rd Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation, with the bulk of the conference focusing on indexing and retrieval.
D3.2 Cost Concept Model and Gateway Specification
This document introduces a Framework supporting the implementation of a cost concept model against which current and future cost models for curating digital assets can be benchmarked. The value of this cost concept model derives from the 4C project's comprehensive engagement with various user communities, and builds upon our understanding of the requirements, drivers, obstacles and objectives that various stakeholder groups have relating to digital curation. Ultimately, this concept model should provide a critical input to the development and refinement of cost models, as well as helping to ensure that the curation and preservation solutions and services that will inevitably arise from the commercial sector as ‘supply’ respond to a much better understood ‘demand’ for cost-effective and relevant tools. To meet acknowledged gaps in current provision, a nested model of curation which addresses both costs and benefits is provided. The goal of this task was not to create a single, functionally implementable cost modelling application, but rather to design a model based on common concepts and to develop a generic gateway specification that can be used by future model developers, service and solution providers, and researchers in follow-up research and development projects.
The Framework includes:
• A Cost Concept Model—which defines the core concepts that should be included in curation cost models;
• An Implementation Guide—for the cost concept model, providing guidance and proposing questions that should be considered when developing new cost models and refining existing ones;
• A Gateway Specification Template—which provides standard metadata for each of the core cost concepts and is intended for use by future model developers, model users, and service and solution providers to promote interoperability (a hypothetical record in this style is sketched after this abstract);
• A Nested Model for Digital Curation—which visualises the core concepts, demonstrates how they interact, and places them in context by linking them to A Cost and Benefit Model for Curation.
This Framework provides guidance for data collection and associated calculations in an operational context, but will also provide a critical foundation for more strategic thinking around curation, such as the Economic Sustainability Reference Model (ESRM).
Where appropriate, definitions of terms are provided, recommendations are made, and examples from existing models are used to illustrate the principles of the Framework.
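The abstract does not reproduce the specification itself, but a gateway-style metadata record for one cost concept might look like the following sketch (all field names and values are invented for illustration, not taken from the 4C deliverable):

```python
from dataclasses import dataclass, field

@dataclass
class CostConcept:
    """Hypothetical metadata record in the spirit of a gateway specification:
    standard descriptive fields for one core cost concept, so that different
    cost models can be mapped onto a shared vocabulary."""
    identifier: str                 # stable ID for cross-model mapping
    label: str                      # human-readable name
    definition: str                 # agreed definition of the concept
    cost_type: str                  # e.g. "fixed" or "variable"
    activity: str                   # curation activity the cost attaches to
    related_models: list[str] = field(default_factory=list)

# Illustrative instance only; names and values are hypothetical.
storage = CostConcept(
    identifier="cc-storage-001",
    label="Long-term storage",
    definition="Recurring cost of keeping data safely stored over time.",
    cost_type="variable",
    activity="preservation",
    related_models=["LIFE"],        # LIFE is the costing project named above
)
print(storage.label, storage.cost_type)
```

Standardised records of this kind are what would let different cost models be benchmarked against the same core concepts, as the deliverable proposes.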
- …