
    VXA: A Virtual Architecture for Durable Compressed Archives

    Data compression algorithms change frequently, and obsolete decoders do not always run on new hardware and operating systems, threatening the long-term usability of content archived using those algorithms. Re-encoding content into new formats is cumbersome, and highly undesirable when lossy compression is involved. Processor architectures, in contrast, have remained comparatively stable over recent decades. VXA, an archival storage system designed around this observation, archives executable decoders along with the encoded content it stores. VXA decoders run in a specialized virtual machine that implements an OS-independent execution environment based on the standard x86 architecture. The VXA virtual machine strictly limits access to host system services, making decoders safe to run even if an archive contains malicious code. VXA's adoption of a "native" processor architecture instead of type-safe language technology allows reuse of existing "hand-optimized" decoders in C and assembly language, and permits decoders access to performance-enhancing architecture features such as vector processing instructions. The performance cost of VXA's virtualization is typically less than 15% compared with the same decoders running natively. The storage cost of archived decoders, typically 30-130 KB each, can be amortized across many archived files sharing the same compression method.
    Comment: 14 pages, 7 figures, 2 tables
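
    A minimal sketch of the archive layout the abstract describes: each archived file references a decoder stored once per compression method, so the 30-130 KB decoder cost is amortized across many entries. All names here (VXAArchive, run_sandboxed, and so on) are hypothetical; the real system executes x86 decoder binaries inside a restricted virtual machine, not Python callables.

```python
from dataclasses import dataclass, field

@dataclass
class Decoder:
    """Executable decoder image archived alongside the content (hypothetical model)."""
    method: str     # e.g. "jpeg" or "vorbis"
    image: bytes    # decoder binary, roughly 30-130 KB in the paper's measurements

@dataclass
class ArchiveEntry:
    name: str
    payload: bytes  # compressed bit stream
    method: str     # names the decoder needed to read the payload back

@dataclass
class VXAArchive:
    decoders: dict = field(default_factory=dict)  # method -> Decoder, stored once
    entries: list = field(default_factory=list)

    def add(self, entry: ArchiveEntry, decoder: Decoder) -> None:
        # The decoder is stored only once per compression method, so its
        # size is shared by every entry that uses that method.
        self.decoders.setdefault(decoder.method, decoder)
        self.entries.append(entry)

    def extract(self, name: str, run_sandboxed) -> bytes:
        """Decode an entry by running its archived decoder in a sandbox.

        `run_sandboxed(image, payload)` stands in for the VXA virtual
        machine: an OS-independent x86 environment with no access to
        host system services.
        """
        entry = next(e for e in self.entries if e.name == name)
        decoder = self.decoders[entry.method]
        return run_sandboxed(decoder.image, entry.payload)
```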

    SEARS: Space Efficient And Reliable Storage System in the Cloud

    Today's cloud storage services must offer storage reliability and fast data retrieval for large amounts of data without sacrificing storage cost. We present SEARS, a cloud-based storage system which integrates erasure coding and data deduplication to support efficient and reliable data storage with fast user response time. With proper association of data to storage server clusters, SEARS provides flexible mixing of different configurations, suitable for both real-time and archival applications. Our prototype implementation of SEARS over Amazon EC2 shows that it outperforms existing storage systems in storage efficiency and file retrieval time. For 3 MB files, SEARS delivers a retrieval time of 2.5 s, compared to 7 s with existing systems.
    Comment: 4 pages, IEEE LCN 201
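
    A rough sketch of the two techniques the abstract combines: deduplication via content-addressed chunks, and erasure coding of each unique chunk. The fixed-size chunking, the single XOR parity shard, and all names are illustrative assumptions, not SEARS's actual coding scheme or API.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # illustrative fixed-size chunking

def dedup_chunks(data: bytes, store: dict) -> list:
    """Split data into chunks and store each unique chunk once, keyed by its hash."""
    refs = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        key = hashlib.sha256(chunk).hexdigest()
        store.setdefault(key, chunk)  # a duplicate chunk costs only a reference
        refs.append(key)
    return refs

def erasure_encode(chunk: bytes, k: int = 4) -> list:
    """Toy erasure code: k data shards plus one XOR parity shard.

    A production system would use a code such as Reed-Solomon, which can
    tolerate several shard losses; a single XOR parity survives only one
    loss, but it shows the storage/reliability trade-off.
    """
    shard_len = -(-len(chunk) // k)  # ceiling division
    shards = [chunk[i * shard_len:(i + 1) * shard_len].ljust(shard_len, b"\0")
              for i in range(k)]
    parity = bytearray(shard_len)
    for shard in shards:
        for j, b in enumerate(shard):
            parity[j] ^= b
    return shards + [bytes(parity)]

if __name__ == "__main__":
    store = {}
    refs = dedup_chunks(b"example payload " * 1000, store)
    shards = {key: erasure_encode(chunk) for key, chunk in store.items()}
    print(len(refs), "chunk references,", len(shards), "unique chunks encoded")
```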

    Critique of Architectures for Long-Term Digital Preservation

    Evolving technology and fading human memory threaten the long-term intelligibility of many kinds of documents. Furthermore, some records are susceptible to improper alterations that make them untrustworthy. Trusted Digital Repositories (TDRs) and Trustworthy Digital Objects (TDOs) seem to be the only broadly applicable digital preservation methodologies proposed. We argue that the TDR approach has shortfalls as a method for long-term digital preservation of sensitive information. Comparison of the TDR and TDO methodologies suggests differentiating near-term preservation measures from what is needed for the long term. TDO methodology addresses these needs, providing for making digital documents durably intelligible. It uses EDP standards for a few file formats and XML structures for text documents. For other information formats, intelligibility is assured by using a virtual computer. To protect sensitive information (content whose inappropriate alteration might mislead its readers), the integrity and authenticity of each TDO is made testable by embedded public-key cryptographic message digests and signatures. Key authenticity is protected recursively in a social hierarchy. The proper focus for long-term preservation technology is signed packages that each combine a record collection with its metadata and that also bind context: Trustworthy Digital Objects.
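
    A minimal sketch of the embedded-digest idea the abstract attributes to TDOs: the package carries a cryptographic digest of its own content so integrity is testable offline. The packaging layout and field names are assumptions; the actual methodology additionally signs the digests with public keys whose authenticity is certified recursively.

```python
import hashlib
import json

def build_package(record: bytes, metadata: dict) -> dict:
    """Bundle a record with its metadata and an embedded content digest."""
    return {
        "metadata": metadata,
        "record_sha256": hashlib.sha256(record).hexdigest(),
        # A real TDO would also embed a public-key signature over the
        # digests, with the signing key itself certified by other keys
        # (the "social hierarchy" mentioned in the abstract).
    }

def verify_package(record: bytes, package: dict) -> bool:
    """Recompute the digest and compare it with the embedded one."""
    return hashlib.sha256(record).hexdigest() == package["record_sha256"]

if __name__ == "__main__":
    body = b"archived record bytes"
    pkg = build_package(body, {"title": "example record"})
    print(json.dumps(pkg, indent=2))
    assert verify_package(body, pkg)
```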

    A Guide to Distributed Digital Preservation

    This volume is devoted to the broad topic of distributed digital preservation, a still-emerging field of practice for the cultural memory arena. Replication and distribution hold out the promise of indefinite preservation of materials without degradation, but establishing effective organizational and technical processes to enable this form of digital preservation is daunting. Institutions need practical examples of how this task can be accomplished in manageable, low-cost ways. --P. [4] of cover

    Addressing the tacit knowledge of a digital library system

    Recent surveys about Linked Data initiatives in library organizations report the experimental nature of the related projects and the difficulty of re-using the data to improve library services. This paper presents an approach for managing data together with its "tacit" organizational knowledge, that is, the originating data context, to improve the interpretation of the data's meaning. By analyzing a Digital Library system, we prototyped a method for turning data management into "semantic data management", in which local system knowledge is managed as data and natively foreseen as Linked Data. Semantic data management aims to curate consumers' correct understanding of Linked Datasets, driving proper re-use.

    Expressing the tacit knowledge of a digital library system as linked data

    Library organizations have enthusiastically undertaken semantic web initiatives, in particular publishing data as linked data. Nevertheless, several surveys report the experimental nature of these initiatives and the difficulty consumers face in re-using the data. These barriers hinder the use of linked datasets as an infrastructure that enhances the library and related information services. This paper presents an approach for encoding, as a Linked Vocabulary, the "tacit" knowledge of the information system that manages the data source. The objective is to improve the interpretation of the meaning of published linked datasets. We analyzed a digital library system as a case study for prototyping the "semantic data management" method, in which data and its knowledge are natively managed together, taking the linked data pillars into account. The ultimate objective of semantic data management is to curate consumers' correct interpretation of the data and to facilitate its proper re-use. The prototype defines the ontological entities representing the knowledge of the digital library system that is stored neither in the data source nor in the existing ontologies related to the system's semantics. We therefore present the local ontology and its matching with the existing ontologies Preservation Metadata Implementation Strategies (PREMIS) and Metadata Object Description Schema (MODS), and we discuss linked data triples prototyped from the legacy relational database using the local ontology. We show how semantic data management can deal with inconsistencies in the system data, and we conclude that a specific change in the system developers' mindset is necessary for extracting and "codifying" the tacit knowledge needed to improve the data interpretation process.
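
    A small sketch, in the spirit of the two abstracts above, of turning a legacy relational row into linked data triples that mix a local "tacit knowledge" ontology with an established vocabulary. The table layout, namespace URIs, class, and property names are invented for illustration and are not the papers' actual ontology or their PREMIS/MODS mappings; the example uses the rdflib library.

```python
from rdflib import Graph, Literal, Namespace, RDF
from rdflib.namespace import DCTERMS

# Hypothetical namespaces: a local ontology capturing "tacit" system
# knowledge, plus a base URI for the published resources.
LOCAL = Namespace("http://example.org/dls-ontology#")
RES = Namespace("http://example.org/resource/")

# A row as it might come out of the legacy relational database (invented).
row = {"id": "doc42", "title": "Annual report 1998", "ingest_batch": "BATCH-2001-07"}

g = Graph()
g.bind("local", LOCAL)
g.bind("dcterms", DCTERMS)

subject = RES[row["id"]]
g.add((subject, RDF.type, LOCAL.DigitizedDocument))
g.add((subject, DCTERMS.title, Literal(row["title"])))
# The ingest batch is "tacit" operational context that normally lives only
# in the system's workflow; the local ontology makes it explicit and linkable.
g.add((subject, LOCAL.ingestBatch, Literal(row["ingest_batch"])))

print(g.serialize(format="turtle"))
```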

    Recording, Documentation, and Information Management for the Conservation of Heritage Places: Guiding Principles

    Provides guidance on integrating recording, documentation, and information management of territories, sites, groups of buildings, or monuments into the conservation process; evaluating proposals; consulting specialists; and controlling implementation.

    Digitally Archiving Architectural Models and Exhibition Designs: The Case of an Art Museum

    [Excerpt] In 2013, a medium-sized art museum located in the Northeast United States received a grant to plan for an electronic records repository. This museum will be referred to here as USAM for brevity. The first major task of the electronic records consultant on this project was to research and inventory the electronic records being created and already existing at the museum, which necessitated scans of network storage, focus groups with departmental staff, and investigations of media included in the physical archives. This research process turned up the expected document types, such as image files, word-processed documents, and spreadsheets. Although documents of these types were indeed plentiful, an extensive quantity of digitally produced two-dimensional (2D) drawings and three-dimensional (3D) models was also found. Specifically, over 37,000 CAD drawings were unearthed during a network storage inventory project, as well as over 6,000 3D models. These files originate primarily in VectorWorks (and its predecessor MiniCAD), AutoCAD, and Rhinoceros. Given the quantity of digitally produced models and drawings existing at USAM, and the need to plan for an electronic records repository, this project is motivated by the following question: By what methods can two-dimensional (2D) CAD drawings and three-dimensional (3D) models be digitally archived for long-term preservation and access? To answer this question, a review of the relevant literature is first presented, which explores the methods that have been developed for archiving architectural models and exhibition designs. Second, the study methods are presented, which include more detail on the context as well as the archiving tests that were conducted. The paper concludes with results and conclusions regarding how architectural models and exhibition designs are archived at USAM.
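
    As a practical aside to the inventory step described in the excerpt, here is a sketch of the kind of network-storage scan that could produce such counts: walk a directory tree, pick out CAD and 3D model files by extension, and record size and fixity in a manifest. The extension list, paths, and manifest fields are assumptions; the actual USAM inventory procedure is not specified in the excerpt.

```python
import csv
import hashlib
from pathlib import Path

# Extensions for the formats named in the excerpt (VectorWorks/MiniCAD,
# AutoCAD, Rhinoceros); the exact list is an assumption.
CAD_EXTENSIONS = {".vwx", ".mcd", ".dwg", ".dxf", ".3dm"}

def sha256_of(path: Path) -> str:
    """Compute a SHA-256 fixity value without loading the whole file."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return h.hexdigest()

def inventory(root: str, manifest_csv: str) -> None:
    """Scan a storage tree and write a fixity manifest for CAD/3D model files."""
    with open(manifest_csv, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["path", "extension", "size_bytes", "sha256"])
        for path in Path(root).rglob("*"):
            if path.is_file() and path.suffix.lower() in CAD_EXTENSIONS:
                writer.writerow([str(path), path.suffix.lower(),
                                 path.stat().st_size, sha256_of(path)])

if __name__ == "__main__":
    inventory("/mnt/network-share", "cad_inventory.csv")  # hypothetical paths
```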