
    BlogForever: D3.1 Preservation Strategy Report

    This report describes preservation planning approaches and strategies recommended by the BlogForever project as a core component of a weblog repository design. More specifically, we start by discussing why we would want to preserve weblogs in the first place and what exactly it is that we are trying to preserve. We further present a review of past and present work and highlight why current practices in web archiving do not adequately address the needs of weblog preservation. We make three distinctive contributions in this volume: a) we propose transferable practical workflows for applying a combination of established metadata and repository standards in developing a weblog repository, b) we provide an automated, community-based approach to identifying significant properties of weblog content and discuss how this affects previous strategies, and c) we propose a sustainability plan that draws upon community knowledge through innovative repository design.

    Extraction of Projection Profile, Run-Histogram and Entropy Features Straight from Run-Length Compressed Text-Documents

    Document Image Analysis, like any Digital Image Analysis, requires identification and extraction of proper features, which are generally extracted from uncompressed images, though in reality images are made available in compressed form for reasons such as transmission and storage efficiency. However, this implies that the compressed image must first be decompressed, which demands additional computing resources. This limitation motivates research into extracting features directly from the compressed image. In this research, we propose to extract essential features such as projection profile, run-histogram and entropy for text document analysis directly from run-length compressed text-documents. The experimentation illustrates that features are extracted directly from the compressed image without going through the stage of decompression, which reduces the computing time. The feature values so extracted are exactly identical to those extracted from uncompressed images. (Published by IEEE in Proceedings of ACPR-2013. arXiv admin note: text overlap with arXiv:1403.778)
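    The core idea can be sketched briefly. Assuming each document row is stored as alternating white/black run lengths starting with a white run (a common RLE convention; the function and variable names here are illustrative, not from the paper), the projection profile and run-histogram fall out of simple sums over the run data, with no decompression to a pixel array:

    ```python
    from collections import Counter

    def profile_and_histogram(rle_rows):
        """Compute the horizontal projection profile (black pixels per row)
        and the black-run-length histogram directly from run-length data.

        Each row is a list of run lengths alternating white/black,
        beginning with a white run (assumed convention).
        """
        profile = []
        histogram = Counter()
        for runs in rle_rows:
            black_runs = runs[1::2]          # odd positions hold black runs
            profile.append(sum(black_runs))  # projection = total black pixels
            histogram.update(black_runs)     # run-histogram over black runs
        return profile, histogram
    ```

    Because only additions over the stored run values are needed, the cost scales with the number of runs rather than the number of pixels, which is where the reported time savings come from.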

    A Comparative Study on Compression of Different Image File Formats

    Advances in imaging technology and computer communications have provided users with a variety of new services that use images, including video conferencing, videophones, multimedia systems and High Definition television. To fully utilize such high-tech communication systems, image compression techniques play an important role in the transmission and storage of information. In this project, an image compression format and algorithm have been analysed. A system has been created for users to convert or display image file formats. The ScanJet IIc scanner was used to scan the image. The image was encoded into BMP, TIFF and GIF files, then decoded and displayed on the screen. Some of the experiments were done twice with two different types of images to ensure that the results were accurate. The algorithms used to compress and decompress the images were the Run Length Encoding (RLE) algorithm and the Lempel-Ziv-Welch (LZW) algorithm. One system was developed for users to convert the image file format, to enable them to view the size of each image and to display the image. From the study, it can be concluded that the LZW algorithm is better than the RLE algorithm in terms of percentage compression. Besides that, the quality of image that can be produced by the LZW algorithm and the RLE algorithm is almost the same.
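    For illustration, a minimal byte-oriented RLE codec and the percentage-compression metric the study reports can be sketched as follows. This is a generic sketch, not the project's actual implementation:

    ```python
    def rle_encode(data: bytes) -> bytes:
        """Byte-oriented run-length encoding: emit (count, value) pairs.
        Counts are capped at 255 so each pair fits in two bytes."""
        out = bytearray()
        i = 0
        while i < len(data):
            run = 1
            while i + run < len(data) and data[i + run] == data[i] and run < 255:
                run += 1
            out += bytes([run, data[i]])
            i += run
        return bytes(out)

    def rle_decode(data: bytes) -> bytes:
        """Invert rle_encode: expand each (count, value) pair."""
        out = bytearray()
        for i in range(0, len(data), 2):
            out += bytes([data[i + 1]]) * data[i]
        return bytes(out)

    def compression_pct(original: bytes, compressed: bytes) -> float:
        """Percentage compression, i.e. the fraction of space saved."""
        return 100.0 * (1 - len(compressed) / len(original))
    ```

    On highly repetitive data RLE saves most of the space, but on data with few repeated bytes the (count, value) pairs can double the size; dictionary coders such as LZW exploit repeated patterns rather than only literal runs, which is consistent with the study's finding that LZW achieves better percentage compression on typical images.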

    Survey of Federal, National, and International standards applicable to the NASA applications data services

    An applications data service (ADS) was developed to meet the challenges of data access and integration. The ADS provides a common service to locate and access applications data electronically and to integrate the cross-correlative data sets required by multiple users. Its catalog and network services increase data visibility and provide the data more rapidly and in a more usable form.

    DRIVER Technology Watch Report

    This report is part of the Discovery Workpackage (WP4) and is the third report out of four deliverables. The objective of this report is to give an overview of the latest technical developments in the world of digital repositories, digital libraries and beyond, in order to serve as theoretical and practical input for the technical DRIVER developments, especially those focused on enhanced publications. This report consists of two main parts: one part focuses on interoperability standards for enhanced publications; the other consists of three subchapters, which give a landscape picture of current and emerging technologies and communities crucial to DRIVER. These three subchapters cover the GRID, CRIS and LTP communities and technologies. Every chapter contains a theoretical explanation, followed by case studies and the outcomes and opportunities for DRIVER in this field.

    Lifecycle information for e-literature: full report from the LIFE project

    This Report is a record of the LIFE Project. The Project ran for one year and its aim is to deliver crucial information about the cost and management of digital material. This information should in turn be applicable to any institution with an interest in preserving and providing access to electronic collections. The Project is a joint venture between The British Library and UCL Library Services. The Project is funded by JISC under programme area (i), Institutional Management Support and Collaboration, as listed in paragraph 16 of the JISC 4/04 circular, and as such has set requirements and outcomes which the Project has done its best to meet. Where the Project has been unable to answer specific questions, strong recommendations have been made for future Project work to do so. The outcomes of this Project are expected to be a practical set of guidelines and a framework within which costs can be applied to digital collections in order to answer the following questions:
    • What is the long-term cost of preserving digital material?
    • Who is going to do it?
    • What are the long-term costs for a library in HE/FE to partner with another institution to carry out long-term archiving?
    • What are the comparative long-term costs of a paper and a digital copy of the same publication?
    • At what point will there be sufficient confidence in the stability and maturity of digital preservation to switch from paper for publications available in parallel formats?
    • What are the relative risks of digital versus paper archiving?
    The Project has attempted to answer these questions by using a developing lifecycle methodology and three diverse collections of digital content. The LIFE Project team chose UCL e-journals, BL Web Archiving and the BL VDEP digital collections to provide a strong challenge to the methodology as well as to help reach the key Project aim of attributing long-term cost to digital collections. The results from the Case Studies and the Project findings are both surprising and illuminating.