27,248 research outputs found

    JISC Preservation of Web Resources (PoWR) Handbook

    Get PDF
    Handbook of Web Preservation produced by the JISC-PoWR project which ran from April to November 2008. The handbook specifically addresses digital preservation issues that are relevant to the UK HE/FE web management community”. The project was undertaken jointly by UKOLN at the University of Bath and ULCC Digital Archives department

    The IPAC Image Subtraction and Discovery Pipeline for the intermediate Palomar Transient Factory

    Get PDF
    We describe the near real-time transient-source discovery engine for the intermediate Palomar Transient Factory (iPTF), currently in operations at the Infrared Processing and Analysis Center (IPAC), Caltech. We coin this system the IPAC/iPTF Discovery Engine (or IDE). We review the algorithms used for PSF-matching, image subtraction, detection, photometry, and machine-learned (ML) vetting of extracted transient candidates. We also review the performance of our ML classifier. For a limiting signal-to-noise ratio of 4 in relatively unconfused regions, "bogus" candidates from processing artifacts and imperfect image subtractions outnumber real transients by ~ 10:1. This can be considerably higher for image data with inaccurate astrometric and/or PSF-matching solutions. Despite this occasionally high contamination rate, the ML classifier is able to identify real transients with an efficiency (or completeness) of ~ 97% for a maximum tolerable false-positive rate of 1% when classifying raw candidates. All subtraction-image metrics, source features, ML probability-based real-bogus scores, contextual metadata from other surveys, and possible associations with known Solar System objects are stored in a relational database for retrieval by the various science working groups. We review our efforts in mitigating false-positives and our experience in optimizing the overall system in response to the multitude of science projects underway with iPTF.Comment: 66 pages, 21 figures, 7 tables, accepted by PAS

    BlogForever D2.4: Weblog spider prototype and associated methodology

    Get PDF
    The purpose of this document is to present the evaluation of different solutions for capturing blogs, established methodology and to describe the developed blog spider prototype

    Multimedia information technology and the annotation of video

    Get PDF
    The state of the art in multimedia information technology has not progressed to the point where a single solution is available to meet all reasonable needs of documentalists and users of video archives. In general, we do not have an optimistic view of the usability of new technology in this domain, but digitization and digital power can be expected to cause a small revolution in the area of video archiving. The volume of data leads to two views of the future: on the pessimistic side, overload of data will cause lack of annotation capacity, and on the optimistic side, there will be enough data from which to learn selected concepts that can be deployed to support automatic annotation. At the threshold of this interesting era, we make an attempt to describe the state of the art in technology. We sample the progress in text, sound, and image processing, as well as in machine learning

    Bots, Seeds and People: Web Archives as Infrastructure

    Full text link
    The field of web archiving provides a unique mix of human and automated agents collaborating to achieve the preservation of the web. Centuries old theories of archival appraisal are being transplanted into the sociotechnical environment of the World Wide Web with varying degrees of success. The work of the archivist and bots in contact with the material of the web present a distinctive and understudied CSCW shaped problem. To investigate this space we conducted semi-structured interviews with archivists and technologists who were directly involved in the selection of content from the web for archives. These semi-structured interviews identified thematic areas that inform the appraisal process in web archives, some of which are encoded in heuristics and algorithms. Making the infrastructure of web archives legible to the archivist, the automated agents and the future researcher is presented as a challenge to the CSCW and archival community

    DRIVER Technology Watch Report

    Get PDF
    This report is part of the Discovery Workpackage (WP4) and is the third report out of four deliverables. The objective of this report is to give an overview of the latest technical developments in the world of digital repositories, digital libraries and beyond, in order to serve as theoretical and practical input for the technical DRIVER developments, especially those focused on enhanced publications. This report consists of two main parts, one part focuses on interoperability standards for enhanced publications, the other part consists of three subchapters, which give a landscape picture of current and surfacing technologies and communities crucial to DRIVER. These three subchapters contain the GRID, CRIS and LTP communities and technologies. Every chapter contains a theoretical explanation, followed by case studies and the outcomes and opportunities for DRIVER in this field

    Working with Legacy Media: A Lone Arranger\u27s First Steps

    Get PDF
    [Excerpt] In 2013, a naked hard drive from Fiji arriving in my small religious archives (an equivalent full-time staff of 2.5 – one archivist and two archives’ assistants) started me off on the path of digital preservation and, in particular, the digital forensics practices that are beneficial for archivists. With such a small staff, outsourced IT services, and no digital preservation policy in sight, it was time to start exploring how institutions of my size could manage legacy media and start planning for the born-digital archives that will continue to arrive. Since I hold a part-time position, I was able to undertake this exploration in my own time through the support provided by a scholarship from the Ian McLean Wards Memorial Trust in 2015

    Requirements for migration of NSSD code systems from LTSS to NLTSS

    Get PDF
    The purpose of this document is to address the requirements necessary for a successful conversion of the Nuclear Design (ND) application code systems to the NLTSS environment. The ND application code system community can be characterized as large-scale scientific computation carried out on supercomputers. NLTSS is a distributed operating system being developed at LLNL to replace the LTSS system currently in use. The implications of change are examined including a description of the computational environment and users in ND. The discussion then turns to requirements, first in a general way, followed by specific requirements, including a proposal for managing the transition
    • …
    corecore