JISC Preservation of Web Resources (PoWR) Handbook
Handbook of Web Preservation produced by the JISC-PoWR project which ran from April to November 2008.
The handbook specifically addresses digital preservation issues that are relevant to the UK HE/FE web management community.
The project was undertaken jointly by UKOLN at the University of Bath and the ULCC Digital Archives department.
The IPAC Image Subtraction and Discovery Pipeline for the intermediate Palomar Transient Factory
We describe the near real-time transient-source discovery engine for the intermediate Palomar Transient Factory (iPTF), currently in operations at the Infrared Processing and Analysis Center (IPAC), Caltech. We coin this system the IPAC/iPTF Discovery Engine (or IDE). We review the algorithms used for PSF-matching, image subtraction, detection, photometry, and machine-learned (ML) vetting of extracted transient candidates. We also review the performance of our ML classifier. For a limiting signal-to-noise ratio of 4 in relatively unconfused regions, "bogus" candidates from processing artifacts and imperfect image subtractions outnumber real transients by ~10:1. This ratio can be considerably higher for image data with inaccurate astrometric and/or PSF-matching solutions. Despite this occasionally high contamination rate, the ML classifier is able to identify real transients with an efficiency (or completeness) of ~97% for a maximum tolerable false-positive rate of 1% when classifying raw candidates. All subtraction-image metrics, source features, ML probability-based real-bogus scores, contextual metadata from other surveys, and possible associations with known Solar System objects are stored in a relational database for retrieval by the various science working groups. We review our efforts in mitigating false positives and our experience in optimizing the overall system in response to the multitude of science projects underway with iPTF. (Comment: 66 pages, 21 figures, 7 tables, accepted by PASP)
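As a rough illustration of the real-bogus vetting step described above, the sketch below (Python with scikit-learn) picks a classifier score threshold that keeps the false-positive rate at or below 1% and reports the resulting efficiency. The score arrays are synthetic placeholders and the operating point is taken from the abstract; this is not the IDE's actual implementation.

```python
import numpy as np
from sklearn.metrics import roc_curve

# Hypothetical ML "real-bogus" scores for a labelled validation set:
# y_true = 1 for real transients, 0 for bogus artifacts (illustrative only).
rng = np.random.default_rng(42)
y_true = np.concatenate([np.ones(500), np.zeros(5000)])      # ~10:1 bogus:real
scores = np.concatenate([rng.beta(8, 2, 500),                # real -> high scores
                         rng.beta(2, 8, 5000)])              # bogus -> low scores

# The ROC curve gives false-positive rate (fpr) and efficiency (tpr) per threshold.
fpr, tpr, thresholds = roc_curve(y_true, scores)

# Choose the most permissive score threshold whose false-positive rate stays <= 1%.
ok = fpr <= 0.01
threshold = thresholds[ok][-1]
efficiency = tpr[ok][-1]

print(f"score threshold: {threshold:.3f}")
print(f"efficiency (completeness) at <= 1% FPR: {efficiency:.3f}")
```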
BlogForever D2.4: Weblog spider prototype and associated methodology
The purpose of this document is to present the evaluation of different solutions for capturing blogs, to set out the established methodology, and to describe the developed blog spider prototype. An illustrative capture sketch follows below.
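Purely as an illustration of the kind of blog capture such a spider performs (not the BlogForever prototype itself), the sketch below uses Python with the third-party requests and feedparser libraries, both assumed to be installed, to fetch a blog's RSS/Atom feed and save each linked post locally. The feed URL and output directory are hypothetical.

```python
import os
import hashlib
import requests       # third-party: pip install requests
import feedparser     # third-party: pip install feedparser

FEED_URL = "https://example.org/blog/feed"   # hypothetical blog feed
OUT_DIR = "captured_posts"

os.makedirs(OUT_DIR, exist_ok=True)
feed = feedparser.parse(FEED_URL)

for entry in feed.entries:
    # Fetch the full post page referenced by the feed entry.
    response = requests.get(entry.link, timeout=30)
    response.raise_for_status()

    # Name the capture by a hash of the post URL so re-runs overwrite cleanly.
    name = hashlib.sha256(entry.link.encode("utf-8")).hexdigest()[:16]
    with open(os.path.join(OUT_DIR, f"{name}.html"), "wb") as fh:
        fh.write(response.content)

    print(f"captured {entry.link} ({len(response.content)} bytes)")
```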
Multimedia information technology and the annotation of video
The state of the art in multimedia information technology has not progressed to the point where a single solution is available to meet all reasonable needs of documentalists and users of video archives. In general, we do not have an optimistic view of the usability of new technology in this domain, but digitization and digital power can be expected to cause a small revolution in the area of video archiving. The volume of data leads to two views of the future: on the pessimistic side, the overload of data will outstrip annotation capacity; on the optimistic side, there will be enough data from which to learn selected concepts that can be deployed to support automatic annotation. At the threshold of this interesting era, we make an attempt to describe the state of the art in technology. We sample the progress in text, sound, and image processing, as well as in machine learning.
Bots, Seeds and People: Web Archives as Infrastructure
The field of web archiving provides a unique mix of human and automated agents collaborating to achieve the preservation of the web. Centuries-old theories of archival appraisal are being transplanted into the sociotechnical environment of the World Wide Web with varying degrees of success. The work of the archivist and bots in contact with the material of the web presents a distinctive and understudied CSCW-shaped problem. To investigate this space we conducted semi-structured interviews with archivists and technologists who were directly involved in the selection of content from the web for archives. These semi-structured interviews identified thematic areas that inform the appraisal process in web archives, some of which are encoded in heuristics and algorithms. Making the infrastructure of web archives legible to the archivist, the automated agents and the future researcher is presented as a challenge to the CSCW and archival community.
Interpretable classification of Alzheimer's disease pathologies with a convolutional neural network pipeline.
Neuropathologists assess vast brain areas to identify diverse and subtly differentiated morphologies. Standard semi-quantitative scoring approaches, however, are coarse-grained and lack precise neuroanatomic localization. We report a proof-of-concept deep learning pipeline that identifies specific neuropathologies (amyloid plaques and cerebral amyloid angiopathy) in immunohistochemically stained archival slides. Using automated segmentation of stained objects and a cloud-based interface, we annotate > 70,000 plaque candidates from 43 whole slide images (WSIs) to train and evaluate convolutional neural networks. Networks achieve strong plaque classification on a 10-WSI hold-out set (0.993 and 0.743 areas under the receiver operating characteristic and precision-recall curves, respectively). Prediction confidence maps visualize morphology distributions at high resolution. Resulting network-derived amyloid beta (Aβ)-burden scores correlate well with established semi-quantitative scores on a 30-WSI blinded hold-out. Finally, saliency mapping demonstrates that networks learn patterns agreeing with accepted pathologic features. This scalable means to augment a neuropathologist's ability suggests a route to neuropathologic deep phenotyping.
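For readers unfamiliar with the two hold-out metrics quoted above, the minimal sketch below (Python with scikit-learn) shows how areas under the receiver operating characteristic and precision-recall curves are typically computed for a binary plaque classifier. The label and probability arrays are placeholders for illustration, not the study's data, and average precision is used here as the usual scikit-learn summary of the precision-recall curve.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

# Placeholder hold-out labels and predicted plaque probabilities
# (illustrative values only, not the study's data).
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0, 0, 1])
y_prob = np.array([0.92, 0.10, 0.85, 0.70, 0.30, 0.05, 0.60, 0.40, 0.15, 0.95])

auroc = roc_auc_score(y_true, y_prob)            # area under the ROC curve
auprc = average_precision_score(y_true, y_prob)  # area under the precision-recall curve

print(f"AUROC: {auroc:.3f}")
print(f"AUPRC: {auprc:.3f}")
```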
DRIVER Technology Watch Report
This report is part of the Discovery Workpackage (WP4) and is the third report out of four deliverables. The objective of this report is to give an overview of the latest technical developments in the world of digital repositories, digital libraries and beyond, in order to serve as theoretical and practical input for the technical DRIVER developments, especially those focused on enhanced publications. This report consists of two main parts: one part focuses on interoperability standards for enhanced publications, the other part consists of three subchapters, which give a landscape picture of current and emerging technologies and communities crucial to DRIVER. These three subchapters cover the GRID, CRIS and LTP communities and technologies. Every chapter contains a theoretical explanation, followed by case studies and the outcomes and opportunities for DRIVER in this field.
Working with Legacy Media: A Lone Arranger's First Steps
[Excerpt] In 2013, a naked hard drive from Fiji arriving in my small religious archives (an equivalent full-time staff of 2.5 – one archivist and two archives assistants) started me off on the path of digital preservation and, in particular, the digital forensics practices that are beneficial for archivists. With such a small staff, outsourced IT services, and no digital preservation policy in sight, it was time to start exploring how institutions of my size could manage legacy media and start planning for the born-digital archives that will continue to arrive. Since I hold a part-time position, I was able to undertake this exploration in my own time through the support provided by a scholarship from the Ian McLean Wards Memorial Trust in 2015.
Requirements for migration of NSSD code systems from LTSS to NLTSS
The purpose of this document is to address the requirements necessary for a successful conversion of the Nuclear Design (ND) application code systems to the NLTSS environment. The work of the ND application code system community can be characterized as large-scale scientific computation carried out on supercomputers. NLTSS is a distributed operating system being developed at LLNL to replace the LTSS system currently in use. The implications of change are examined, including a description of the computational environment and users in ND. The discussion then turns to requirements, first in a general way, followed by specific requirements, including a proposal for managing the transition.