2,402 research outputs found
Digital Preservation Services : State of the Art Analysis
Research report funded by the DC-NET project. An overview of the state of the art in service provision for digital preservation and curation. Its focus is on the areas where the gaps between e-Infrastructures and efficient, forward-looking digital preservation services need to be bridged. Based on a desktop study and a rapid analysis of some 190 currently available tools and services for digital preservation, the deliverable provides a high-level view of the range of instruments currently on offer to support various functions within a preservation system. European Commission, FP7. Peer-reviewed.
Transfer and Inventory Components of Developing Repository Services
4th International Conference on Open Repositories. This presentation was part of the session: Conference Presentations. Date: 2009-05-19, 10:00 AM - 11:30 AM. At the Library of Congress, our most basic data management needs are not surprising: How do we know what we have, where it is, and who it belongs to? How do we get files (new and legacy) from where they are to where they need to be? And how do we record and track events in the life cycle of our files? This presentation describes current work at the Library in implementing tools to meet these needs as a set of modular services (Transfer, Transport, and Inventory) that will fit into a larger scheme of repository services to be developed. These modular services do not equate to everything needed to call a system a repository. But they do cover many aspects of "ingest" and "archiving": the registry of a deposit activity, the controlled transfer and transport of files, and an inventory system that can be used to track files, record events in those files' life cycles, and provide basic file-level discovery and auditing. This is the first stage in the development of a suite of tools to help the Library ensure long-term stewardship of its digital assets.
Closing the loop: assisting archival appraisal and information retrieval in one sweep
In this article, we examine the similarities between the concept of appraisal, a process that takes place within the archives, and the concept of relevance judgement, a process fundamental to the evaluation of information retrieval systems. More specifically, we revisit selection criteria proposed as a result of archival research and work within the digital curation communities, and compare them to relevance criteria as discussed within information retrieval's literature-based discovery. We illustrate how closely these criteria relate to each other and discuss how understanding the relationships between these disciplines could form a basis for proposing automated selection for archival processes and initiating multi-objective learning with respect to information retrieval.
Detecting Family Resemblance: Automated Genre Classification.
This paper presents results in automated genre classification of digital documents in PDF format. It describes genre classification as an important ingredient in contextualising scientific data and in retrieving targeted material for improving research. The current paper compares the role of visual layout, stylistic features and language model features in clustering documents and presents results in retrieving five selected genres (Scientific Article, Thesis, Periodicals, Business Report, and Form) from a pool of materials populated with documents of the nineteen most popular genres found in our experimental data set.
Searching for Ground Truth: a stepping stone in automating genre classification
This paper examines genre classification of documents and its role in enabling the effective automated management of digital documents by digital libraries and other repositories. We have previously presented genre classification as a valuable step toward achieving automated extraction of descriptive metadata for digital material. Here, we present results from experiments using human labellers, conducted to assist in genre characterisation and the prediction of obstacles which need to be overcome by an automated system, and to contribute to the process of creating a solid testbed corpus for extending automated genre classification and testing metadata extraction tools across genres. We also describe the performance of two classifiers based on image and stylistic modeling features in labelling the data resulting from the agreement of three human labellers across fifteen genre classes.
CLEAR: a credible method to evaluate website archivability
Web archiving is crucial to ensure that cultural, scientific and social heritage on the web remains accessible and usable over time. A key aspect of the web archiving process is optimal data extraction from target websites. This procedure is difficult for reasons such as website complexity, the plethora of underlying technologies and, ultimately, the open-ended nature of the web. The purpose of this work is to establish the notion of Website Archivability (WA) and to introduce the Credible Live Evaluation of Archive Readiness (CLEAR) method to measure WA for any website. Website Archivability captures the core aspects of a website crucial in diagnosing whether it has the potential to be archived with completeness and accuracy. An appreciation of the archivability of a website should provide archivists with a valuable tool when assessing the possibilities of archiving material and influence web design professionals to consider the implications of their design decisions on the likelihood that a website could be archived. A prototype application, archiveready.com, has been established to demonstrate the viability of the proposed method for assessing Website Archivability.
Storia: Summarizing Social Media Content based on Narrative Theory using Crowdsourcing
People from all over the world use social media to share thoughts and
opinions about events, and understanding what people say through these channels
has been of increasing interest to researchers, journalists, and marketers
alike. However, while automatically generated summaries enable people to
consume large amounts of data efficiently, they do not provide the context
needed for a viewer to fully understand an event. Narrative structure can
provide templates for the order and manner in which this data is presented to
create stories that are oriented around narrative elements rather than
summaries made up of facts. In this paper, we use narrative theory as a
framework for identifying the links between social media content. To do this,
we designed crowdsourcing tasks to generate summaries of events based on
commonly used narrative templates. In a controlled study, for certain types of
events, people were more emotionally engaged with stories created with
narrative structure and were also more likely to recommend them to others
compared to summaries created without narrative structure
Automating Generative Deep Learning for Artistic Purposes: Challenges and Opportunities
We present a framework for automating generative deep learning with a
specific focus on artistic applications. The framework provides opportunities
to hand over creative responsibilities to a generative system as targets for
automation. For the definition of targets, we adopt core concepts from
automated machine learning and an analysis of generative deep learning
pipelines, both in standard and artistic settings. To motivate the framework,
we argue that automation aligns well with the goal of increasing the creative
responsibility of a generative system, a central theme in computational
creativity research. We understand automation as the challenge of granting a
generative system more creative autonomy, by framing the interaction between
the user and the system as a co-creative process. The development of the
framework is informed by our analysis of the relationship between automation
and creative autonomy. An illustrative example shows how the framework can give
inspiration and guidance in the process of handing over creative
responsibility