45,043 research outputs found

A General-Purpose Provenance Library

Author: Macko Peter
Seltzer Margo I.
Publication venue: USENIX Association
Publication date: 26/11/2012

Most provenance capture takes place inside particular tools - a workflow engine, a database, an operating system, or an application. However, most users have an existing toolset - a collection of different tools that work well for their needs and with which they are comfortable. Currently, such users have limited ability to collect provenance without disrupting their work and changing environments, which most users are hesitant to do. Even users who are willing to adopt new tools may realize limited benefit from provenance in those tools if they do not integrate with their entire environment, which may include multiple languages and frameworks. We present the Core Provenance Library (CPL), a portable, multi-lingual library that application programmers can easily incorporate into a variety of tools to collect and integrate provenance. Although the manual instrumentation adds extra work for application programmers, we show that in most cases, the work is minimal, and the resulting system solves several problems that plague more constrained provenance collection systems.

Engineering and Applied Science

Harvard University - DASH

Language-integrated provenance in Haskell

Author: Cheney James
Stolarek Jan
Publication venue: Aspect-Oriented Software Association (AOSA)
Publication date: 27/03/2018

Scientific progress increasingly depends on data management, particularly to clean and curate data so that it can be systematically analyzed and reused. A wealth of techniques for managing and curating data (and its provenance) have been proposed, largely in the database community. In particular, a number of influential papers have proposed collecting provenance information explaining where a piece of data was copied from, or what other records were used to derive it. Most of these techniques, however, exist only as research prototypes and are not available in mainstream database systems. This means scientists must either implement such techniques themselves or (all too often) go without. This is essentially a code reuse problem: provenance techniques currently cannot be implemented reusably, only as ad hoc, usually unmaintained extensions to standard databases. An alternative, relatively unexplored approach is to support such techniques at a higher abstraction level, using metaprogramming or reflection techniques. Can advanced programming techniques make it easier to transfer provenance research results into practice? We build on a recent approach called language-integrated provenance, which extends language-integrated query techniques with source-to-source query translations that record provenance. In previous work, a proof of concept was developed in a research programming language called Links, which supports sophisticated Web and database programming. In this paper, we show how to adapt this approach to work in Haskell building on top of the Database-Supported Haskell (DSH) library. Even though it seemed clear in principle that Haskell's rich programming features ought to be sufficient, implementing language-integrated provenance in Haskell required overcoming a number of technical challenges due to interactions between these capabilities. 
Our implementation serves as a proof of concept showing how this combination of metaprogramming features can, for the first time, make data provenance facilities available to programmers as a library in a widely-used, general-purpose language. In our work we were successful in implementing forms of provenance known as where-provenance and lineage. We have tested our implementation using a simple database and query set and established that the resulting queries are executed correctly on the database. Our implementation is publicly available on GitHub. Our work makes provenance tracking available to users of DSH at little cost. Although Haskell is not widely used for scientific database development, our work suggests which language features are necessary to support provenance as a library. We also highlight how combining Haskell's advanced type programming features can lead to unexpected complications, which may motivate further research into type system expressiveness.

arXiv.org e-Print Archive

Crossref

Edinburgh Research Explorer

An Architecture for Provenance Systems

Author: Groth Paul
Jiang Sheng
Miles Simon
Moreau Luc
Munroe Steve
Tan Victor
Tsasakou Sofia
Publication venue: s.n.
Publication date: 01/02/2006

This document covers the logical and process architectures of provenance systems. The logical architecture identifies key roles and their interactions, whereas the process architecture discusses distribution and security. A fundamental aspect of our presentation is its technology-independent nature, which makes it reusable: the principles that are exposed in this document may be applied to different technologies.

Southampton (e-Prints Soton)

King's Research Portal

Users' trust in information resources in the Web environment: a status report

Author: Coventry Lynne
Gannon-Leary Pat
Pickard Alison
Publication venue: JISC
Publication date: 01/04/2010

This study has three aims: to provide an overview of the ways in which trust is either assessed or asserted in relation to the use and provision of resources in the Web environment for research and learning; to assess what solutions might be worth further investigation and whether establishing ways to assert trust in academic information resources could assist the development of information literacy; and to help increase understanding of how perceptions of trust influence the behaviour of information users.

Northumbria Research Link

Assessing Descriptive Substance in Free-Text Collection-Level Metadata

Author: Han Myung-Ja K.
Jackson Amy S.
Palmer Carole L.
Zavalina Oksana L.
Publication date: 08/08/2008

Collection-level metadata has the potential to provide important information about the features and purpose of individual collections. This paper reports on a content analysis of collection records in an aggregation of cultural heritage collections. The findings show that the free-text Description field often provides a more accurate and complete representation of subjects and object types than the specified fields. Properties such as importance, uniqueness, comprehensiveness, provenance, and creator are articulated, as well as other vital contextual information about the intentions of a collector and the value of a collection, as a whole, for scholarly users. The results demonstrate that the semantically rich free-text Description field is essential to understanding the context of collections in large aggregations and can serve as a source of data for enhancing and customizing controlled vocabularies.

IMLS NLG Research and Demonstration grant LG-06-07-0020-07. Published or submitted for publication. Is peer reviewed.

Proceedings of the International Conference on Dublin Core and Metadata Applications (DCMI)

Illinois Digital Environment for Access to Learning and Scholarship Repository

UNT Digital Library

Dokumenten-Publikationsserver der Humboldt-Universität zu Berlin

DSpace University of New Mexico

Shingle 2.0: generalising self-consistent and automated domain discretisation for multi-scale geophysical models

Author: Candy Adam S.
Pietrzak Julie D.
Publication date: 24/03/2017

The approaches taken to describe and develop spatial discretisations of the domains required for geophysical simulation models are commonly ad hoc, model or application specific, and under-documented. This is particularly acute for simulation models that are flexible in their use of multi-scale, anisotropic, fully unstructured meshes, where a relatively large number of heterogeneous parameters are required to constrain their full description. As a consequence, it can be difficult to reproduce simulations, to ensure provenance in model data handling and initialisation, and to conduct model intercomparisons rigorously. This paper takes a novel approach to spatial discretisation, considering it much like a numerical simulation model problem of its own. It introduces a generalised, extensible, self-documenting approach to describing, carefully and necessarily in full, the constraints over the heterogeneous parameter space that determine how a domain is spatially discretised. This additionally provides a method to accurately record these constraints, using high-level natural language based abstractions, that enables full accounts of provenance, sharing and distribution. Together with this description, a generalised consistent approach to unstructured mesh generation for geophysical models is developed that is automated, robust and repeatable, quick-to-draft, rigorously verified and consistent with the source data throughout. This interprets the description above to execute a self-consistent spatial discretisation process, which is automatically validated against expected discrete characteristics and metrics.

Comment: 18 pages, 10 figures, 1 table. Submitted for publication and under review.

arXiv.org e-Print Archive

Directory of Open Access Journals

nanopub-java: A Java Library for Nanopublications

Author: Kuhn Tobias
Publication date: 20/08/2015

The concept of nanopublications was first proposed about six years ago, but it lacked openly available implementations. The library presented here is the first one that has become an official implementation of the nanopublication community. Its core features are stable, but it also contains unofficial and experimental extensions: for publishing to a decentralized server network, for defining sets of nanopublications with indexes, for informal assertions, and for digitally signing nanopublications. Most of the features of the library can also be accessed via an online validator interface.

Comment: Proceedings of 5th Workshop on Linked Science 201

arXiv.org e-Print Archive

VU Research Portal

Digital Preservation, Archival Science and Methodological Foundations for Digital Libraries

Author: Ross Seamus
Publication date: 01/01/2007

Digital libraries, whether commercial, public or personal, lie at the heart of the information society. Yet, research into their long‐term viability and the meaningful accessibility of their contents remains in its infancy. In general, as we have pointed out elsewhere, ‘after more than twenty years of research in digital curation and preservation the actual theories, methods and technologies that can either foster or ensure digital longevity remain startlingly limited.’ Research led by DigitalPreservationEurope (DPE) and the Digital Preservation Cluster of DELOS has allowed us to refine the key research challenges – theoretical, methodological and technological – that need attention by researchers in digital libraries during the coming five to ten years, if we are to ensure that the materials held in our emerging digital libraries are to remain sustainable, authentic, accessible and understandable over time. Building on this work and taking the theoretical framework of archival science as bedrock, this paper investigates digital preservation and its foundational role if digital libraries are to have long‐term viability at the centre of the global information society.