Search CORE

21,119 research outputs found

Language-integrated provenance by trace analysis

Author: Fehrenbach Stefan
Cheney James
Publication venue
Publication date: 01/01/2006
Field of study

Language-integrated provenance builds on language-integrated query techniques to make provenance information explaining query results readily available to programmers. In previous work we have explored language-integrated approaches to provenance in Links and Haskell. However, implementing a new form of provenance in a language-integrated way is still a major challenge. We propose a self-tracing transformation and trace analysis features that, together with existing techniques for type-directed generic programming, make it possible to define different forms of provenance as user code. We present our design as an extension to a core language for Links called LinksT, give examples showing its capabilities, and outline its metatheory and key correctness properties.Comment: DBPL 201

arXiv.org e-Print Archive

Edinburgh Research Explorer

University of Richmond

Language-integrated provenance by trace analysis

Author: Cheney James
Fehrenbach Stefan
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2019
Field of study

arXiv.org e-Print Archive

Crossref

Edinburgh Research Explorer

Using schema transformation pathways for data lineage tracing

Author: A. Woodruff
C. Faloutsos
H. Fan
H. Fan
H. Fan
J. Albert
L. Zamboulis
L. Zamboulis
M. Boyd
P. Buneman
P. Buneman
P. McBrien
P. McBrien
P.A. Bernstein
Y. Cui
Y. Cui
Y. Cui
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2005
Field of study

With the increasing amount and diversity of information available on the Internet, there has been a huge growth in information systems that need to integrate data from distributed, heterogeneous data sources. Tracing the lineage of the integrated data is one of the problems being addressed in data warehousing research. This paper presents a data lineage tracing approach based on schema transformation pathways. Our approach is not limited to one specific data model or query language, and would be useful in any data transformation/integration framework based on sequences of primitive schema transformations

CiteSeerX

Crossref

Birkbeck Institutional Research Online

Provenance-based trust for grid computing: Position Paper

Author: Chapman Syd
Cortes Ulises
Hempel Rolf
Moreau Luc
Rana Omer
Schreiber Andreas
Varga Laszlo
Willmott Steven
Publication venue: 'University of Southampton'
Publication date: 01/01/2004
Field of study

Current evolutions of Internet technology such as Web Services, ebXML, peer-to-peer and Grid computing all point to the development of large-scale open networks of diverse computing systems interacting with one another to perform tasks. Grid systems (and Web Services) are exemplary in this respect and are perhaps some of the first large-scale open computing systems to see widespread use - making them an important testing ground for problems in trust management which are likely to arise. From this perspective, today's grid architectures suffer from limitations, such as lack of a mechanism to trace results and lack of infrastructure to build up trust networks. These are important concerns in open grids, in which "community resources" are owned and managed by multiple stakeholders, and are dynamically organised in virtual organisations. Provenance enables users to trace how a particular result has been arrived at by identifying the individual services and the aggregation of services that produced such a particular output. Against this background, we present a research agenda to design, conceive and implement an industrial-strength open provenance architecture for grid systems. We motivate its use with three complex grid applications, namely aerospace engineering, organ transplant management and bioinformatics. Industrial-strength provenance support includes a scalable and secure architecture, an open proposal for standardising the protocols and data structures, a set of tools for configuring and using the provenance architecture, an open source reference implementation, and a deployment and validation in industrial context. The provision of such facilities will enrich grid capabilities by including new functionalities required for solving complex problems such as provenance data to provide complete audit trails of process execution and third-party analysis and auditing. As a result, we anticipate that a larger uptake of grid technology is likely to occur, since unprecedented possibilities will be offered to users and will give them a competitive edge

Institute of Transport Research:Publications

Southampton (e-Prints Soton)

Architecture for Provenance Systems

Author: Groth Paul
Miles Simon
Moreau Luc
Tan Victor
Publication venue: s.n.
Publication date: 01/10/2005
Field of study

This document covers the logical and process architectures of provenance systems. The logical architecture identifies key roles and their interactions, whereas the process architecture discusses distribution and security. A fundamental aspect of our presentation is its technology-independent nature, which makes it reusable: the principles that are exposed in this document may be applied to different technologies

Southampton (e-Prints Soton)

An Architecture for Provenance Systems

Author: Groth Paul
Jiang Sheng
Miles Simon
Moreau Luc
Munroe Steve
Tan Victor
Tsasakou Sofia
Publication venue: s.n.
Publication date: 01/02/2006
Field of study

Southampton (e-Prints Soton)

King's Research Portal

Database Queries that Explain their Work

Author: Acar Umut A.
Ahmed Amal
Cheney James
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2014
Field of study

Provenance for database queries or scientific workflows is often motivated as providing explanation, increasing understanding of the underlying data sources and processes used to compute the query, and reproducibility, the capability to recompute the results on different inputs, possibly specialized to a part of the output. Many provenance systems claim to provide such capabilities; however, most lack formal definitions or guarantees of these properties, while others provide formal guarantees only for relatively limited classes of changes. Building on recent work on provenance traces and slicing for functional programming languages, we introduce a detailed tracing model of provenance for multiset-valued Nested Relational Calculus, define trace slicing algorithms that extract subtraces needed to explain or recompute specific parts of the output, and define query slicing and differencing techniques that support explanation. We state and prove correctness properties for these techniques and present a proof-of-concept implementation in Haskell.Comment: PPDP 201

arXiv.org e-Print Archive

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

Edinburgh Research Explorer

A Linked Data Approach to Sharing Workflows and Workflow Results

Author: Bechhofer S
Margaria T
Marshall MS
Missier P
Newman DR
Roos M
Roure DD
Steffen B
Zhao J
Publication venue
Publication date: 01/01/2010
Field of study

A bioinformatics analysis pipeline is often highly elaborate, due to the inherent complexity of biological systems and the variety and size of datasets. A digital equivalent of the ‘Materials and Methods’ section in wet laboratory publications would be highly beneficial to bioinformatics, for evaluating evidence and examining data across related experiments, while introducing the potential to find associated resources and integrate them as data and services. We present initial steps towards preserving bioinformatics ‘materials and methods’ by exploiting the workflow paradigm for capturing the design of a data analysis pipeline, and RDF to link the workflow, its component services, run-time provenance, and a personalized biological interpretation of the results. An example shows the reproduction of the unique graph of an analysis procedure, its results, provenance, and personal interpretation of a text mining experiment. It links data from Taverna, myExperiment.org, BioCatalogue.org, and ConceptWiki.org. The approach is relatively ‘light-weight’ and unobtrusive to bioinformatics users

Southampton (e-Prints Soton)

Crossref

University of Birmingham Research Portal

Oxford University Research Archive

The University of Manchester - Institutional Repository