11 research outputs found

Information-Balance-Aware Approximated Summarization of Data Provenance

Publication venue: 'Hindawi Limited'

Crossref

The relationship between trust in AI and trustworthy machine learning technologies

Author: Aitken M; Coopamootoo K; Elliott K; Toreini E; van Moorsel A; Zelaya CG
Publication venue: 'American College of Medical Physics (ACMP)'

Newcastle University E-Prints

Abstracting PROV provenance graphs: A validity-preserving approach

Author: Bryans J.; Curcin V.; Gamble C.; Missier P.
Publication venue: 'Elsevier BV'
Publication date: 01/10/2020

Data provenance is a structured form of metadata designed to record the activities and datasets involved in data production, as well as their dependency relationships. The PROV data model, released by the W3C in 2013, defines a schema and constraints that together provide a structural and semantic foundation for provenance. This enables the interoperable exchange of provenance between data producers and consumers. When the provenance content is sensitive and subject to disclosure restrictions, however, a way of hiding parts of the provenance in a principled way before communicating it to certain parties is required. In this paper we present a provenance abstraction operator that achieves this goal. It maps a graphical representation of a PROV document PG1 to a new abstract version PG2, ensuring that (i) PG2 is a valid PROV graph, and (ii) the dependencies that appear in PG2 are justified by those that appear in PG1. These two properties ensure that further abstraction of abstract PROV graphs is possible. A guiding principle of the work is that of minimum damage: the resultant graph is altered as little as possible, while ensuring that the two properties are maintained. The operator developed is implemented as part of a user tool, described in a separate paper, that lets owners of sensitive provenance information control the abstraction by specifying an abstraction policy.

University of Birmingham Research Portal

Coventry University Pure Portal

King's Research Portal

Hypothetical Reasoning via Provenance Abstraction

Author: Assadi S.; Balmin A.; Brezinski C.; Deutch D.; Garey M. R.; Geerts F.; Glavic B.; Ikeda R.; Lee S.
Publication date: 10/07/2020

Data analytics often involves hypothetical reasoning: repeatedly modifying the data and observing the induced effect on the computation result of a data-centric application. Previous work has shown that fine-grained data provenance can help make such an analysis more efficient: instead of a costly re-execution of the underlying application, hypothetical scenarios are applied to a pre-computed provenance expression. However, storing provenance for complex queries and large-scale data leads to a significant overhead, which is often a barrier to the incorporation of provenance-based solutions. To this end, we present a framework for reducing provenance size. Our approach is based on reducing the provenance granularity using user-defined abstraction trees over the provenance variables; the granularity is based on the anticipated hypothetical scenarios. We formalize the tradeoff between provenance size and supported granularity of the hypothetical reasoning, study the complexity of the resulting optimization problem, and provide efficient algorithms for tractable cases and heuristics for others. We experimentally study the performance of our solution for various queries and abstraction trees. Our study shows that the algorithms generally lead to substantial speedup of hypothetical reasoning, with a reasonable loss of accuracy.

arXiv.org e-Print Archive

Crossref

Advances in database technology - EDBT 2016: 19th International Conference on Extending Database Technology, Bordeaux, France, March 15-18, 2016 : proceedings

Publication venue: University of Konstanz, University Library
Publication date: 01/01/2016

Digitale Bibliothek Thüringen

Approximated Summarization of Data Provenance

Author: Bourhis Pierre; Davidson Susan; Deutch Daniel; Ainy Eleanor; Milo Tova
Publication venue: HAL CCSD
Publication date: 19/10/2015

Many modern applications involve collecting large amounts of data from multiple sources, and then aggregating and manipulating it in intricate ways. The complexity of such applications, combined with the size of the collected data, makes it difficult to understand how the resulting information was derived. Data provenance has proven helpful in this respect; however, maintaining and presenting the full and exact provenance information may be infeasible due to its size and complexity. We therefore introduce the notion of approximated summarized provenance, which provides a compact representation of the provenance at the possible cost of information loss. Based on this notion, we present a novel provenance summarization algorithm which, based on the semantics of the underlying data and the intended use of provenance, outputs a summary of the input provenance. Experiments measure the conciseness and accuracy of the resulting provenance summaries, and the improvement in provenance usage time.

INRIA a CCSD electronic archive server

Hal-Diderot

PROX: Approximated Summarization of Data Provenance

Author: Ainy Eleanor; Bourhis Pierre; Davidson Susan; Deutch Daniel; Milo Tova
Publication venue: HAL CCSD
Publication date: 15/03/2016

Demonstration

Hal-Diderot