Database Queries that Explain their Work
Provenance for database queries or scientific workflows is often motivated as
providing explanation, increasing understanding of the underlying data sources
and processes used to compute the query, and reproducibility, the capability to
recompute the results on different inputs, possibly specialized to a part of
the output. Many provenance systems claim to provide such capabilities;
however, most lack formal definitions or guarantees of these properties, while
others provide formal guarantees only for relatively limited classes of
changes. Building on recent work on provenance traces and slicing for
functional programming languages, we introduce a detailed tracing model of
provenance for multiset-valued Nested Relational Calculus, define trace slicing
algorithms that extract subtraces needed to explain or recompute specific parts
of the output, and define query slicing and differencing techniques that
support explanation. We state and prove correctness properties for these
techniques and present a proof-of-concept implementation in Haskell. Comment: PPDP 201
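The idea of slicing a query's input down to what is needed to recompute part of the output can be illustrated with a much coarser sketch than the traces described in the abstract. The example below is an assumption-laden simplification, not the paper's model: it handles only a select-project query over a list treated as a multiset, and the names `traced_query` and `slice_input` are hypothetical.

```python
from collections import Counter

def traced_query(rows, predicate, project):
    """Run a select-project query over a list of rows (a multiset).

    Returns the output multiset and a trace mapping each output value
    to the positions of the input rows that produced it.
    """
    trace = {}
    for i, row in enumerate(rows):
        if predicate(row):
            trace.setdefault(project(row), []).append(i)
    output = Counter({out: len(idxs) for out, idxs in trace.items()})
    return output, trace

def slice_input(rows, trace, wanted):
    """Keep only the input rows the trace says contributed to `wanted` outputs."""
    keep = sorted(i for out in wanted for i in trace.get(out, []))
    return [rows[i] for i in keep]

rows = [("a", 1), ("b", 5), ("a", 7), ("a", 8), ("c", 9)]
output, trace = traced_query(rows, lambda r: r[1] > 2, lambda r: r[0])
sliced = slice_input(rows, trace, ["a"])
# Re-running the query on the sliced input reproduces the multiplicity of "a".
recomputed, _ = traced_query(sliced, lambda r: r[1] > 2, lambda r: r[0])
```

Rerunning the query on `sliced` yields the same count for `"a"` as the full run, which is the recomputation property in miniature.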
SHARP: Harmonizing Galaxy and Taverna workflow provenance
SHARP is a Linked Data approach for harmonizing cross-workflow provenance. In this demo, we demonstrate SHARP through a real-world omic experiment involving workflow traces generated by the Taverna and Galaxy systems. SHARP starts by interlinking provenance traces generated by Galaxy and Taverna workflows and then harmonizes the interlinked graphs using OWL and PROV inference rules. The resulting provenance graph can be exploited to answer queries across Galaxy and Taverna workflow runs.
Automatic vs Manual Provenance Abstractions: Mind the Gap
In recent years the need to simplify or to hide sensitive information in
provenance has given rise to research on provenance abstraction. In the context
of scientific workflows, existing research provides techniques to
semi-automatically create abstractions of a given workflow description, which
are in turn used as filters over the workflow's provenance traces. An alternative
approach that is commonly adopted by scientists is to build workflows with
abstractions embedded into the workflow's design, such as using sub-workflows.
This paper reports on the comparison of manual versus semi-automated approaches
in a context where result abstractions are used to filter report-worthy results
of computational scientific analyses. Specifically, we take a real-world
workflow containing user-created design abstractions and compare these with
abstractions created by the ZOOM UserViews and Workflow Summaries systems. Our
comparison shows that semi-automatic and manual approaches largely overlap from
a process perspective; however, there is a dramatic mismatch in terms of the data
artefacts retained in an abstracted account of derivation. We discuss reasons
and suggest future research directions. Comment: Preprint accepted to the 2016 workshop on the Theory and Applications
of Provenance, TAPP 201
Structural analysis of whole-system provenance graphs
System-based provenance generates traces captured from
various systems; a graph is one way to represent these traces and reason
about them. These graphs are not well understood, and current work focuses
on their extraction and processing, without a thorough characterization
being in place. This paper studies the topology of such graphs. We
analyze multiple Whole-system-Provenance graphs and show that they
follow a hubs-and-authorities model and exhibit a power-law
distribution. Our observations allow for a novel understanding of the structure
of Whole-system-Provenance graphs. DARP
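For readers unfamiliar with the hubs-and-authorities model, the following is a minimal sketch of the standard HITS power iteration together with an in-degree histogram of the kind one would inspect for a power law. It is not the authors' analysis code; the function names and the toy edge list (processes `p1`..`p3` touching files `f` and `g`) are assumptions made for illustration.

```python
from collections import Counter

def hits(edges, iterations=50):
    """Compute hub and authority scores for a directed graph by power iteration."""
    nodes = sorted({n for edge in edges for n in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority score: sum of hub scores of nodes pointing at this node.
        auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            auth[dst] += hub[src]
        norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        auth = {n: v / norm for n, v in auth.items()}
        # Hub score: sum of authority scores of nodes this node points at.
        hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            hub[src] += auth[dst]
        norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        hub = {n: v / norm for n, v in hub.items()}
    return hub, auth

def in_degree_histogram(edges):
    """Map each in-degree value to the number of nodes that have it."""
    in_degree = Counter(dst for _, dst in edges)
    return Counter(in_degree.values())

# Hypothetical toy graph: an edge (a, b) means "a points to b".
edges = [("p1", "f"), ("p2", "f"), ("p3", "f"), ("p1", "g")]
hub, auth = hits(edges)
histogram = in_degree_histogram(edges)
```

On a real provenance graph, the histogram would typically be plotted on log-log axes to check whether it is roughly linear, which is the usual informal test for a power law.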
Reproducibility of scientific workflows execution using cloud-aware provenance (ReCAP)
© 2018, Springer-Verlag GmbH Austria, part of Springer Nature. Provenance of scientific workflows has been considered a means of providing workflow reproducibility. However, the provenance approaches adopted so far are not applicable in the context of the Cloud because the provenance trace lacks Cloud information. This paper presents a novel approach that collects Cloud-aware provenance and represents it as a graph. The reproducibility of workflow execution on the Cloud is determined by comparing the workflow provenance at three levels: workflow structure, execution infrastructure, and workflow outputs. The experimental evaluation shows that the implemented approach can detect changes in the provenance traces and in the outputs produced by the workflow.
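The three-level comparison described above can be pictured with a small sketch. The abstract does not say how ReCAP performs the comparison, so everything here is an assumption: the record layout, the field names (`flavor`, `image`), and the choice of SHA-256 digests to compare outputs are all hypothetical.

```python
import hashlib

def digest(data: bytes) -> str:
    """Hash an output file's contents so outputs can be compared cheaply."""
    return hashlib.sha256(data).hexdigest()

def compare_provenance(run_a, run_b):
    """Report, per level, whether two provenance records agree."""
    levels = ("structure", "infrastructure", "outputs")
    return {level: run_a[level] == run_b[level] for level in levels}

# Hypothetical provenance records for two executions of the same workflow.
original = {
    "structure": {("align", "merge"), ("merge", "report")},
    "infrastructure": {"flavor": "m1.small", "image": "ubuntu-base"},
    "outputs": {"report.txt": digest(b"result: 42\n")},
}
rerun = {
    "structure": {("align", "merge"), ("merge", "report")},
    "infrastructure": {"flavor": "m1.large", "image": "ubuntu-base"},
    "outputs": {"report.txt": digest(b"result: 42\n")},
}

comparison = compare_provenance(original, rerun)
```

Here the two runs agree on structure and outputs but differ at the infrastructure level, which is the kind of change the abstract says the approach can detect.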
PRNU-based image classification of origin social network with CNN
A huge number of images are shared on social networks (SNs) every day and, in most cases, it is very difficult to reliably establish the SN of provenance of an image when it is recovered from a hard disk, an SD card or a smartphone memory. During an investigation, it could be crucial to be able to distinguish images coming directly from a photo-camera from those downloaded from a social network and possibly, in the latter case, to determine which SN among a defined group it came from. It is well known that each SN leaves peculiar traces on each piece of content during the upload-download process; such traces can be exploited to classify images. In this work, the idea is to use the PRNU, embedded in every acquired image, as the “carrier” of the particular SN traces, which modulate the PRNU in different ways. We demonstrate, in this paper, that the SN-modulated noise residual can be adopted as a feature to detect the social network of origin by means of a trained convolutional neural network (CNN).