14,445 research outputs found

    On the Limitations of Provenance for Queries With Difference

    The annotation of the results of database transformations has been shown to be very effective for various applications. Until recently, most work in this context focused on positive query languages. Provenance semirings are a particular approach that has proven effective for these languages, and it was shown that when provenance is propagated with semirings, the expected equivalence axioms of the corresponding query languages are satisfied. There have been several attempts to extend the framework to account for relational algebra queries with difference. We show here that these suggestions fail to satisfy some expected equivalence axioms (which, in particular, hold for queries on "standard" set and bag databases). Interestingly, we show that this is not a pitfall of these particular attempts: every such attempt is bound to fail to satisfy these axioms for some semirings. Finally, we show particular semirings for which an extension supporting difference is (im)possible. Comment: TAPP 201
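    The positive-query setting that this abstract starts from can be illustrated with a minimal sketch. It assumes the usual semiring semantics, in which union adds annotations and join multiplies them. The relation layout (dicts from tuples to annotations) and all function names are hypothetical choices for illustration, not taken from the paper, and difference is deliberately left out.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple

Relation = Dict[Tuple[Any, ...], Any]  # tuple -> semiring annotation


@dataclass(frozen=True)
class Semiring:
    zero: Any
    one: Any
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]


# Counting semiring (N, +, *): annotations are bag multiplicities.
COUNTING = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)
# Boolean semiring: annotations are set membership.
BOOLEAN = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)


def union(r: Relation, s: Relation, k: Semiring) -> Relation:
    """Union adds the annotations of equal tuples."""
    out = dict(r)
    for t, a in s.items():
        out[t] = k.add(out.get(t, k.zero), a)
    return out


def join(r: Relation, s: Relation, k: Semiring) -> Relation:
    """Join R(x, y) with S(y, z) on y; annotations are multiplied."""
    out: Relation = {}
    for (x, y), a in r.items():
        for (y2, z), b in s.items():
            if y == y2:
                t = (x, y, z)
                out[t] = k.add(out.get(t, k.zero), k.mul(a, b))
    return out


# Hypothetical example data, annotated with multiplicities.
R = {("a", "b"): 2, ("d", "b"): 1}
S = {("b", "c"): 3}
T = {("b", "c"): 1, ("b", "e"): 4}

# One expected equivalence axiom: join distributes over union.
lhs = join(R, union(S, T, COUNTING), COUNTING)
rhs = union(join(R, S, COUNTING), join(R, T, COUNTING), COUNTING)
```

    Here `lhs` and `rhs` agree, which is one instance of the equivalence axioms holding for positive queries. The sketch says nothing about how difference should be interpreted.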

    A posteriori metadata from automated provenance tracking: Integration of AiiDA and TCOD

    In order to make results of computational scientific research findable, accessible, interoperable and re-usable, it is necessary to decorate them with standardised metadata. However, there are a number of technical and practical challenges that make this difficult to achieve in practice. Here the implementation of a protocol is presented to tag crystal structures with their computed properties, without the need for human intervention to curate the data. This protocol leverages the capabilities of AiiDA, an open-source platform to manage and automate scientific computational workflows, and TCOD, an open-access database storing computed materials properties using a well-defined and exhaustive ontology. Based on these, the complete procedure to deposit computed data in the TCOD database is automated. All relevant metadata are extracted from the full provenance information that AiiDA tracks and stores automatically while managing the calculations. Such a protocol also enables reproducibility of scientific data in the field of computational materials science. As a proof of concept, the AiiDA-TCOD interface is used to deposit 170 theoretical structures together with their computed properties and their full provenance graphs, consisting of over 4600 AiiDA nodes.

    An Architecture for Provenance Systems

    This document covers the logical and process architectures of provenance systems. The logical architecture identifies key roles and their interactions, whereas the process architecture discusses distribution and security. A fundamental aspect of our presentation is its technology-independent nature, which makes it reusable: the principles set out in this document may be applied to different technologies.

    Provenance Threat Modeling

    Provenance systems are used to capture history metadata; applications include ownership attribution and determining the quality of a particular data set. Provenance systems are also used for debugging, process improvement, understanding data proof of ownership, certification of validity, etc. The provenance of data includes information about the processes and source data that lead to the current representation. In this paper we study the security risks that provenance systems might be exposed to and recommend security solutions to better protect the provenance information. Comment: 4 pages, 1 figure, conferenc

    Architecture for Provenance Systems

    This document covers the logical and process architectures of provenance systems. The logical architecture identifies key roles and their interactions, whereas the process architecture discusses distribution and security. A fundamental aspect of our presentation is its technology-independent nature, which makes it reusable: the principles set out in this document may be applied to different technologies.

    Virtual Data in CMS Analysis

    The use of virtual data for enhancing the collaboration between large groups of scientists is explored in several ways:
    - by defining "virtual" parameter spaces which can be searched and shared in an organized way by a collaboration of scientists in the course of their analysis;
    - by providing a mechanism to log the provenance of results and the ability to trace them back to the various stages in the analysis of real or simulated data;
    - by creating "check points" in the course of an analysis to permit collaborators to explore their own analysis branches by refining selections, improving the signal to background ratio, varying the estimation of parameters, etc.;
    - by facilitating the audit of an analysis and the reproduction of its results by a different group, or in a peer review context.
    We describe a prototype for the analysis of data from the CMS experiment based on the virtual data system Chimera and the object-oriented data analysis framework ROOT. The Chimera system is used to chain together several steps in the analysis process including the Monte Carlo generation of data, the simulation of detector response, the reconstruction of physics objects and their subsequent analysis, histogramming and visualization using the ROOT framework. Comment: Talk from the 2003 Computing in High Energy and Nuclear Physics (CHEP03), La Jolla, Ca, USA, March 2003, 9 pages, LaTeX, 7 eps figures. PSN TUAT010. V2 - references adde
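    The idea of logging the provenance of a result and tracing it back through the analysis stages can be sketched generically. This is not the Chimera or ROOT API: the class, method, and file names below are hypothetical, and the sketch only assumes that each output records the transformation and the inputs that produced it.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Derivation:
    output: str                  # name of the produced data product
    transformation: str          # step that produced it
    inputs: Tuple[str, ...] = () # data products it was derived from


class ProvenanceLog:
    """Hypothetical in-memory log of how each data product was derived."""

    def __init__(self) -> None:
        self._by_output: Dict[str, Derivation] = {}

    def record(self, output: str, transformation: str,
               inputs: Iterable[str] = ()) -> None:
        self._by_output[output] = Derivation(output, transformation, tuple(inputs))

    def trace(self, output: str) -> List[str]:
        """Return the transformations behind `output`, earliest first."""
        seen = set()
        order: List[str] = []

        def visit(name: str) -> None:
            if name in seen or name not in self._by_output:
                return
            seen.add(name)
            derivation = self._by_output[name]
            for parent in derivation.inputs:
                visit(parent)  # upstream steps are listed before this one
            order.append(derivation.transformation)

        visit(output)
        return order


# Hypothetical chain mirroring the stages named in the abstract.
log = ProvenanceLog()
log.record("events.gen", "generate")
log.record("hits.sim", "simulate", ["events.gen"])
log.record("objects.reco", "reconstruct", ["hits.sim"])
log.record("hist.root", "analyze", ["objects.reco"])

stages = log.trace("hist.root")
```

    Tracing the final histogram walks back through reconstruction and simulation to generation, which is the kind of audit trail the abstract describes. A real system would also record parameters and software versions for each step.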