1,519 research outputs found

    Query-Answer Causality in Databases: Abductive Diagnosis and View-Updates

    Full text link
    Causality has been recently introduced in databases, to model, characterize and possibly compute causes for query results (answers). Connections between query causality and consistency-based diagnosis and database repairs (wrt. integrity constrain violations) have been established in the literature. In this work we establish connections between query causality and abductive diagnosis and the view-update problem. The unveiled relationships allow us to obtain new complexity results for query causality -the main focus of our work- and also for the two other areas.Comment: To appear in Proc. UAI Causal Inference Workshop, 2015. One example was fixe

    Introducing Dynamic Behavior in Amalgamated Knowledge Bases

    Full text link
    The problem of integrating knowledge from multiple and heterogeneous sources is a fundamental issue in current information systems. In order to cope with this problem, the concept of mediator has been introduced as a software component providing intermediate services, linking data resources and application programs, and making transparent the heterogeneity of the underlying systems. In designing a mediator architecture, we believe that an important aspect is the definition of a formal framework by which one is able to model integration according to a declarative style. To this purpose, the use of a logical approach seems very promising. Another important aspect is the ability to model both static integration aspects, concerning query execution, and dynamic ones, concerning data updates and their propagation among the various data sources. Unfortunately, as far as we know, no formal proposals for logically modeling mediator architectures both from a static and dynamic point of view have already been developed. In this paper, we extend the framework for amalgamated knowledge bases, presented by Subrahmanian, to deal with dynamic aspects. The language we propose is based on the Active U-Datalog language, and extends it with annotated logic and amalgamation concepts. We model the sources of information and the mediator (also called supervisor) as Active U-Datalog deductive databases, thus modeling queries, transactions, and active rules, interpreted according to the PARK semantics. By using active rules, the system can efficiently perform update propagation among different databases. The result is a logical environment, integrating active and deductive rules, to perform queries and update propagation in an heterogeneous mediated framework.Comment: Other Keywords: Deductive databases; Heterogeneous databases; Active rules; Update

    Provenance in Collaborative Data Sharing

    Get PDF
    This dissertation focuses on recording, maintaining and exploiting provenance information in Collaborative Data Sharing Systems (CDSS). These are systems that support data sharing across loosely-coupled, heterogeneous collections of relational databases related by declarative schema mappings. A fundamental challenge in a CDSS is to support the capability of update exchange --- which publishes a participant\u27s updates and then translates others\u27 updates to the participant\u27s local schema and imports them --- while tolerating disagreement between them and recording the provenance of exchanged data, i.e., information about the sources and mappings involved in their propagation. This provenance information can be useful during update exchange, e.g., to evaluate provenance-based trust policies. It can also be exploited after update exchange, to answer a variety of user queries, about the quality, uncertainty or authority of the data, for applications such as trust assessment, ranking for keyword search over databases, or query answering in probabilistic databases. To address these challenges, in this dissertation we develop a novel model of provenance graphs that is informative enough to satisfy the needs of CDSS users and captures the semantics of query answering on various forms of annotated relations. We extend techniques from data integration, data exchange, incremental view maintenance and view update to define the formal semantics of unidirectional and bidirectional update exchange. We develop algorithms to perform update exchange incrementally while maintaining provenance information. We present strategies for implementing our techniques over an RDBMS and experimentally demonstrate their viability in the Orchestra prototype system. We define ProQL, a query language for provenance graphs that can be used by CDSS users to combine data querying with provenance testing as well as to compute annotations for their data, based on their provenance, that are useful for a variety of applications. Finally, we develop a prototype implementation ProQL over an RDBMS and indexing techniques to speed up provenance querying, evaluate experimentally the performance of provenance querying and the benefits of our indexing techniques

    Provenance for Aggregate Queries

    Get PDF
    We study in this paper provenance information for queries with aggregation. Provenance information was studied in the context of various query languages that do not allow for aggregation, and recent work has suggested to capture provenance by annotating the different database tuples with elements of a commutative semiring and propagating the annotations through query evaluation. We show that aggregate queries pose novel challenges rendering this approach inapplicable. Consequently, we propose a new approach, where we annotate with provenance information not just tuples but also the individual values within tuples, using provenance to describe the values computation. We realize this approach in a concrete construction, first for "simple" queries where the aggregation operator is the last one applied, and then for arbitrary (positive) relational algebra queries with aggregation; the latter queries are shown to be more challenging in this context. Finally, we use aggregation to encode queries with difference, and study the semantics obtained for such queries on provenance annotated databases

    bdbms -- A Database Management System for Biological Data

    Full text link
    Biologists are increasingly using databases for storing and managing their data. Biological databases typically consist of a mixture of raw data, metadata, sequences, annotations, and related data obtained from various sources. Current database technology lacks several functionalities that are needed by biological databases. In this paper, we introduce bdbms, an extensible prototype database management system for supporting biological data. bdbms extends the functionalities of current DBMSs to include: (1) Annotation and provenance management including storage, indexing, manipulation, and querying of annotation and provenance as first class objects in bdbms, (2) Local dependency tracking to track the dependencies and derivations among data items, (3) Update authorization to support data curation via content-based authorization, in contrast to identity-based authorization, and (4) New access methods and their supporting operators that support pattern matching on various types of compressed biological data types. This paper presents the design of bdbms along with the techniques proposed to support these functionalities including an extension to SQL. We also outline some open issues in building bdbms.Comment: This article is published under a Creative Commons License Agreement (http://creativecommons.org/licenses/by/2.5/.) You may copy, distribute, display, and perform the work, make derivative works and make commercial use of the work, but, you must attribute the work to the author and CIDR 2007. 3rd Biennial Conference on Innovative Data Systems Research (CIDR) January 710, 2007, Asilomar, California, US

    Algebraic incremental maintenance of XML views

    Get PDF
    International audienceMaterialized views can bring important performance benefits when querying XML documents. In the presence of XML document changes, materialized views need to be updated to faithfully reflect the changed document. In this work, we present an algebraic approach for propagating source updates to XML materialized views expressed in a powerful XML tree pattern formalism. Our approach differs from the state of the art in the area in two important ways. First, it relies on set-oriented, algebraic operations, to be contrasted with node-based previous approaches. Second, it exploits state-of-the-art features of XML stores and XML query evaluation engines, notably XML structural identifiers and associated structural join algorithms. We present algorithms for determining how updates should be propagated to views, and highlight the benefits of our approach over existing algorithms through a series of experiments

    Semi-Automatische Deduktion von Feature-Lokalisierung während der Softwareentwicklung: Masterarbeit

    Get PDF
    Despite extensive research on software product lines in the last decades, ad-hoc clone-and-own development is still the dominant way for introducing variability to software systems. Therefore, the same issues for which software product lines were developed in the first place are still imminent in clone-and-own development: Fixing bugs consistently throughout clones and avoiding duplicate implementation effort is extremely diffcult as similarities and differences between variants are unknown. In order to remedy this, we enhance clone-and-own development with techniques from product-line engineering for targeted variant synchronisation such that domain knowledge can be integrated stepwise and without obligation. Contrary to retroactive feature mapping recovery (e.g., mining) techniques, we infer feature-to-code mappings directly during software development when concrete domain knowledge is present. In this thesis, we focus on the first step towards targeted synchronisation between variants: the recording of feature mappings. By letting developers specify on which feature they are working on, we derive feature mappings directly during software development. We ensure syntactic validity of feature mappings and variant synchronisation by implementing disciplined annotations through abstract syntax trees. To bridge the mismatch between change classification in the implementation and abstract layer, we synthesise semantic edits on abstract syntax trees. We show that our derivation can be used to reproduce variability-related real-world code changes and compare it to the feature mapping derivation of the projectional variation control system VTS by Stanciulescu et al.Trotz umfangreicher Forschung zu Software-Produktlinien in den letzten Jahrzehnten ist Clone-and-Own immer noch der dominierende Ansatz zur Einführung von Variabilität in Softwaresystemen. Daher stehen bei Clone-and-Own immer noch die gleichen Probleme im Vordergrund, für die Software-Produktlinien überhaupt erst entwickelt wurden: Die konsistente Behebung von Fehlern in allen Klonen und die Vermeidung von doppeltem Implementierungsaufwand sind äußerst schwierig, da Ähnlichkeiten und Unterschiede zwischen den Varianten unbekannt sind. Um hier Abhilfe zu schaffen, erweitern wir die Clone-and-Own-Entwicklung mit Techniken aus der Produktlinien-Entwicklung zur gezielten Synchronisierung von Varianten, sodass Entwickler ihr Domänenwissen schrittweise und unverbindlich integrieren können. Im Gegensatz zu nachträglich arbeitenden Feature-Mapping-Recovery- oder auch Mining-Techniken, leiten wir Zuordungen von Features zu Quellcode direkt während der Softwareentwicklung ab, wenn konkretes Domänenwissen vorhanden ist. In dieser Arbeit entwickeln wir den ersten Schritt zur gezielten Synchronisation von Varianten: die Aufzeichnung von Feature-Mappings. Indem Entwickler spezifizieren an welchem Feature sie arbeiten, leiten wir Feature-Mappings direkt während der Softwareentwicklung ab. Wir stellen die syntaktische Korrektheit von Feature-Mappings und der Synchronisation von Varianten sicher, indem wir disziplinierte Annotationen mithilfe von abstrakten Syntaxbäumen implementieren. Um die Diskrepanz der Klassifizierung von Änderungen zwischen der Implementierungs- und der Abstraktionsschicht zu überbrücken, synthetisieren wir Semantic Edits auf abstrakten Syntaxbäumen. Wir zeigen, dass unsere Ableitung von Feature-Mappings in der Lage ist reale Codeänderungen zu reproduzieren und vergleichen sie mit der Feature-Mapping-Ableitung des Variationskontrollsystems VTS von Stanciulescu et al
    corecore