
    Query-Answer Causality in Databases: Abductive Diagnosis and View-Updates

    Causality has recently been introduced in databases to model, characterize, and possibly compute causes for query results (answers). Connections between query causality and consistency-based diagnosis and database repairs (with respect to integrity constraint violations) have been established in the literature. In this work we establish connections between query causality and abductive diagnosis and the view-update problem. The unveiled relationships allow us to obtain new complexity results for query causality, the main focus of our work, and also for the two other areas. Comment: To appear in Proc. UAI Causal Inference Workshop, 2015. One example was fixe

    Causality and the semantics of provenance

    Provenance, or information about the sources, derivation, custody or history of data, has been studied recently in a number of contexts, including databases, scientific workflows and the Semantic Web. Many provenance mechanisms have been developed, motivated by informal notions such as influence, dependence, explanation and causality. However, there has been little study of whether these mechanisms formally satisfy appropriate policies or even how to formalize relevant motivating concepts such as causality. We contend that mathematical models of these concepts are needed to justify and compare provenance techniques. In this paper we review a theory of causality based on structural models that has been developed in artificial intelligence, and describe work in progress on a causal semantics for provenance graphs. Comment: Workshop submissio

    A non-linear Granger-causality framework to investigate climate-vegetation dynamics

    Satellite Earth observation has led to the creation of global climate data records of many important environmental and climatic variables. These come in the form of multivariate time series with different spatial and temporal resolutions. Data of this kind provide new means to further unravel the influence of climate on vegetation dynamics. However, as advocated in this article, commonly used statistical methods are often too simplistic to represent complex climate-vegetation relationships due to linearity assumptions. Therefore, as an extension of linear Granger-causality analysis, we present a novel non-linear framework consisting of several components, such as data collection from various databases, time series decomposition techniques, feature construction methods, and predictive modelling by means of random forests. Experimental results on global data sets indicate that, with this framework, it is possible to detect non-linear patterns that are much less visible with traditional Granger-causality methods. In addition, we discuss extensive experimental results that highlight the importance of considering non-linear aspects of climate-vegetation dynamics

    Causality and explanations in databases

    ABSTRACT With the surge in the availability of information, there is a great demand for tools that assist users in understanding their data. While today's exploration tools rely mostly on data visualization, users often want to go deeper and understand the underlying causes of a particular observation. This tutorial surveys research on causality and explanation for data-oriented applications. We will review and summarize the research thus far into causality and explanation in the database and AI communities, giving researchers a snapshot of the current state of the art on this topic, and propose a unified framework as well as directions for future research. We will cover both the theory of causality/explanation and some applications; we also discuss the connections with other topics in database research like provenance, deletion propagation, why-not queries, and OLAP techniques.
    MOTIVATION With the surge in the availability of information, there is great need for tools that help users understand data. There are several examples of systems that offer some kind of assistance for users to understand and explore datasets. Humans typically observe the data at a high level of abstraction, by aggregating or by visualizing it in a graph, but often they want to go deeper and understand the ultimate causes of their observations. Over the last few years there have been several efforts in the Database and AI communities to develop general techniques to model causes, or explanations for observations on the data, some of them enabled by Judea Pearl's seminal book on Causality (footnote 1). Causality has been formalized both for AI applications and for database queries, and formal definitions of explanations have also been proposed both in the AI and the Database literature. Given the importance of developing general purpose tools to assist users in understanding data, it is likely that research in this space will continue, perhaps even intensify.
    * Partially supported by NSF Awards IIS-0911036 and CCF-1349784.
    Footnote 1: All references are omitted and will appear in the tutorial due to space limitations.
    Depth and Coverage. This 1.5-hour tutorial aims at establishing a research checkpoint: its goal is to review, summarize, and systematize the research so far into causality and explanation in databases, giving researchers a snapshot of the current state of the art on this topic, and at the same time propose a unified framework for future research. We will cover a wide range of work on causality and explanation from the database and AI communities, and we will discuss the connections with other topics in database research.
    Intended audience. The tutorial is aimed both at active researchers in databases, and at graduate students and young researchers seeking a new research topic. Practitioners from industry might find the tutorial useful as a preview of plausible future trends in data analysis tools.
    Assumed Background. Basic knowledge in databases will be sufficient to follow the tutorial. Some background in Datalog, provenance, and/or OLAP would be useful, but is not necessary.
    COVERED TOPICS Our tutorial is divided in three thematic sections. First, we discuss the notion of causality, its foundations in AI and philosophy, and its applications in the database field. Second, we discuss how the intuition of causality can be used to explain query results. Third, we relate these notions to several other topics of database research, including provenance, missing results, and view updates.
    Causality Understanding causality in a broad sense is of vital importance in many practical settings, e.g., in determining legal responsibility in multi-car accidents, in diagnosing malfunction of complex systems, or in scientific inquiry. The notion of causality and causation is a topic in philosophy, studied and argued over by philosophers over the centuries. On a high level, causality characterizes the relationship between an event and an outcome: the event is a cause if the outcome is a consequence of the event. The notion of counterfactual causes, which can be traced back to Hume (1748) and is analyzed later by Lewis (1973), explains causality in an intuitive way: if the first event (cause) had not occurred, then the second event (effect) would not have occurred. Several philosophers explored an alternative approach to counterfactuals that employs structural equations. Judea Pearl's landmark book on causality defined the state-of-the-art formulation of this framework. Pearl's and Halper
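The counterfactual reading described in this entry (had the cause not occurred, the effect would not have occurred) can be illustrated on a toy database. The following is only a minimal sketch, not code from the tutorial: the fact encoding, the helper names `counterfactual_causes` and `exists_join`, and the toy relations `R` and `S` are all assumptions made for illustration, and the sketch covers only plain counterfactual causes, not the refined structural-model definitions the entry alludes to.

```python
from typing import Callable, FrozenSet, List, Tuple

# Hypothetical encoding (an assumption): a fact is (relation_name, value, ...).
Fact = Tuple[str, ...]
Query = Callable[[FrozenSet[Fact]], bool]


def counterfactual_causes(db: FrozenSet[Fact], query: Query) -> List[Fact]:
    """Return the facts whose removal alone turns a true Boolean query false."""
    if not query(db):
        return []  # the outcome did not occur, so there is nothing to explain
    return sorted(f for f in db if not query(db - {f}))


def exists_join(db: FrozenSet[Fact]) -> bool:
    """Toy Boolean query: do x, y exist with R(x, y) and S(y)?"""
    s_values = {f[1] for f in db if f[0] == "S"}
    return any(f[0] == "R" and f[2] in s_values for f in db)


toy_db = frozenset({("R", "a", "b"), ("S", "b"), ("S", "c")})
causes = counterfactual_causes(toy_db, exists_join)
print(causes)
```

In this toy instance, removing `R(a, b)` or `S(b)` makes the query false, while removing `S(c)` does not, so only the first two facts are reported.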

    Complexity-entropy causality plane: a useful approach for distinguishing songs

    Nowadays we are often faced with huge databases resulting from the rapid growth of data storage technologies. This is particularly true when dealing with music databases. In this context, it is essential to have techniques and tools able to discriminate properties from these massive sets. In this work, we report on a statistical analysis of more than ten thousand songs aiming to obtain a complexity hierarchy. Our approach is based on the estimation of the permutation entropy combined with an intensive complexity measure, building up the complexity-entropy causality plane. The results obtained indicate that this representation space is very promising to discriminate songs as well as to allow a relative quantitative comparison among songs. Additionally, we believe that the method reported here may be applied in practical situations since it is simple, robust and has a fast numerical implementation. Comment: Accepted for publication in Physica
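For readers unfamiliar with permutation entropy, which this entry names as one axis of the complexity-entropy plane, the following is a minimal sketch of the standard ordinal-pattern definition. It is not the authors' implementation: the function name, the default `order` of 3, the unit time delay, and the tie-breaking rule are assumptions, and the complexity measure on the other axis is not covered.

```python
import math
from collections import Counter
from typing import Sequence


def permutation_entropy(series: Sequence[float], order: int = 3) -> float:
    """Normalized permutation entropy of `series`, a value in [0, 1].

    Each window of `order` consecutive values is mapped to the permutation
    that sorts it; the Shannon entropy of the pattern frequencies is then
    divided by log(order!). Ties are broken by position (an assumption).
    """
    n_windows = len(series) - order + 1
    if order < 2 or n_windows < 1:
        raise ValueError("need order >= 2 and at least `order` samples")
    patterns = Counter(
        tuple(sorted(range(order), key=lambda k: series[i + k]))
        for i in range(n_windows)
    )
    entropy = -sum(
        (count / n_windows) * math.log(count / n_windows)
        for count in patterns.values()
    )
    return entropy / math.log(math.factorial(order))


# A strictly increasing series has a single ordinal pattern.
monotone = permutation_entropy([1, 2, 3, 4, 5, 6])
# With order 2, this series has 4 ascending and 2 descending pairs.
mixed = permutation_entropy([4, 7, 9, 10, 6, 11, 3], order=2)
```

A value near 0 indicates a highly regular signal and a value near 1 indicates that all ordinal patterns occur about equally often.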

    From Causes for Database Queries to Repairs and Model-Based Diagnosis and Back

    In this work we establish and investigate connections between causes for query answers in databases, database repairs with respect to denial constraints, and consistency-based diagnosis. The first two are relatively new research areas in databases, and the third one is an established subject in knowledge representation. We show how to obtain database repairs from causes, and the other way around. Causality problems are formulated as diagnosis problems, and the diagnoses provide causes and their responsibilities. The vast body of research on database repairs can be applied to the newer problems of computing actual causes for query answers and their responsibilities. These connections, which are interesting per se, allow us, after a transition (inspired by consistency-based diagnosis) to computational problems on hitting sets and vertex covers in hypergraphs, to obtain several new algorithmic and complexity results for database causality. Comment: To appear in Theory of Computing Systems. By invitation to special issue with extended papers from ICDT 2015 (paper arXiv:1412.4311