    Query with Assumptions for Probabilistic Relational Databases

    Users may have prior knowledge about a probabilistic database and prefer to query it under that knowledge, which cannot be expressed as clauses of a conventional SQL query. A naive approach is to query a new version of the database, generated by transforming the original probabilistic database so that it satisfies the users' prior knowledge; however, generating a different database version for each piece of prior knowledge is impractical. In this paper, we propose the concept of a query with assumptions, which allows users to describe their prior knowledge with a newly introduced ASSUMPTION clause of SQL. We also propose an approach to obtain the result of a query based on assumption clauses. Experimental studies show that our approach performs better than the naive approach.
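One way to read "querying under prior knowledge" is as conditioning a query on an event. The sketch below is an illustration only. It assumes a tiny tuple-independent relation and treats the assumption as an event to condition on, enumerating possible worlds by brute force. The relation `SENSOR`, the helper names, and the conditioning semantics are all assumptions made for this example; they are not the paper's ASSUMPTION clause or its algorithm.

```python
from itertools import product

# Hypothetical tuple-independent relation: (tuple, marginal probability).
SENSOR = [
    (("s1", "hot"), 0.5),
    (("s2", "hot"), 0.4),
]


def possible_worlds(relation):
    """Yield (world, probability) for every subset of the relation's tuples."""
    for mask in product([False, True], repeat=len(relation)):
        world = frozenset(t for (t, _), keep in zip(relation, mask) if keep)
        prob = 1.0
        for (_, p), keep in zip(relation, mask):
            prob *= p if keep else 1.0 - p
        yield world, prob


def probability(relation, event):
    """Total probability of the worlds in which `event` holds."""
    return sum(p for world, p in possible_worlds(relation) if event(world))


def conditional_probability(relation, query, assumption):
    """P(query | assumption), computed by brute-force enumeration."""
    denom = probability(relation, assumption)
    if denom == 0.0:
        raise ValueError("assumption has probability zero")
    joint = probability(relation, lambda w: query(w) and assumption(w))
    return joint / denom


def query(world):
    # Boolean query: does sensor s1 report "hot"?
    return ("s1", "hot") in world


def assumption(world):
    # Assumed prior knowledge: at least one sensor reports "hot".
    return len(world) >= 1


p_plain = probability(SENSOR, query)
p_cond = conditional_probability(SENSOR, query, assumption)
```

Here `p_plain` is the unconditioned answer and `p_cond` is the answer under the assumption. Enumeration is exponential in the number of tuples, so this only serves to make the semantics concrete.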

    Tuple-Independent Representations of Infinite Probabilistic Databases

    Probabilistic databases (PDBs) are probability spaces over database instances. They provide a framework for handling uncertainty in databases, as occurs due to data integration, noisy data, data from unreliable sources or randomized processes. Most of the existing theory literature has investigated finite, tuple-independent PDBs (TI-PDBs), where the occurrences of tuples are independent events. Only recently, Grohe and Lindner (PODS '19) introduced independence assumptions for PDBs beyond the finite domain assumption. In the finite setting, a major argument for discussing the theoretical properties of TI-PDBs is that they can be used to represent any finite PDB via views. This is no longer the case once the number of tuples is countably infinite. In this paper, we systematically study the representability of infinite PDBs in terms of TI-PDBs and the related block-independent disjoint PDBs. The central question is which infinite PDBs are representable as first-order views over tuple-independent PDBs. We give a necessary condition for the representability of PDBs and provide a sufficient criterion for representability in terms of the probability distribution of a PDB. With various examples, we explore the limits of our criteria. We show that conditioning on first-order properties yields no additional power in terms of expressivity. Finally, we discuss the relation between purely logical and arithmetic reasons for (non-)representability.
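The finite-case claim, that a PDB which is not tuple-independent can still be obtained as a view over a TI-PDB, can be made concrete with a toy example. The sketch below is an assumption-laden illustration, not the general construction from the literature. The target PDB has two mutually exclusive facts, `R(a)` with probability 0.3 and `R(b)` with probability 0.7. It is represented through one hypothetical hidden fact `H` and the view "R(a) holds iff H is present, R(b) holds iff H is absent".

```python
from collections import defaultdict
from itertools import product


def ti_worlds(marginals):
    """Yield (world, probability) for a finite tuple-independent PDB."""
    facts = list(marginals)
    for mask in product([False, True], repeat=len(facts)):
        world = frozenset(f for f, keep in zip(facts, mask) if keep)
        prob = 1.0
        for f, keep in zip(facts, mask):
            prob *= marginals[f] if keep else 1.0 - marginals[f]
        yield world, prob


def view(world):
    """Hypothetical view: R(a) iff H is present, R(b) iff H is absent."""
    return frozenset({"R(a)"}) if "H" in world else frozenset({"R(b)"})


def image_distribution(marginals, view_fn):
    """Distribution over view outputs induced by the underlying TI-PDB."""
    dist = defaultdict(float)
    for world, prob in ti_worlds(marginals):
        dist[view_fn(world)] += prob
    return dict(dist)


# Underlying TI-PDB: a single hidden fact H with marginal probability 0.3.
induced = image_distribution({"H": 0.3}, view)

# Target PDB: exactly one of R(a), R(b) holds, so it is not tuple-independent.
target = {frozenset({"R(a)"}): 0.3, frozenset({"R(b)"}): 0.7}
```

The induced distribution matches `target` even though `R(a)` and `R(b)` are perfectly anti-correlated. The abstract's point is that this kind of representation is no longer guaranteed once the set of tuples is countably infinite.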

    Transparency: from tractability to model explanations

    As artificial intelligence (AI) and machine learning (ML) models get increasingly incorporated into critical applications, ranging from medical diagnosis to loan approval, they show a tremendous potential to impact society in a beneficial way; however, this is predicated on establishing a transparent relationship between humans and automation. In particular, transparency requirements span across multiple dimensions, incorporating both technical and societal aspects, in order to promote the responsible use of AI/ML. In this thesis we present contributions along both of these axes, starting with the technical side and model transparency, where we study ways to enhance tractable probabilistic models (TPMs) with properties that enable acquiring an in-depth understanding of their decision-making process. Following this, we expand the scope of our work, studying how providing explanations about a model’s predictions influences the extent to which humans understand and collaborate with it, and finally we design an introductory course on the emerging field of explanations in AI to foster the competent use of the developed tools and methodologies. In more detail, the complex design of TPMs makes it very challenging to extract information that conveys meaningful insights, despite the fact that they are closely related to Bayesian networks (BNs), which readily provide such information. This has led to TPMs being viewed as black boxes, in the sense that their internal representations are elusive, in contrast to BNs. The first part of this thesis challenges this view, focusing on the question of whether it is feasible to extend certain transparent features of BNs to TPMs. We start by considering the problem of transforming TPMs into alternative graphical models in a way that makes their internal representations easy to inspect. Furthermore, we study the utility of existing algorithms in causal applications, where we identify some significant limitations.
    To remedy this situation, we propose a set of algorithms that result in transformations that accurately uncover the internal representations of TPMs. Following this result, we look into the problem of incorporating probabilistic constraints into TPMs. Although it is well known that BNs satisfy this property, the complex structure of TPMs impedes applying the same arguments, thus advances on this problem have been very limited. Having said that, in this thesis we provide formal proofs that TPMs can be made to satisfy both probabilistic and causal constraints through parameter manipulation, showing that incorporating a constraint corresponds to solving a system of multilinear equations. We conclude the technical contributions studying the problem of generating counterfactual instances for classifiers based on TPMs, motivated by the fact that BNs are the building blocks of most standard approaches to perform this task. In this thesis we propose a novel algorithm that we prove is guaranteed to generate valid counterfactuals. The resulting algorithm takes advantage of the multilinear structure of TPMs, generalizing existing approaches, while also allowing for incorporating a priori constraints that should be respected by the final counterfactuals. In the second part of this thesis we go beyond model transparency, looking into the role of explanations in achieving an effective collaboration between human users and AI. To study this we design a behavioural experiment where we show that explanations provide unique insights, which cannot be obtained by looking at more traditional uncertainty measures. The findings of this experiment provide evidence supporting the view that explanations and uncertainty estimates have complementary functions, advocating in favour of incorporating elements of both in order to promote a synergistic relationship between humans and AI.
    Finally, building on our findings, in this thesis we design a course on explanations in AI, where we focus on both the technical details of state-of-the-art algorithms as well as the overarching goals, limitations, and methodological approaches in the field. This contribution aims at ensuring that users can make competent use of explanations, a need that has also been highlighted by recent large-scale social initiatives. The resulting course was offered by the University of Edinburgh, at an MSc level, where student evaluations, as well as their performance, showcased the course’s effectiveness in achieving its primary goals.