18,981 research outputs found

    Algorithms for Provisioning Queries and Analytics

    Get PDF
    Provisioning is a technique for avoiding repeated expensive computations in what-if analysis. Given a query, an analyst formulates kk hypotheticals, each retaining some of the tuples of a database instance, possibly overlapping, and she wishes to answer the query under scenarios, where a scenario is defined by a subset of the hypotheticals that are "turned on". We say that a query admits compact provisioning if given any database instance and any kk hypotheticals, one can create a poly-size (in kk) sketch that can then be used to answer the query under any of the 2k2^{k} possible scenarios without accessing the original instance. In this paper, we focus on provisioning complex queries that combine relational algebra (the logical component), grouping, and statistics/analytics (the numerical component). We first show that queries that compute quantiles or linear regression (as well as simpler queries that compute count and sum/average of positive values) can be compactly provisioned to provide (multiplicative) approximate answers to an arbitrary precision. In contrast, exact provisioning for each of these statistics requires the sketch size to be exponential in kk. We then establish that for any complex query whose logical component is a positive relational algebra query, as long as the numerical component can be compactly provisioned, the complex query itself can be compactly provisioned. On the other hand, introducing negation or recursion in the logical component again requires the sketch size to be exponential in kk. While our positive results use algorithms that do not access the original instance after a scenario is known, we prove our lower bounds even for the case when, knowing the scenario, limited access to the instance is allowed

    Logic Programming as Constructivism

    Get PDF
    The features of logic programming that seem unconventional from the viewpoint of classical logic can be explained in terms of constructivistic logic. We motivate and propose a constructivistic proof theory of non-Horn logic programming. Then, we apply this formalization for establishing results of practical interest. First, we show that 'stratification can be motivated in a simple and intuitive way. Relying on similar motivations, we introduce the larger classes of 'loosely stratified' and 'constructively consistent' programs. Second, we give a formal basis for introducing quantifiers into queries and logic programs by defining 'constructively domain independent* formulas. Third, we extend the Generalized Magic Sets procedure to loosely stratified and constructively consistent programs, by relying on a 'conditional fixpoini procedure

    Provably-secure symmetric private information retrieval with quantum cryptography

    Full text link
    Private information retrieval (PIR) is a database query protocol that provides user privacy, in that the user can learn a particular entry of the database of his interest but his query would be hidden from the data centre. Symmetric private information retrieval (SPIR) takes PIR further by additionally offering database privacy, where the user cannot learn any additional entries of the database. Unconditionally secure SPIR solutions with multiple databases are known classically, but are unrealistic because they require long shared secret keys between the parties for secure communication and shared randomness in the protocol. Here, we propose using quantum key distribution (QKD) instead for a practical implementation, which can realise both the secure communication and shared randomness requirements. We prove that QKD maintains the security of the SPIR protocol and that it is also secure against any external eavesdropper. We also show how such a classical-quantum system could be implemented practically, using the example of a two-database SPIR protocol with keys generated by measurement device-independent QKD. Through key rate calculations, we show that such an implementation is feasible at the metropolitan level with current QKD technology.Comment: 19 page

    Hypothetical answers to continuous queries over data streams

    Full text link
    Continuous queries over data streams may suffer from blocking operations and/or unbound wait, which may delay answers until some relevant input arrives through the data stream. These delays may turn answers, when they arrive, obsolete to users who sometimes have to make decisions with no help whatsoever. Therefore, it can be useful to provide hypothetical answers - "given the current information, it is possible that X will become true at time t" - instead of no information at all. In this paper we present a semantics for queries and corresponding answers that covers such hypothetical answers, together with an online algorithm for updating the set of facts that are consistent with the currently available information

    Integrating and Ranking Uncertain Scientific Data

    Get PDF
    Mediator-based data integration systems resolve exploratory queries by joining data elements across sources. In the presence of uncertainties, such multiple expansions can quickly lead to spurious connections and incorrect results. The BioRank project investigates formalisms for modeling uncertainty during scientific data integration and for ranking uncertain query results. Our motivating application is protein function prediction. In this paper we show that: (i) explicit modeling of uncertainties as probabilities increases our ability to predict less-known or previously unknown functions (though it does not improve predicting the well-known). This suggests that probabilistic uncertainty models offer utility for scientific knowledge discovery; (ii) small perturbations in the input probabilities tend to produce only minor changes in the quality of our result rankings. This suggests that our methods are robust against slight variations in the way uncertainties are transformed into probabilities; and (iii) several techniques allow us to evaluate our probabilistic rankings efficiently. This suggests that probabilistic query evaluation is not as hard for real-world problems as theory indicates
    corecore