Algorithms for Provisioning Queries and Analytics
Provisioning is a technique for avoiding repeated expensive computations in
what-if analysis. Given a query, an analyst formulates hypotheticals, each
retaining some of the tuples of a database instance, possibly overlapping, and
she wishes to answer the query under scenarios, where a scenario is defined by
a subset of the hypotheticals that are "turned on". We say that a query admits
compact provisioning if, given any database instance and any set of hypotheticals,
one can create a sketch, of size polynomial in the number of hypotheticals, that can then be used to answer the
query under any of the possible scenarios without accessing the
original instance.
In this paper, we focus on provisioning complex queries that combine
relational algebra (the logical component), grouping, and statistics/analytics
(the numerical component). We first show that queries that compute quantiles or
linear regression (as well as simpler queries that compute count and
sum/average of positive values) can be compactly provisioned to provide
(multiplicative) approximate answers to an arbitrary precision. In contrast,
exact provisioning for each of these statistics requires the sketch size to be
exponential in the number of hypotheticals. We then establish that for any complex query whose logical
component is a positive relational algebra query, as long as the numerical
component can be compactly provisioned, the complex query itself can be
compactly provisioned. On the other hand, introducing negation or recursion in
the logical component again requires the sketch size to be exponential in the number of hypotheticals.
While our positive results use algorithms that do not access the original
instance after a scenario is known, we prove our lower bounds even for the case
when, knowing the scenario, limited access to the instance is allowed.
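The setting defined in this abstract can be made concrete with a minimal sketch. It does not construct a compact sketch. It only illustrates hypotheticals, scenarios, and the naive table with one entry per scenario that provisioning is meant to avoid. The toy data, all names, and the rule that a scenario keeps a tuple when any turned-on hypothetical retains it are assumptions made for illustration, not details taken from the paper.

```python
from itertools import combinations

# Toy instance of (region, sales) tuples. Illustrative data only.
instance = [("east", 10), ("west", 20), ("north", 5), ("south", 8)]

# Each hypothetical retains some tuples of the instance; they may overlap.
hypotheticals = {
    "h1": lambda t: t[0] in {"east", "west"},
    "h2": lambda t: t[1] >= 8,
    "h3": lambda t: t[0] == "north",
}

def scenario_instance(turned_on):
    """Assumed semantics: keep a tuple if any turned-on hypothetical retains it."""
    return [t for t in instance if any(hypotheticals[h](t) for h in turned_on)]

def total_sales(turned_on):
    """The query: sum of sales over the tuples kept by the scenario."""
    return sum(sales for _, sales in scenario_instance(turned_on))

def all_scenarios():
    """Enumerate every subset of the hypotheticals."""
    names = sorted(hypotheticals)
    for r in range(len(names) + 1):
        for combo in combinations(names, r):
            yield frozenset(combo)

# Naive baseline: one stored answer per scenario, 2^k entries for k hypotheticals.
naive_table = {s: total_sales(s) for s in all_scenarios()}
```

With three hypotheticals the table already has eight entries, and it doubles with each additional hypothetical, which is the blow-up a poly-size sketch would sidestep.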
Logic Programming as Constructivism
The features of logic programming that
seem unconventional from the viewpoint of classical logic
can be explained in terms of constructivistic logic. We
motivate and propose a constructivistic proof theory of
non-Horn logic programming. Then, we apply this formalization
for establishing results of practical interest.
First, we show that 'stratification' can be motivated in a
simple and intuitive way. Relying on similar motivations,
we introduce the larger classes of 'loosely stratified' and
'constructively consistent' programs. Second, we give a
formal basis for introducing quantifiers into queries and
logic programs by defining 'constructively domain
independent' formulas. Third, we extend the Generalized
Magic Sets procedure to loosely stratified and constructively
consistent programs, by relying on a 'conditional
fixpoint' procedure.
Provably-secure symmetric private information retrieval with quantum cryptography
Private information retrieval (PIR) is a database query protocol that
provides user privacy, in that the user can learn a particular entry of the
database of his interest but his query would be hidden from the data centre.
Symmetric private information retrieval (SPIR) takes PIR further by
additionally offering database privacy, where the user cannot learn any
additional entries of the database. Unconditionally secure SPIR solutions with
multiple databases are known classically, but are unrealistic because they
require long shared secret keys between the parties for secure communication
and shared randomness in the protocol. Here, we propose using quantum key
distribution (QKD) instead for a practical implementation, which can realise
both the secure communication and shared randomness requirements. We prove that
QKD maintains the security of the SPIR protocol and that it is also secure
against any external eavesdropper. We also show how such a classical-quantum
system could be implemented practically, using the example of a two-database
SPIR protocol with keys generated by measurement device-independent QKD.
Through key rate calculations, we show that such an implementation is feasible
at the metropolitan level with current QKD technology.
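For readers unfamiliar with multi-database PIR, the following is a minimal sketch of the textbook two-server XOR scheme, assuming two non-colluding servers that each hold the same bit database. It is not the protocol of this paper: it gives user privacy only (no database privacy, so it is not SPIR) and involves no QKD. All function names are hypothetical.

```python
import secrets

def server_answer(db, query):
    """A server returns the XOR of the database bits at the queried positions."""
    acc = 0
    for j in query:
        acc ^= db[j]
    return acc

def make_queries(n, i):
    """Pick a uniformly random subset for server 1; flip index i for server 2."""
    q1 = {j for j in range(n) if secrets.randbits(1)}
    q2 = q1 ^ {i}  # symmetric difference: add i if absent, remove it if present
    return q1, q2

def retrieve(db, i):
    """XOR the two answers; every position except i cancels out."""
    q1, q2 = make_queries(len(db), i)
    return server_answer(db, q1) ^ server_answer(db, q2)
```

Each query on its own is a uniformly random subset of indices, so neither server alone learns which entry the user wants.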
Hypothetical answers to continuous queries over data streams
Continuous queries over data streams may suffer from blocking operations
and/or unbounded waits, which may delay answers until some relevant input arrives
through the data stream. These delays may render answers obsolete by the time
they arrive, leaving users who sometimes have to make decisions with no help whatsoever.
Therefore, it can be useful to provide hypothetical answers - "given the
current information, it is possible that X will become true at time t" -
instead of no information at all.
In this paper we present a semantics for queries and corresponding answers
that covers such hypothetical answers, together with an online algorithm for
updating the set of facts that are consistent with the currently available
information.
Integrating and Ranking Uncertain Scientific Data
Mediator-based data integration systems resolve exploratory queries by joining data elements across sources. In the presence of uncertainties, such multiple expansions can quickly lead to spurious connections and incorrect results. The BioRank project investigates formalisms for modeling uncertainty during scientific data integration and for ranking uncertain query results. Our motivating application is protein function prediction. In this paper we show that: (i) explicit modeling of uncertainties as probabilities increases our ability to predict less-known or previously unknown functions (though it does not improve prediction of well-known ones). This suggests that probabilistic uncertainty models offer utility for scientific knowledge discovery; (ii) small perturbations in the input probabilities tend to produce only minor changes in the quality of our result rankings. This suggests that our methods are robust against slight variations in the way uncertainties are transformed into probabilities; and (iii) several techniques allow us to evaluate our probabilistic rankings efficiently. This suggests that probabilistic query evaluation is not as hard for real-world problems as theory indicates.