The complexity of acyclic conjunctive queries revisited
In this paper, we consider first-order logic over unary functions and study
the complexity of the evaluation problem for conjunctive queries described by
formulas of this kind. A natural notion of query acyclicity for this language
is introduced and we study the complexity of a large number of variants or
generalizations of acyclic query problems in that context (Boolean or not
Boolean, with or without inequalities, comparisons, etc.). Our main results
show that all those problems are \textit{fixed-parameter linear}, i.e., they can
be evaluated in time $f(|Q|) \cdot |\mathbf{db}| \cdot |Q(\mathbf{db})|$, where $|Q|$ is the
size of the query $Q$, $|\mathbf{db}|$ the database size, $|Q(\mathbf{db})|$ is
the size of the output, and $f$ is some function whose value depends on the
specific variant of the query problem (in some cases, $f$ is the identity
function). Our results have two kinds of consequences. First, they can be
easily translated to the relational (i.e., classical) setting. Previously known
bounds for some query problems are improved and new tractable cases are then
exhibited. Among others, as an immediate corollary, we improve a result of
\cite{PapadimitriouY-99} by showing that any (relational) acyclic conjunctive
query with inequalities can be evaluated in time
$f(|Q|) \cdot |\mathbf{db}| \cdot |Q(\mathbf{db})|$. A second consequence of our method is
that it provides a very natural descriptive approach to the complexity of
well-known algorithmic problems. A number of examples (such as acyclic subgraph
problems, multidimensional matching, etc.) are considered for which new
insights into their complexity are given.
Comment: 30 pages
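As a rough illustration of why acyclicity helps in the relational setting, the sketch below decides a Boolean path query with semijoin passes, in the style of Yannakakis's classical algorithm. It is not the method of the paper above, which works over unary functions. The query shape, the relation names `R`, `S`, `T`, and the helper names are assumptions chosen for the example.

```python
def semijoin(left, right, left_col, right_col):
    """Keep the tuples of `left` whose value in column `left_col`
    also occurs in column `right_col` of some tuple of `right`."""
    keys = {t[right_col] for t in right}          # one pass over `right`
    return [t for t in left if t[left_col] in keys]  # one pass over `left`


def boolean_path_query(R, S, T):
    """Decide whether there exist x, y, z, w with R(x, y), S(y, z), T(z, w).

    The query is acyclic (a path), so a bottom-up sequence of semijoins
    suffices, and each pass is linear in the sizes of its inputs
    (assuming constant-time hashing).
    """
    s_reduced = semijoin(S, T, 1, 0)          # S(y, z) with some T(z, _)
    r_reduced = semijoin(R, s_reduced, 1, 0)  # R(x, y) with some surviving S(y, _)
    return bool(r_reduced)


if __name__ == "__main__":
    R = [(1, 2), (5, 6)]
    S = [(2, 3), (6, 7)]
    T = [(3, 4)]
    print(boolean_path_query(R, S, T))
```

Each semijoin touches every tuple a constant number of times, which is the intuition behind bounds that are linear in the database size for a fixed query.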
Understanding the Complexity of Lifted Inference and Asymmetric Weighted Model Counting
In this paper we study lifted inference for the Weighted First-Order Model
Counting problem (WFOMC), which counts the assignments that satisfy a given
sentence in first-order logic (FOL); it has applications in Statistical
Relational Learning (SRL) and Probabilistic Databases (PDB). We present several
results. First, we describe a lifted inference algorithm that generalizes prior
approaches in SRL and PDB. Second, we provide a novel dichotomy result for a
non-trivial fragment of FO CNF sentences, showing that for each sentence the
WFOMC problem is either in PTIME or #P-hard in the size of the input domain; we
prove that, in the first case our algorithm solves the WFOMC problem in PTIME,
and in the second case it fails. Third, we present several properties of the
algorithm. Finally, we discuss limitations of lifted inference for symmetric
probabilistic databases (where the weights of ground literals depend only on
the relation name, and not on the constants of the domain), and prove the
impossibility of a dichotomy result for the complexity of probabilistic
inference for the entire language FOL.
Logics for Unranked Trees: An Overview
Labeled unranked trees are used as a model of XML documents, and logical
languages for them have been studied actively over the past several years. Such
logics have different purposes: some are better suited for extracting data,
some for expressing navigational properties, and some make it easy to relate
complex properties of trees to the existence of tree automata for those
properties. Furthermore, logics differ significantly in their model-checking
properties, their automata models, and their behavior on ordered and unordered
trees. In this paper we present a survey of logics for unranked trees.
Monadic Datalog Containment on Trees
We show that the query containment problem for monadic datalog on finite
unranked labeled trees can be solved in 2-fold exponential time when (a)
considering unordered trees using the axes child and descendant, and when (b)
considering ordered trees using the axes firstchild, nextsibling, child, and
descendant. When omitting the descendant-axis, we obtain that in both cases the
problem is EXPTIME-complete.
Comment: This article is the full version of an article published in the
proceedings of the 8th Alberto Mendelzon Workshop (AMW 2014).
Noise-Tolerant Learning, the Parity Problem, and the Statistical Query Model
We describe a slightly sub-exponential time algorithm for learning parity
functions in the presence of random classification noise. This results in a
polynomial-time algorithm for the case of parity functions that depend on only
the first O(log n log log n) bits of input. This is the first known instance of
an efficient noise-tolerant algorithm for a concept class that is provably not
learnable in the Statistical Query model of Kearns. Thus, we demonstrate that
the set of problems learnable in the statistical query model is a strict subset
of those problems learnable in the presence of noise in the PAC model.
In coding-theory terms, what we give is a poly(n)-time algorithm for decoding
linear k by n codes in the presence of random noise for the case of k = c log n
log log n for some c > 0. (The case of k = O(log n) is trivial since one can
just individually check each of the 2^k possible messages and choose the one
that yields the closest codeword.)
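The trivial case in the parenthetical above can be sketched as follows: enumerate all 2^k messages, encode each one, and keep the message whose codeword is closest in Hamming distance to the received word. This is only the brute-force baseline, not the sub-exponential algorithm of the paper. The function names and the representation of the generator matrix as a list of k rows of n bits are assumptions made for the example.

```python
from itertools import product


def encode(message, generator):
    """Encode a k-bit message with a k x n generator matrix over GF(2)."""
    n = len(generator[0])
    return [
        sum(m * row[j] for m, row in zip(message, generator)) % 2
        for j in range(n)
    ]


def brute_force_decode(received, generator):
    """Return (message, distance) for the codeword nearest to `received`.

    Tries all 2^k messages, so the running time is about 2^k * k * n,
    which is polynomial in n when k = O(log n).
    """
    k = len(generator)
    best, best_dist = None, None
    for message in product((0, 1), repeat=k):
        codeword = encode(message, generator)
        dist = sum(a != b for a, b in zip(codeword, received))
        if best_dist is None or dist < best_dist:
            best, best_dist = list(message), dist
    return best, best_dist


if __name__ == "__main__":
    G = [[1, 1, 1, 0, 0, 0],
         [0, 0, 0, 1, 1, 1]]
    noisy = [1, 0, 1, 0, 0, 0]  # codeword of [1, 0] with one bit flipped
    print(brute_force_decode(noisy, G))
```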
A natural extension of the statistical query model is to allow queries about
statistical properties that involve t-tuples of examples (as opposed to single
examples). The second result of this paper is to show that any class of
functions learnable (strongly or weakly) with t-wise queries for t = O(log n)
is also weakly learnable with standard unary queries. Hence this natural
extension to the statistical query model does not increase the set of weakly
learnable functions.
Circuit Complexity Meets Ontology-Based Data Access
Ontology-based data access is an approach to organizing access to a database
augmented with a logical theory. In this approach query answering proceeds
through a reformulation of a given query into a new one which can be answered
without any use of theory. Thus the problem reduces to the standard database
setting.
However, the size of the query may increase substantially during the
reformulation. In this survey we review a recently developed framework on
proving lower and upper bounds on the size of this reformulation by employing
methods and results from Boolean circuit complexity.
Comment: To appear in proceedings of CSR 2015, LNCS 9139, Springer.
On the Complexity of Existential Positive Queries
We systematically investigate the complexity of model checking the
existential positive fragment of first-order logic. In particular, for a set of
existential positive sentences, we consider model checking where the sentence
is restricted to fall into the set; a natural question is then to classify
which sentence sets are tractable and which are intractable. With respect to
fixed-parameter tractability, we give a general theorem that reduces this
classification question to the corresponding question for primitive positive
logic, for a variety of representations of structures. This general theorem
allows us to deduce that an existential positive sentence set having bounded
arity is fixed-parameter tractable if and only if each sentence is equivalent
to one in bounded-variable logic. We then use the lens of classical complexity
to study these fixed-parameter tractable sentence sets. We show that such a set
can be NP-complete, and consider the length needed by a translation from
sentences in such a set to bounded-variable logic; we prove superpolynomial
lower bounds on this length using the theory of compilability, obtaining an
interesting type of formula size lower bound. Overall, the tools, concepts, and
results of this article set the stage for the future consideration of the
complexity of model checking on more expressive logics.
Privacy-Preserving Secret Shared Computations using MapReduce
Data outsourcing allows data owners to keep their data at \emph{untrusted}
clouds that do not ensure the privacy of data and/or computations. One useful
framework for fault-tolerant data processing in a distributed fashion is
MapReduce, which was developed for \emph{trusted} private clouds. This paper
presents algorithms for data outsourcing based on Shamir's secret-sharing
scheme and for executing privacy-preserving SQL queries such as count,
selection including range selection, projection, and join while using MapReduce
as an underlying programming model. Our proposed algorithms prevent an
adversary from knowing the database or the query while also preventing
output-size and access-pattern attacks. Interestingly, our algorithms do not
involve the database owner in answering any query (the owner only creates and
distributes secret shares once), and hence the database owner also cannot learn
the query. Logically and experimentally, we evaluate the efficiency of the
algorithms on the following parameters: (\textit{i}) the number of
communication rounds (between a user and a server), (\textit{ii}) the total
amount of bit flow (between a user and a server), and (\textit{iii}) the
computational load at the user and the server.
Comment: IEEE Transactions on Dependable and Secure Computing, Accepted 01
Aug. 201
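For readers unfamiliar with the primitive this abstract builds on, here is a minimal sketch of Shamir's secret sharing over a prime field, including the additive property that lets servers sum shares without seeing the underlying values. It does not reproduce the paper's MapReduce algorithms or its query protocols. The field size `PRIME`, the function names, and the way a count is formed from 0/1 indicator values are all assumptions made for illustration.

```python
import secrets

# Illustrative field modulus (the Mersenne prime 2^61 - 1); an assumption.
PRIME = 2**61 - 1


def make_shares(secret, threshold, num_shares):
    """Split `secret` into `num_shares` points on a random polynomial of
    degree threshold - 1 whose constant term is the secret."""
    coeffs = [secret % PRIME] + [
        secrets.randbelow(PRIME) for _ in range(threshold - 1)
    ]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def reconstruct(shares):
    """Recover the secret by Lagrange interpolation at x = 0.
    Needs at least `threshold` shares with distinct x values."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * (-xj)) % PRIME
                den = (den * (xi - xj)) % PRIME
        # Modular inverse via Fermat's little theorem (PRIME is prime).
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret


def add_shares(a, b):
    """Pointwise sum of two share lists with the same x values.
    The result is a valid sharing of the sum of the two secrets."""
    return [(xa, (ya + yb) % PRIME) for (xa, ya), (_, yb) in zip(a, b)]


if __name__ == "__main__":
    # Hypothetical use: shares of 0/1 match indicators are added share-wise,
    # and only the total is reconstructed.
    indicators = [1, 0, 1]
    shared = [make_shares(b, threshold=3, num_shares=5) for b in indicators]
    total = shared[0]
    for s in shared[1:]:
        total = add_shares(total, s)
    print(reconstruct(total[:3]))
```

Any `threshold` of the shares suffice to reconstruct, while fewer reveal nothing about the secret, which is why each server can hold one share without learning the data.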