35,724 research outputs found
Engineering an Efficient PB-XOR Solver
Despite the NP-completeness of Boolean satisfiability, modern SAT solvers are routinely able to handle large practical instances, and consequently have found wide ranging applications. The primary workhorse behind the success of SAT solvers is the widely acclaimed Conflict Driven Clause Learning (CDCL) paradigm, which was originally proposed in the context of Boolean formulas in CNF. The wide ranging applications of SAT solvers have highlighted that for several domains, CNF is not a natural representation and the reliance of modern SAT solvers on resolution proof system limit their ability to efficiently solve several families of constraints. Consequently, the past decade has witnessed the design of solvers with native support for constraints such as Pseudo-Boolean (PB) and CNF-XOR.
The primary contribution of our work is an efficient solver engineered for PB-XOR formulas, i.e., formulas consisting of a conjunction of PB and XOR constraints. We first observe that a simple adaption of CNF-XOR architecture does not provide an improvement over baseline; our analysis highlights the need for careful engineering of the order of propagations. To this end, we propose three different tactics, all of which achieve significant performance improvements over the baseline. Our work is motivated by applications arising from binarized neural network verification where the verification of properties such as robustness, fairness, trojan attacks can be reduced to model counting queries; the state of the art model counters reduce counting to polynomially many SAT queries over the original formula conjuncted with randomly generated XOR constraints. To this end, we augment ApproxMC with LinPB and we call the resulting counter as ApproxMCPB. In an extensive empirical comparison over 1076 benchmarks, we observe that ApproxMCPB can solve 912 instances while the baseline version of ApproxMC4 (augmented with CryptoMiniSat) can solve only 802 instances
Global Numerical Constraints on Trees
We introduce a logical foundation to reason on tree structures with
constraints on the number of node occurrences. Related formalisms are limited
to express occurrence constraints on particular tree regions, as for instance
the children of a given node. By contrast, the logic introduced in the present
work can concisely express numerical bounds on any region, descendants or
ancestors for instance. We prove that the logic is decidable in single
exponential time even if the numerical constraints are in binary form. We also
illustrate the usage of the logic in the description of numerical constraints
on multi-directional path queries on XML documents. Furthermore, numerical
restrictions on regular languages (XML schemas) can also be concisely described
by the logic. This implies a characterization of decidable counting extensions
of XPath queries and XML schemas. Moreover, as the logic is closed under
negation, it can thus be used as an optimal reasoning framework for testing
emptiness, containment and equivalence
Bit-Vector Model Counting using Statistical Estimation
Approximate model counting for bit-vector SMT formulas (generalizing \#SAT)
has many applications such as probabilistic inference and quantitative
information-flow security, but it is computationally difficult. Adding random
parity constraints (XOR streamlining) and then checking satisfiability is an
effective approximation technique, but it requires a prior hypothesis about the
model count to produce useful results. We propose an approach inspired by
statistical estimation to continually refine a probabilistic estimate of the
model count for a formula, so that each XOR-streamlined query yields as much
information as possible. We implement this approach, with an approximate
probability model, as a wrapper around an off-the-shelf SMT solver or SAT
solver. Experimental results show that the implementation is faster than the
most similar previous approaches which used simpler refinement strategies. The
technique also lets us model count formulas over floating-point constraints,
which we demonstrate with an application to a vulnerability in differential
privacy mechanisms
An Improved Private Mechanism for Small Databases
We study the problem of answering a workload of linear queries ,
on a database of size at most drawn from a universe
under the constraint of (approximate) differential privacy.
Nikolov, Talwar, and Zhang~\cite{NTZ} proposed an efficient mechanism that, for
any given and , answers the queries with average error that is
at most a factor polynomial in and
worse than the best possible. Here we improve on this guarantee and give a
mechanism whose competitiveness ratio is at most polynomial in and
, and has no dependence on . Our mechanism
is based on the projection mechanism of Nikolov, Talwar, and Zhang, but in
place of an ad-hoc noise distribution, we use a distribution which is in a
sense optimal for the projection mechanism, and analyze it using convex duality
and the restricted invertibility principle.Comment: To appear in ICALP 2015, Track
Optimizing Batch Linear Queries under Exact and Approximate Differential Privacy
Differential privacy is a promising privacy-preserving paradigm for
statistical query processing over sensitive data. It works by injecting random
noise into each query result, such that it is provably hard for the adversary
to infer the presence or absence of any individual record from the published
noisy results. The main objective in differentially private query processing is
to maximize the accuracy of the query results, while satisfying the privacy
guarantees. Previous work, notably \cite{LHR+10}, has suggested that with an
appropriate strategy, processing a batch of correlated queries as a whole
achieves considerably higher accuracy than answering them individually.
However, to our knowledge there is currently no practical solution to find such
a strategy for an arbitrary query batch; existing methods either return
strategies of poor quality (often worse than naive methods) or require
prohibitively expensive computations for even moderately large domains.
Motivated by this, we propose low-rank mechanism (LRM), the first practical
differentially private technique for answering batch linear queries with high
accuracy. LRM works for both exact (i.e., -) and approximate (i.e.,
(, )-) differential privacy definitions. We derive the
utility guarantees of LRM, and provide guidance on how to set the privacy
parameters given the user's utility expectation. Extensive experiments using
real data demonstrate that our proposed method consistently outperforms
state-of-the-art query processing solutions under differential privacy, by
large margins.Comment: ACM Transactions on Database Systems (ACM TODS). arXiv admin note:
text overlap with arXiv:1212.230
Boosting the Accuracy of Differentially-Private Histograms Through Consistency
We show that it is possible to significantly improve the accuracy of a
general class of histogram queries while satisfying differential privacy. Our
approach carefully chooses a set of queries to evaluate, and then exploits
consistency constraints that should hold over the noisy output. In a
post-processing phase, we compute the consistent input most likely to have
produced the noisy output. The final output is differentially-private and
consistent, but in addition, it is often much more accurate. We show, both
theoretically and experimentally, that these techniques can be used for
estimating the degree sequence of a graph very precisely, and for computing a
histogram that can support arbitrary range queries accurately.Comment: 15 pages, 7 figures, minor revisions to previous versio
On the Complexity of Query Result Diversification
Query result diversification is a bi-criteria optimization problem for ranking query results. Given a database D, a query Q and a positive integer k, it is to find a set of k tuples from Q(D) such that the tuples are as relevant as possible to the query, and at the same time, as diverse as possible to each other. Subsets of Q(D) are ranked by an objective function defined in terms of relevance and diversity. Query result diversification has found a variety of applications in databases, information retrieval and operations research. This paper studies the complexity of result diversification for relational queries. We identify three problems in connection with query result diversification, to determine whether there exists a set of k tuples that is ranked above a bound with respect to relevance and diversity, to assess the rank of a given k-element set, and to count how many k-element sets are ranked above a given bound. We study these problems for a variety of query languages and for three objective functions. We establish the upper and lower bounds of these problems, all matching, for both combined complexity and data complexity. We also investigate several special settings of these problems, identifying tractable cases. 1
- …