Certifying solution geometry in random CSPs: counts, clusters and balance
An active topic in the study of random constraint satisfaction problems
(CSPs) is the geometry of the space of satisfying or almost satisfying
assignments as a function of the density, for which a precise landscape of
predictions has been made via statistical physics-based heuristics. In
parallel, there has been a recent flurry of work on refuting random constraint
satisfaction problems, via nailing refutation thresholds for spectral and
semidefinite programming-based algorithms, and also on counting solutions to
CSPs. Inspired by this, the starting point for our work is the following
question: what does the solution space for a random CSP look like to an
efficient algorithm?
In pursuit of this inquiry, we focus on the following problems about random
Boolean CSPs at the densities where they are unsatisfiable but no refutation
algorithm is known.
1. Counts. For every Boolean CSP we give algorithms that with high
probability certify a subexponential upper bound on the number of solutions. We
also give algorithms to certify a bound on the number of large cuts in a
Gaussian-weighted graph, and the number of large independent sets in a random
d-regular graph.
2. Clusters. For Boolean CSPs we give algorithms that with high
probability certify an upper bound on the number of clusters of solutions.
3. Balance. We also give algorithms that with high probability certify that
there are no "unbalanced" solutions, i.e., solutions where the fraction of
1s deviates significantly from 1/2.
Finally, we also provide hardness evidence suggesting that our algorithms for
counting are optimal.
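To fix ideas, "certifying" here carries the usual two-sided guarantee: the output bound must be valid on every instance, while being nontrivial with high probability on random instances at the relevant densities. A schematic for the counting task, in our own notation rather than the paper's:

    \text{Soundness: for every instance } I,\quad \mathcal{A}(I) \;\ge\; \#\{\text{solutions of } I\};
    \text{Utility: for a random } I \text{ at the stated density},\quad \Pr\big[\mathcal{A}(I) \le \exp(o(n))\big] = 1 - o(1).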
CSP-Completeness And Its Applications
We build on ideas previously used to study reductions between CSP-refutation problems and improper learning, and between CSP-refutation problems themselves, to expand several hardness results that rest on the assumption that refuting random CSP instances is hard for certain choices of predicates (such as k-SAT). First, we argue the hardness of the fundamental problem of learning conjunctions in a one-sided PAC-style learning model that has appeared in several forms over the years. In this model the goal is to produce a hypothesis that foremost guarantees a small false-positive rate while minimizing the false-negative rate over such hypotheses. Further, we formalize a notion of CSP-refutation reductions and of CSP-refutation completeness, and use these, along with candidate CSP-refutation-complete predicates, to provide further evidence for the hardness of several problems.
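The one-sided model is easiest to see through the classic elimination learner for conjunctions, which on realizable data never errs on negatives; the hardness result above concerns the much harder task of also keeping the false-negative rate small. A minimal sketch in Python (our illustration, not the paper's construction):

    def learn_conjunction(positives, n):
        # Most-specific conjunction consistent with the positive examples:
        # start from all 2n literals and delete each literal falsified by
        # some positive example. If the target is a conjunction, every point
        # the hypothesis accepts is accepted by the target, so the
        # false-positive rate is zero; the false-negative rate is the
        # quantity left to minimize.
        literals = {(i, b) for i in range(n) for b in (True, False)}
        for x in positives:
            literals = {(i, b) for (i, b) in literals if x[i] == b}
        return lambda x: all(x[i] == b for (i, b) in literals)

    # usage: target is x0 AND NOT x2 over n = 3 variables
    h = learn_conjunction([[True, True, False], [True, False, False]], 3)
    assert h([True, False, False]) and not h([False, True, False])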
Pseudo-contractions as Gentle Repairs
Updating a knowledge base to remove an unwanted consequence is a challenging task. Some of the original sentences must be either deleted or weakened in such a way that the sentence to be removed is no longer entailed by the resulting set. On the other hand, it is desirable that the existing knowledge be preserved as much as possible, minimising the loss of information. Several approaches to this problem can be found in the literature. In particular, when the knowledge is represented by an ontology, two different families of frameworks have been developed over the past decades with numerous ideas in common but with little interaction between the communities: applications of AGM-like Belief Change and justification-based Ontology Repair. In this paper, we investigate the relationship between pseudo-contraction operations and gentle repairs. Both aim to avoid the complete deletion of sentences when replacing them with weaker versions is enough to prevent the entailment of the unwanted formula. We show the correspondence between concepts on both sides and investigate under which conditions they are equivalent. Furthermore, we propose a unified notation for the two approaches, which might contribute to the integration of the two areas.
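A toy example (ours, not from the paper) makes the contrast concrete. To retract the unwanted consequence q from the base {p ∧ q}:

    delete the sentence:       {}      -- q is gone, but p is lost as well
    pseudo-contract / weaken:  {p}     -- q is no longer entailed, p survives

Gentle repairs achieve the same effect on the justification side: instead of removing an axiom that occurs in every justification of q, they replace it with a weakened version that breaks the entailment.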
On the Complexity of Random Satisfiability Problems with Planted Solutions
The problem of identifying a planted assignment given a random k-SAT
formula consistent with the assignment exhibits a large algorithmic gap: while
the planted solution becomes unique and can be identified given a formula with
O(n log n) clauses, there are distributions over clauses for which the best
known efficient algorithms require n^{k/2} clauses. We propose and study a
unified model for planted k-SAT, which captures well-known special cases. An
instance is described by a planted assignment σ and a distribution on
clauses with k literals. We define its distribution complexity as the largest
r for which the distribution is not r-wise independent (1 ≤ r ≤ k for
any distribution with a planted assignment).
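Concretely, the distribution complexity can be read off the Fourier coefficients of the clause distribution Q over {-1,+1}^k (each coordinate recording whether a literal agrees with the planted assignment): it is the first level r at which r-wise independence fails, i.e., the lowest-order nonzero coefficient. A brute-force sketch for tiny k (our code; the dictionary encoding of Q is an assumption):

    from itertools import combinations, product
    from math import prod

    def fourier_coeff(Q, S):
        # \hat{Q}(S) = E_{x ~ Q}[ prod_{i in S} x_i ], with Q a dict
        # mapping patterns in {-1,+1}^k to probabilities
        return sum(p * prod(x[i] for i in S) for x, p in Q.items())

    def distribution_complexity(Q, k, eps=1e-12):
        # first level r at which r-wise independence fails
        for r in range(1, k + 1):
            if any(abs(fourier_coeff(Q, S)) > eps
                   for S in combinations(range(k), r)):
                return r
        return None  # Q is uniform: nothing planted to detect

    # planted 3-SAT (patterns satisfying a clause): complexity r = 1
    sat = {x: 1/7 for x in product((-1, 1), repeat=3) if x != (-1, -1, -1)}
    # planted 3-XOR (patterns with even parity): complexity r = 3
    xor = {x: 1/4 for x in product((-1, 1), repeat=3) if prod(x) == 1}
    assert distribution_complexity(sat, 3) == 1
    assert distribution_complexity(xor, 3) == 3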
Our main result is an unconditional lower bound, tight up to logarithmic
factors, of n^{r/2} clauses for statistical (query) algorithms [Kearns 1998,
Feldman et al. 2012], matching the known upper bound, which, as we show, can be
implemented using a statistical algorithm. Since known approaches for problems
over distributions have statistical analogues (spectral, MCMC, gradient-based,
convex optimization, etc.), this lower bound provides a rigorous explanation of
the observed algorithmic gap. The proof introduces a new general technique for
the analysis of statistical query algorithms. It also points to a geometric
paring phenomenon in the space of all planted assignments.
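To illustrate the statistical (query) framework the lower bound speaks to: such an algorithm sees the clause distribution only through noisy expectations. The sketch below (our code; the oracle interface and clause encoding are assumptions, not the paper's) recovers the planted assignment by per-variable majority vote, which works precisely when the distribution complexity is r = 1:

    import random

    class StatOracle:
        # STAT(tau)-style oracle: returns E_{c ~ D}[f(c)] up to additive tau
        def __init__(self, clauses, tau):
            self.clauses, self.tau = clauses, tau
        def query(self, f):  # f maps a clause to a value in [-1, 1]
            exact = sum(f(c) for c in self.clauses) / len(self.clauses)
            return exact + random.uniform(-self.tau, self.tau)

    def majority_recover(oracle, n, k):
        # Clauses are k-tuples of signed variable indices (+i / -i). When
        # the level-1 Fourier coefficients of the clause distribution are
        # nonzero (distribution complexity r = 1), each variable's expected
        # occurrence sign is biased toward its planted value, so n queries
        # with tolerance below that bias recover the assignment.
        sigma = {}
        for i in range(1, n + 1):
            f = lambda c, i=i: sum(1 if l == i else -1 if l == -i else 0
                                   for l in c) / k
            sigma[i] = 1 if oracle.query(f) >= 0 else -1
        return sigma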
We describe consequences of our lower bounds for Feige's refutation hypothesis
[Feige 2002] and for lower bounds on general convex programs that solve planted
k-SAT. Our bounds also extend to other planted k-CSP models and, in
particular, provide concrete evidence for the security of Goldreich's one-way
function and the associated pseudorandom generator when used with a
sufficiently hard predicate [Goldreich 2000].
Subsampled Power Iteration: a Unified Algorithm for Block Models and Planted CSPs
We present an algorithm for recovering planted solutions in two well-known
models, the stochastic block model and planted constraint satisfaction
problems, via a common generalization in terms of random bipartite graphs. Our
algorithm matches, up to a constant factor, the best-known bounds on the number
of edges (or constraints) needed for perfect recovery, and its running time is
linear in the number of edges used. Its time complexity is significantly better
than that of both spectral and SDP-based approaches.
The main contribution of the algorithm is in the case of unequal sizes in the
bipartition (corresponding to odd uniformity in the CSP). Here our algorithm
succeeds at a significantly lower density than the spectral approaches,
surpassing a barrier based on the spectral norm of a random matrix.
Other significant features of the algorithm and analysis include: (i) the
critical use of power iteration with subsampling, which might be of independent
interest and whose analysis requires keeping track of multiple norms of an
evolving solution (see the sketch below); (ii) the algorithm can be implemented
statistically, i.e., with very limited access to the input distribution; and
(iii) it is extremely simple to implement and runs in linear time, and is thus
practical even for very large instances.
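A minimal sketch of the subsampling idea (our code, heavily simplified: a symmetric weighted graph instead of the paper's bipartite setting, equal batch sizes, and none of the multi-norm bookkeeping): each multiplication uses a fresh disjoint batch of edges, keeping every step independent of the evolving iterate.

    import numpy as np

    def subsampled_power_iteration(edges, n, T, rng):
        # edges: list of (i, j, w); n: number of vertices; T: iterations.
        # Each iteration applies the adjacency operator restricted to a
        # fresh batch of edges, then renormalizes, so the randomness used
        # in a step is independent of the current iterate.
        rng.shuffle(edges)
        batches = np.array_split(np.arange(len(edges)), T)
        x = rng.standard_normal(n)
        x /= np.linalg.norm(x)
        for batch in batches:
            y = np.zeros(n)
            for t in batch:
                i, j, w = edges[t]
                y[i] += w * x[j]  # one sparse matrix-vector product,
                y[j] += w * x[i]  # restricted to this batch
            x = y / (np.linalg.norm(y) + 1e-12)
        return x  # estimate of the planted/top eigendirection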