LIPIcs, Volume 251, ITCS 2023, Complete Volume
Algorithms and Certificates for Boolean CSP Refutation: "Smoothed is no harder than Random"
We present an algorithm for strongly refuting smoothed instances of all
Boolean CSPs. The smoothed model is a hybrid between worst-case and
average-case input models, where the input is an arbitrary instance of the CSP
with only the negation patterns of the literals re-randomized with some small
probability. For an n-variable smoothed instance of a k-arity CSP and any
δ ∈ (0,1), our algorithm runs in 2^{Õ(n^δ)} time, and succeeds with high
probability in bounding the optimum fraction of satisfiable constraints away
from 1, provided that the number of constraints is at least
Õ(n) · n^{(1-δ)(k/2-1)}. This matches, up to polylogarithmic factors in n, the
trade-off between running time and the number of constraints of the
state-of-the-art algorithms for refuting fully random instances of CSPs
[RRS17].
We also make a surprising new connection between our algorithm and even
covers in hypergraphs, which we use to positively resolve Feige's 2008
conjecture, an extremal combinatorics conjecture on the existence of even
covers in sufficiently dense hypergraphs that generalizes the well-known Moore
bound for the girth of graphs. As a corollary, we show that polynomial-size
refutation witnesses exist for arbitrary smoothed CSP instances with number of
constraints a polynomial factor below the "spectral threshold" of n^{k/2},
extending the celebrated result for random 3-SAT of Feige, Kim and Ofek
[FKO06].
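Operationally, the smoothed model keeps the adversarial clause structure and only re-randomizes literal signs. A minimal Python sketch, with a (variable, sign) clause encoding of our own (not from the paper):

```python
import random

def smooth_instance(clauses, p, seed=0):
    """Re-randomize each literal's negation pattern independently with
    probability p, leaving the clause/variable structure untouched.
    A sketch of the smoothed model described above; the clause encoding
    as (variable, sign) pairs with sign in {+1, -1} is ours."""
    rng = random.Random(seed)
    smoothed = []
    for clause in clauses:
        new_clause = []
        for var, sign in clause:
            if rng.random() < p:
                sign = rng.choice([+1, -1])  # fresh uniformly random sign
            new_clause.append((var, sign))
        smoothed.append(new_clause)
    return smoothed

# A worst-case 3-SAT instance chosen by an adversary...
instance = [[(0, +1), (1, -1), (2, +1)], [(0, -1), (1, +1), (3, -1)]]
# ...with only the negation patterns slightly re-randomized.
print(smooth_instance(instance, p=0.1))
```

At p = 1 every sign is uniform and the model degenerates to the fully random negation pattern; at p = 0 it is the worst-case instance itself.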
A simple and sharper proof of the hypergraph Moore bound
The hypergraph Moore bound is an elegant statement that characterizes the
extremal trade-off between the girth - the number of hyperedges in the smallest
cycle, or even cover (a subhypergraph with all degrees even) - and the size -
the total number of hyperedges in a hypergraph. For graphs (i.e., 2-uniform
hypergraphs), a bound tight up to the leading constant was proven in a
classical work of Alon, Hoory and Linial [AHL02]. For hypergraphs of uniformity
k > 2, an appropriate generalization was conjectured by Feige [Fei08]. The
conjecture was settled up to an additional polylogarithmic factor in the size
in a recent work of Guruswami, Kothari and Manohar [GKM21]. Their argument
relies on a connection between the existence of short even covers and the
spectrum of a certain randomly signed Kikuchi matrix. Their analysis,
especially for the case of odd , is significantly complicated.
In this work, we present a substantially simpler and shorter proof of the
hypergraph Moore bound. Our key idea is the use of a new reweighted Kikuchi
matrix and an edge deletion step that allows us to drop several involved steps
in [GKM21]'s analysis such as combinatorial bucketing of rows of the Kikuchi
matrix and the use of the Schudy-Sviridenko polynomial concentration. Our
simpler proof also obtains tighter parameters: in particular, the argument
gives a new proof of the classical Moore bound of [AHL02] with no loss (the
proof in [GKM21] loses a polylogarithmic factor), and loses only a single
logarithmic factor for all k-uniform hypergraphs.
As in [GKM21], our ideas naturally extend to yield a simpler proof of the
full trade-off for strongly refuting smoothed instances of constraint
satisfaction problems with similarly improved parameters.
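The even covers at the heart of both abstracts above are simple objects to check. A stdlib sketch of the definition (our own helper, not code from either paper):

```python
from collections import Counter

def is_even_cover(hyperedges):
    """Return True if the nonempty collection of hyperedges is an even
    cover: every vertex touches an even number of the chosen edges,
    i.e., the spanned subhypergraph has all degrees even.
    A hypothetical helper for illustration, not code from [GKM21]."""
    if not hyperedges:
        return False
    degree = Counter(v for edge in hyperedges for v in edge)
    return all(d % 2 == 0 for d in degree.values())

# For graphs (2-uniform hypergraphs) a cycle is an even cover.
print(is_even_cover([(1, 2), (2, 3), (3, 4), (4, 1)]))  # True
# Two identical 3-uniform hyperedges form a (degenerate) even cover.
print(is_even_cover([(1, 2, 3), (1, 2, 3)]))            # True
print(is_even_cover([(1, 2, 3), (3, 4, 5)]))            # False: odd degrees
```

The Moore-bound question is then extremal: how many hyperedges force a short even cover to exist?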
Improper Learning by Refuting
The sample complexity of learning a Boolean-valued function class is precisely characterized by its Rademacher complexity. This has little bearing, however, on the sample complexity of efficient agnostic learning.
We introduce refutation complexity, a natural computational analog of the Rademacher complexity of a Boolean concept class, and show that it exactly characterizes the sample complexity of efficient agnostic learning. Informally, the refutation complexity of a class C is the minimum number of example-label pairs required to efficiently distinguish between the case that the labels correlate with the evaluation of some member of C (structure) and the case where the labels are i.i.d. Rademacher random variables (noise). The easy direction of this relationship was implicitly used in the recent framework for improper PAC learning lower bounds of Daniely and co-authors, via connections to the hardness of refuting random constraint satisfaction problems. Our work can be seen as making the relationship between agnostic learning and refutation implicit in their work into an explicit equivalence.
In a recent, independent work, Salil Vadhan discovered a similar relationship between refutation and PAC learning in the realizable (i.e., noiseless) case.
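The structure-vs-noise distinguishing task can be made concrete for a toy concept class small enough to enumerate by brute force. A sketch with hypothetical names of our own; the whole point of the paper is that the interesting classes are far too large for this enumeration:

```python
import random

def best_correlation(examples, labels, concept_class):
    """Largest |empirical correlation| between the labels and any
    member of an explicitly enumerable concept class: a brute-force
    analog of the structure-vs-noise distinguishing task above.
    Illustrative only; efficient refutation must avoid enumeration."""
    m = len(labels)
    return max(
        abs(sum(c(x) * y for x, y in zip(examples, labels))) / m
        for c in concept_class
    )

# Toy class C: the n dictator functions x -> x[i] on {-1, +1}^n.
n, m = 5, 400
dictators = [lambda x, i=i: x[i] for i in range(n)]
rng = random.Random(0)
xs = [[rng.choice([-1, 1]) for _ in range(n)] for _ in range(m)]

structured = [x[0] for x in xs]                  # labels from a member of C
noise = [rng.choice([-1, 1]) for _ in range(m)]  # i.i.d. Rademacher labels
print(best_correlation(xs, structured, dictators))  # 1.0
print(best_correlation(xs, noise, dictators))       # small: no consistent correlation
```

With enough samples the noise case concentrates near zero correlation, so thresholding this statistic distinguishes the two cases; refutation complexity asks how few samples suffice when the distinguisher must also be efficient.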
Sparser Random 3SAT Refutation Algorithms and the Interpolation Problem: Extended Abstract
We formalize a combinatorial principle, called the 3XOR principle, due to Feige, Kim and Ofek [12], as a family of unsatisfiable propositional formulas for which refutations of small size in any propositional proof system that possesses the feasible interpolation property imply an efficient deterministic refutation algorithm for random 3SAT with n variables and Ω(n^1.4) clauses. Such small-size refutations would improve the state-of-the-art (with respect to the clause density) efficient refutation algorithm, which works only for Ω(n^1.5) many clauses [13]. We demonstrate polynomial-size refutations of the 3XOR principle in resolution operating with disjunctions of quadratic equations with small integer coefficients, denoted R(quad); this is a weak extension of cutting planes with small coefficients. We show that R(quad) is weakly automatizable iff R(lin) is weakly automatizable, where R(lin) is similar to R(quad) but with linear instead of quadratic equations (introduced in [25]). This reduces the problem of refuting random 3CNF with n variables and Ω(n^1.4) clauses to the interpolation problem of R(quad) and to the weak automatizability of R(lin).
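Each 3XOR constraint is a linear equation over GF(2), so a system of such constraints can be decided, and its unsatisfiability certified, by Gaussian elimination; this linear-algebra view underlies the 3XOR principle. A stdlib sketch of ours, not the R(lin)/R(quad) proof systems themselves:

```python
def xor_system_satisfiable(equations):
    """Decide a system of XOR constraints by online Gaussian
    elimination over GF(2). Each equation is (vars, rhs): the XOR of
    the listed variables must equal rhs (0 or 1). Deriving the row
    0 = 1 is a refutation of the whole system.
    A sketch of the linear-algebra view of 3XOR; this is not the
    proof systems R(lin)/R(quad) discussed above."""
    pivots = {}  # pivot bit -> (row bitmask, rhs)
    for vars_, rhs in equations:
        mask = 0
        for v in vars_:
            mask ^= 1 << v  # row as a bitmask; XOR cancels repeated vars
        # reduce against existing pivot rows until no pivot bit remains
        reduced = True
        while reduced:
            reduced = False
            for bit, (pmask, prhs) in pivots.items():
                if mask >> bit & 1:
                    mask ^= pmask
                    rhs ^= prhs
                    reduced = True
        if mask == 0:
            if rhs == 1:
                return False  # contradiction 0 = 1: unsatisfiable
        else:
            pivots[mask.bit_length() - 1] = (mask, rhs)
    return True

# x0^x1^x2 = 0, x0^x1^x3 = 1, x2^x3 = 0: XORing all three yields 0 = 1.
print(xor_system_satisfiable([((0, 1, 2), 0), ((0, 1, 3), 1), ((2, 3), 0)]))  # False
```

The hardness in the abstract comes from 3SAT, where each clause only partially constrains its variables; elimination decides pure XOR systems in polynomial time.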
Smoothed Analysis on Connected Graphs
The main paradigm of smoothed analysis on graphs suggests that for any large graph G in a certain class of graphs, slightly perturbing the edges of G at random (usually adding few random edges to G) typically results in a graph having much "nicer" properties. In this work we study smoothed analysis on trees or, equivalently, on connected graphs. Given an n-vertex connected graph G, form a random supergraph G* of G by turning every pair of vertices of G into an edge with probability epsilon/n, where epsilon is a small positive constant. This perturbation model has been studied previously in several contexts, including smoothed analysis, small world networks, and combinatorics.
Connected graphs can be bad expanders, can have very large diameter, and may contain no long paths. In contrast, we show that if G is an n-vertex connected graph then typically G* has edge expansion Omega(1/(log n)), diameter O(log n), vertex expansion Omega(1/(log n)), and contains a path of length Omega(n), where for the last two properties we additionally assume that G has bounded maximum degree. Moreover, we show that if G has bounded degeneracy, then typically the mixing time of the lazy random walk on G* is O(log^2(n)). All these results are asymptotically tight.
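The perturbation model is easy to simulate, and the diameter drop is visible already at modest sizes. A stdlib sketch (illustrative only; the adjacency-dict encoding and helper names are ours):

```python
import random
from collections import deque

def perturb(adj, eps, seed=0):
    """Form the random supergraph G* described above: every pair of
    vertices becomes an edge independently with probability eps/n
    (adding an existing edge is a no-op). Graphs are dicts mapping a
    vertex to its set of neighbours; encoding and names are ours."""
    rng = random.Random(seed)
    n = len(adj)
    star = {u: set(nbrs) for u, nbrs in adj.items()}
    verts = list(adj)
    for i, u in enumerate(verts):
        for v in verts[i + 1:]:
            if rng.random() < eps / n:
                star[u].add(v)
                star[v].add(u)
    return star

def diameter(adj):
    """Exact diameter via a BFS from every vertex (fine for small n)."""
    best = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        best = max(best, max(dist.values()))
    return best

# A path on n vertices is connected but has the largest possible diameter.
n = 200
path = {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}
print(diameter(path))                    # 199
print(diameter(perturb(path, eps=1.0)))  # typically O(log n), far below 199
```

Since eps/n per pair gives about eps*n/2 extra edges in expectation, the perturbation is very mild, yet it typically collapses the path's linear diameter to logarithmic, matching the theorem above.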