LIPIcs, Volume 251, ITCS 2023, Complete Volume
Properly Learning Decision Trees with Queries Is NP-Hard
We prove that it is NP-hard to properly PAC learn decision trees with
queries, resolving a longstanding open problem in learning theory (Bshouty
1993; Guijarro-Lavin-Raghavan 1999; Mehta-Raghavan 2002; Feldman 2016). While
there has been a long line of work, dating back to (Pitt-Valiant 1988),
establishing the hardness of properly learning decision trees from random
examples, the more challenging setting of query learners necessitates different
techniques, and there were no previous lower bounds. En route to our main
result, we simplify and strengthen the best known lower bounds for a different
problem of Decision Tree Minimization (Zantema-Bodlaender 2000; Sieling 2003).
On a technical level, we introduce the notion of hardness distillation, which
we study for decision tree complexity but can be considered for any complexity
measure: for a function that requires large decision trees, we give a general
method for identifying a small set of inputs that is responsible for its
complexity. Our technique even rules out query learners that are allowed
constant error. This contrasts with existing lower bounds for the setting of
random examples, which hold only for inverse-polynomial error.
Our result, taken together with a recent almost-polynomial time query
algorithm for properly learning decision trees under the uniform distribution
(Blanc-Lange-Qiao-Tan 2022), demonstrates the dramatic impact of distributional
assumptions on the problem.

Comment: 41 pages, 10 figures, FOCS 202
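The decision-tree complexity at the center of this abstract can be made concrete with a small brute-force computation. A minimal sketch, assuming a toy truth-table representation (the names are my own, and this exponential-time search is for intuition only, not the paper's technique):

```python
from functools import lru_cache

def min_leaves(tt):
    """Minimum number of leaves of any decision tree computing the
    Boolean function whose truth table is `tt` (a tuple of length 2^n,
    indexed by inputs x, with bit i of x giving variable x_i)."""
    @lru_cache(maxsize=None)
    def solve(t):
        if all(b == t[0] for b in t):   # constant subfunction: one leaf
            return 1
        m = len(t).bit_length() - 1     # number of surviving variables
        best = len(t)                   # trivial tree: one leaf per input
        for i in range(m):              # try querying each variable x_i
            lo = tuple(t[x] for x in range(len(t)) if not (x >> i) & 1)
            hi = tuple(t[x] for x in range(len(t)) if (x >> i) & 1)
            best = min(best, solve(lo) + solve(hi))
        return best
    return solve(tuple(tt))
```

For example, AND of two variables needs 3 leaves, while 2-bit parity needs a full tree with 4; Decision Tree Minimization asks for exactly this quantity, which is what makes its hardness relevant here.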
Algorithms and Certificates for Boolean CSP Refutation: "Smoothed is no harder than Random"
We present an algorithm for strongly refuting smoothed instances of all
Boolean CSPs. The smoothed model is a hybrid between worst and average-case
input models, where the input is an arbitrary instance of the CSP with only the
negation patterns of the literals re-randomized with some small probability.
For an n-variable smoothed instance of a k-arity CSP and any trade-off
parameter ℓ, our algorithm runs in 2^{Õ(ℓ)} time, and succeeds with high
probability in bounding the optimum fraction of satisfiable constraints away
from 1, provided that the number of constraints is at least
Õ(n) · (n/ℓ)^{k/2−1}. This
matches, up to polylogarithmic factors in n, the trade-off between running
time and the number of constraints of the state-of-the-art algorithms for
refuting fully random instances of CSPs [RRS17].
We also make a surprising new connection between our algorithm and even
covers in hypergraphs, which we use to positively resolve Feige's 2008
conjecture, an extremal combinatorics conjecture on the existence of even
covers in sufficiently dense hypergraphs that generalizes the well-known Moore
bound for the girth of graphs. As a corollary, we show that polynomial-size
refutation witnesses exist for arbitrary smoothed CSP instances with number of
constraints a polynomial factor below the "spectral threshold" of n^{k/2},
extending the celebrated result for random 3-SAT of Feige, Kim and Ofek
[FKO06].
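The even covers behind the second result have a clean linear-algebraic meaning: a nonempty set of hyperedges in which every vertex occurs an even number of times is exactly a linear dependency among the edges' incidence vectors over GF(2). A minimal sketch of that correspondence (function names are my own; this naive elimination says nothing about the density thresholds in Feige's conjecture):

```python
def find_even_cover(edges):
    """Look for an even cover: a nonempty subset of hyperedges in which
    every vertex appears an even number of times.  Over GF(2) this is
    exactly a linear dependency among the edges' incidence vectors."""
    basis = {}                       # pivot bit -> (vector, edge combination)
    for j, e in enumerate(edges):
        vec = 0
        for v in e:                  # incidence vector: bit v <-> vertex v
            vec |= 1 << v
        combo = 1 << j               # which original edges built `vec`
        while vec:
            p = vec.bit_length() - 1
            if p not in basis:       # new pivot: store and move on
                basis[p] = (vec, combo)
                break
            bvec, bcombo = basis[p]  # reduce by the stored basis vector
            vec, combo = vec ^ bvec, combo ^ bcombo
        if vec == 0:                 # dependency found: these edges cancel
            return [i for i in range(len(edges)) if (combo >> i) & 1]
    return None                      # independent vectors: no even cover
```

On a triangle viewed as a 2-uniform hypergraph, all three edges together form an even cover (every vertex appears twice), while any two edges alone do not.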
LIPIcs, Volume 261, ICALP 2023, Complete Volume
Black-Box Constructive Proofs Are Unavoidable
Following Razborov and Rudich, a "natural property" for proving a circuit lower bound satisfies three axioms: constructivity, largeness, and usefulness. In 2013, Williams proved that for any reasonable circuit class C, NEXP ⊄ C is equivalent to the existence of a constructive property useful against C. Here, a property is constructive if it can be decided in poly(N) time, where N = 2^n is the length of the truth-table of the given n-input function.
Recently, Fan, Li, and Yang initiated the study of black-box natural properties, which require a much stronger notion of constructivity, called black-box constructivity: the property should be decidable in randomized polylog(N) time, given oracle access to the n-input function. They showed that most proofs based on random restrictions yield black-box natural properties, and demonstrated limitations on what black-box natural properties can prove.
In this paper, perhaps surprisingly, we prove that the equivalence of Williams holds even with this stronger notion of black-box constructivity: for any reasonable circuit class C, NEXP ⊄ C is equivalent to the existence of a black-box constructive property useful against C. The main technical ingredient in proving this equivalence is a smooth, strong, and locally-decodable probabilistically checkable proof (PCP), which we construct based on a recent work by Paradise. As a by-product, we show that average-case witness lower bounds for PCP verifiers follow from NEXP lower bounds.
We also show that randomness is essential in the definition of black-box constructivity: we unconditionally prove that there is no deterministic polylog(N)-time constructive property that is useful against even polynomial-size AC^0 circuits.
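The gap between the two notions of constructivity — reading a whole truth table of length N = 2^n versus making polylog(N) randomized oracle queries — can be illustrated on a toy statistic. A minimal sketch, assuming a simple bias estimate as the stand-in property (names and the chosen property are my own, not the paper's):

```python
import random

def bias_exact(f, n):
    """'Constructive' style: scan the whole truth table — poly(N) time
    for the N = 2^n truth-table entries of the n-input function f."""
    N = 1 << n
    return sum(f(x) for x in range(N)) / N

def bias_sampled(f, n, samples=2000, seed=1):
    """'Black-box constructive' style: randomized, making only `samples`
    oracle queries to f — polylog(N) many when samples = polylog(2^n)."""
    rng = random.Random(seed)
    hits = sum(f(rng.randrange(1 << n)) for _ in range(samples))
    return hits / samples
```

The sampled estimate converges to the exact bias with high probability, which is exactly the kind of guarantee randomness buys; the unconditional separation above says such savings are impossible deterministically.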
IOPs with Inverse Polynomial Soundness Error
We show that every language in NP has an Interactive Oracle Proof (IOP) with inverse polynomial soundness error and small query complexity. This achieves parameters that surpass all previously known PCPs and IOPs. Specifically, we construct an IOP with perfect completeness, soundness error , round complexity , proof length over an alphabet of size , and query complexity . This is a step forward in the quest to establish the sliding-scale conjecture for IOPs (which would additionally require query complexity ).
Our main technical contribution is a high-soundness small-query proximity test for the Reed-Solomon code. We construct an IOP of proximity for Reed-Solomon codes, over a field with evaluation domain and degree , with perfect completeness, soundness error (roughly) for -far functions, round complexity , proof length over , and query complexity ; here is the code rate. En route, we obtain a new high-soundness proximity test for bivariate Reed-Muller codes.
The IOP for NP is then obtained via a high-soundness reduction from NP to Reed-Solomon proximity testing with rate and distance (and applying our proximity test). Our constructions are direct and efficient, and hold the potential for practical realizations that would improve the state-of-the-art in real-world applications of IOPs.
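The Reed-Solomon codes at the heart of the proximity test can be illustrated with a toy encoder. A minimal sketch, assuming a small prime field and parameters of my own choosing (this shows only the code's distance property, not the paper's IOP of proximity):

```python
def rs_encode(coeffs, domain, p):
    """Evaluate the polynomial with the given coefficient list (degree
    < len(coeffs)) at every point of `domain`, over the prime field F_p."""
    return [sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p
            for x in domain]

# Two distinct polynomials of degree < d agree on at most d - 1 points,
# so distinct codewords differ in at least |domain| - (d - 1) positions.
p, d = 97, 4
domain = list(range(16))                # rate = d / |domain| = 1/4
u = rs_encode([1, 2, 3, 4], domain, p)
v = rs_encode([5, 0, 3, 4], domain, p)  # differs in low-order coefficients
disagreements = sum(a != b for a, b in zip(u, v))
```

Here the difference polynomial has degree 1, hence at most one root in the domain, so the two codewords disagree in at least 15 of the 16 positions; proximity testing asks whether an arbitrary function is close to some such codeword while reading very few positions.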
LIPIcs, Volume 248, ISAAC 2022, Complete Volume
EPTAS and Subexponential Algorithm for Maximum Clique on Disk and Unit Ball Graphs
A (unit) disk graph is the intersection graph of closed (unit) disks in the plane. Almost three decades ago, an elegant polynomial-time algorithm was found for Maximum Clique on unit disk graphs [Clark, Colbourn, Johnson; Discrete Mathematics ’90]. Since then, it has been an intriguing open question whether or not tractability can be extended to general disk graphs. We show that the disjoint union of two odd cycles is never the complement of a disk graph nor of a unit (3-dimensional) ball graph. From that fact and existing results, we derive a simple QPTAS and a subexponential algorithm running in time 2^{Õ(n^{2/3})} for Maximum Clique on disk and unit ball graphs. We then obtain a randomized EPTAS for computing the independence number on graphs having no disjoint union of two odd cycles as an induced subgraph, bounded VC-dimension, and linear independence number. This, in combination with our structural results, yields a randomized EPTAS for Max Clique on disk and unit ball graphs. Max Clique on unit ball graphs is equivalent to finding, given a collection of points in R^3, a maximum subset of points with diameter at most some fixed value. In stark contrast, Maximum Clique on ball graphs and unit 4-dimensional ball graphs, as well as intersection graphs of filled ellipses (even close to unit disks) or filled triangles, is unlikely to have such algorithms. Indeed, we show that, for all those problems, there is a constant ratio of approximation which cannot be attained even in time 2^{n^{1−ε}}, unless the Exponential Time Hypothesis fails.
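The objects in this abstract are easy to generate and probe at small scale. A minimal sketch, assuming a 2D toy instance and a brute-force clique search (names are mine; the point of the results above is precisely to do better than brute force on disk graphs):

```python
from itertools import combinations
from math import dist  # Python 3.8+

def unit_disk_graph(points, r=1.0):
    """Intersection graph of radius-r disks: two closed disks intersect
    iff their centers are at distance at most 2r."""
    return {(i, j) for i, j in combinations(range(len(points)), 2)
            if dist(points[i], points[j]) <= 2 * r}

def max_clique(n, edges):
    """Exponential-time brute force, for intuition only — unit disk
    graphs admit a polynomial-time algorithm for this problem."""
    for size in range(n, 0, -1):
        for sub in combinations(range(n), size):
            if all((i, j) in edges for i, j in combinations(sub, 2)):
                return set(sub)
    return set()
```

In the unit-disk case a clique is a set of centers pairwise within distance 2, mirroring the bounded-diameter point-set formulation of Max Clique on unit ball graphs mentioned above.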