61 research outputs found

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 251, ITCS 2023, Complete Volum

    Properly Learning Decision Trees with Queries Is NP-Hard

    Full text link
    We prove that it is NP-hard to properly PAC learn decision trees with queries, resolving a longstanding open problem in learning theory (Bshouty 1993; Guijarro-Lavin-Raghavan 1999; Mehta-Raghavan 2002; Feldman 2016). While there has been a long line of work, dating back to (Pitt-Valiant 1988), establishing the hardness of properly learning decision trees from random examples, the more challenging setting of query learners necessitates different techniques and there were no previous lower bounds. En route to our main result, we simplify and strengthen the best known lower bounds for a different problem of Decision Tree Minimization (Zantema-Bodlaender 2000; Sieling 2003). On a technical level, we introduce the notion of hardness distillation, which we study for decision tree complexity but can be considered for any complexity measure: for a function that requires large decision trees, we give a general method for identifying a small set of inputs that is responsible for its complexity. Our technique even rules out query learners that are allowed constant error. This contrasts with existing lower bounds for the setting of random examples which only hold for inverse-polynomial error. Our result, taken together with a recent almost-polynomial time query algorithm for properly learning decision trees under the uniform distribution (Blanc-Lange-Qiao-Tan 2022), demonstrates the dramatic impact of distributional assumptions on the problem.Comment: 41 pages, 10 figures, FOCS 202

    Algorithms and Certificates for Boolean CSP Refutation: "Smoothed is no harder than Random"

    Full text link
    We present an algorithm for strongly refuting smoothed instances of all Boolean CSPs. The smoothed model is a hybrid between worst and average-case input models, where the input is an arbitrary instance of the CSP with only the negation patterns of the literals re-randomized with some small probability. For an nn-variable smoothed instance of a kk-arity CSP, our algorithm runs in nO()n^{O(\ell)} time, and succeeds with high probability in bounding the optimum fraction of satisfiable constraints away from 11, provided that the number of constraints is at least O~(n)(n)k21\tilde{O}(n) (\frac{n}{\ell})^{\frac{k}{2} - 1}. This matches, up to polylogarithmic factors in nn, the trade-off between running time and the number of constraints of the state-of-the-art algorithms for refuting fully random instances of CSPs [RRS17]. We also make a surprising new connection between our algorithm and even covers in hypergraphs, which we use to positively resolve Feige's 2008 conjecture, an extremal combinatorics conjecture on the existence of even covers in sufficiently dense hypergraphs that generalizes the well-known Moore bound for the girth of graphs. As a corollary, we show that polynomial-size refutation witnesses exist for arbitrary smoothed CSP instances with number of constraints a polynomial factor below the "spectral threshold" of nk/2n^{k/2}, extending the celebrated result for random 3-SAT of Feige, Kim and Ofek [FKO06]

    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 261, ICALP 2023, Complete Volum

    Black-Box Constructive Proofs Are Unavoidable

    Get PDF
    Following Razborov and Rudich, a "natural property" for proving a circuit lower bound satisfies three axioms: constructivity, largeness, and usefulness. In 2013, Williams proved that for any reasonable circuit class C, NEXP ? C is equivalent to the existence of a constructive property useful against C. Here, a property is constructive if it can be decided in poly(N) time, where N = 2? is the length of the truth-table of the given n-input function. Recently, Fan, Li, and Yang initiated the study of black-box natural properties, which require a much stronger notion of constructivity, called black-box constructivity: the property should be decidable in randomized polylog(N) time, given oracle access to the n-input function. They showed that most proofs based on random restrictions yield black-box natural properties, and demonstrated limitations on what black-box natural properties can prove. In this paper, perhaps surprisingly, we prove that the equivalence of Williams holds even with this stronger notion of black-box constructivity: for any reasonable circuit class C, NEXP ? C is equivalent to the existence of a black-box constructive property useful against C. The main technical ingredient in proving this equivalence is a smooth, strong, and locally-decodable probabilistically checkable proof (PCP), which we construct based on a recent work by Paradise. As a by-product, we show that average-case witness lower bounds for PCP verifiers follow from NEXP lower bounds. We also show that randomness is essential in the definition of black-box constructivity: we unconditionally prove that there is no deterministic polylog(N)-time constructive property that is useful against even polynomial-size AC? circuits

    IOPs with Inverse Polynomial Soundness Error

    Get PDF
    We show that every language in NP has an Interactive Oracle Proof (IOP) with inverse polynomial soundness error and small query complexity. This achieves parameters that surpass all previously known PCPs and IOPs. Specifically, we construct an IOP with perfect completeness, soundness error 1/n1/n, round complexity O(loglogn)O(\log \log n), proof length poly(n)poly(n) over an alphabet of size O(n)O(n), and query complexity O(loglogn)O(\log \log n). This is a step forward in the quest to establish the sliding-scale conjecture for IOPs (which would additionally require query complexity O(1)O(1)). Our main technical contribution is a high-soundness small-query proximity test for the Reed-Solomon code. We construct an IOP of proximity for Reed-Solomon codes, over a field F\mathbb{F} with evaluation domain LL and degree dd, with perfect completeness, soundness error (roughly) max{1δ,O(ρ1/4)}\max\{1-\delta , O(\rho^{1/4})\} for δ\delta-far functions, round complexity O(loglogd)O(\log \log d), proof length O(L/ρ)O(|L|/\rho) over F\mathbb{F}, and query complexity O(loglogd)O(\log \log d); here ρ=(d+1)/L\rho = (d+1)/|L| is the code rate. En route, we obtain a new high-soundness proximity test for bivariate Reed-Muller codes. The IOP for NP is then obtained via a high-soundness reduction from NP to Reed-Solomon proximity testing with rate ρ=1/poly(n)\rho = 1/poly(n) and distance δ=11/poly(n)\delta = 1-1/poly(n) (and applying our proximity test). Our constructions are direct and efficient, and hold the potential for practical realizations that would improve the state-of-the-art in real-world applications of IOPs

    LIPIcs, Volume 248, ISAAC 2022, Complete Volume

    Get PDF
    LIPIcs, Volume 248, ISAAC 2022, Complete Volum

    EPTAS and Subexponential Algorithm for Maximum Clique on Disk and Unit Ball Graphs

    Get PDF
    A (unit) disk graph is the intersection graph of closed (unit) disks in the plane. Almost three decades ago, an elegant polynomial-time algorithm was found for Maximum Cliqe on unit disk graphs [Clark, Colbourn, Johnson; Discrete Mathematics ’90]. Since then, it has been an intriguing open question whether or not tractability can be extended to general disk graphs. We show that the disjoint union of two odd cycles is never the complement of a disk graph nor of a unit (3-dimensional) ball graph. From that fact and existing results, we derive a simple QPTAS and a subexponential algorithm running in time 2O˜(n2/3) for Maximum Cliqe on disk and unit ball graphs. We then obtain a randomized EPTAS for computing the independence number on graphs having no disjoint union of two odd cycles as an induced subgraph, bounded VC-dimension, and linear independence number. This, in combination with our structural results, yields a randomized EPTAS for Max Cliqe on disk and unit ball graphs. Max Cliqe on unit ball graphs is equivalent to finding, given a collection of points in R3, a maximum subset of points with diameter at most some fixed value. In stark contrast, Maximum Cliqe on ball graphs and unit 4-dimensional ball graphs, as well as intersection graphs of filled ellipses (even close to unit disks) or filled triangles is unlikely to have such algorithms. Indeed, we show that, for all those problems, there is a constant ratio of approximation which cannot be attained even in time 2n1−ε, unless the Exponential Time Hypothesis fails
    corecore