    Minimizing DNF Formulas and AC^0 Circuits Given a Truth Table

    For circuit classes R, the fundamental computational problem Min-R asks for the minimum R-size of a Boolean function presented as a truth table. Prominent examples of this problem include Min-DNF, which asks whether a given Boolean function presented as a truth table has a k-term DNF, and Min-Circuit (also called MCSP), which asks whether a Boolean function presented as a truth table has a size-k Boolean circuit. We present a new reduction proving that Min-DNF is NP-complete. It is significantly simpler than the known reduction of Masek [31], which is from Circuit-SAT. We then give a more complex reduction, yielding the result that Min-DNF cannot be approximated to within a factor smaller than (log N)^γ, for some constant γ > 0, assuming that NP is not contained in quasipolynomial time. The standard greedy algorithm for Set Cover is often used in practice to approximate Min-DNF. The question of whether Min-DNF can be approximated to within a factor of o(log N) remains open, but we construct an instance of Min-DNF on which the solution produced by the greedy algorithm is Ω(log N) larger than optimal. Finally, we extend known hardness results for Min-TC^0_d to obtain new hardness results for Min-AC^0_d, under cryptographic assumptions.
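
    As an aside on the greedy heuristic mentioned in the abstract, the following minimal Python sketch (ours, not the paper's; the function name and the toy truth table are illustrative) runs the standard greedy Set Cover algorithm on a Min-DNF instance: it enumerates all implicants of a truth table and repeatedly picks the implicant that covers the most still-uncovered ON-set points.

        from itertools import product

        def greedy_min_dnf(truth_table, n):
            """Greedy Set Cover heuristic for Min-DNF (illustration only).

            truth_table: dict mapping each n-bit tuple to 0 or 1.
            Returns a list of terms; each term maps a variable index to the
            bit it must take (variables absent from the term are free).
            """
            on_set = {x for x, v in truth_table.items() if v == 1}

            # Enumerate all implicants: terms whose satisfying points all lie
            # inside the ON-set.  There are 3^n candidate terms.
            implicants = []
            for spec in product((0, 1, None), repeat=n):   # None = unused var
                term = {i: b for i, b in enumerate(spec) if b is not None}
                covered = {x for x in truth_table
                           if all(x[i] == b for i, b in term.items())}
                if covered and covered <= on_set:
                    implicants.append((term, covered))

            # Greedy Set Cover: take the implicant covering the most
            # still-uncovered ON-set points until everything is covered.
            uncovered, chosen = set(on_set), []
            while uncovered:
                term, covered = max(implicants,
                                    key=lambda tc: len(tc[1] & uncovered))
                chosen.append(term)
                uncovered -= covered
            return chosen

        # Toy usage: f(x0, x1, x2) = x0 OR (x1 AND x2).
        tt = {x: int(x[0] == 1 or (x[1] == 1 and x[2] == 1))
              for x in product((0, 1), repeat=3)}
        print(greedy_min_dnf(tt, 3))   # a small cover of the ON-set

    The usual Set Cover analysis bounds this heuristic's approximation ratio by O(log N), while the abstract's construction shows it can indeed output a cover Ω(log N) larger than the optimal DNF on adversarial instances.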

    Properly Learning Decision Trees with Queries Is NP-Hard

    We prove that it is NP-hard to properly PAC learn decision trees with queries, resolving a longstanding open problem in learning theory (Bshouty 1993; Guijarro-Lavin-Raghavan 1999; Mehta-Raghavan 2002; Feldman 2016). While there has been a long line of work, dating back to (Pitt-Valiant 1988), establishing the hardness of properly learning decision trees from random examples, the more challenging setting of query learners necessitates different techniques and there were no previous lower bounds. En route to our main result, we simplify and strengthen the best known lower bounds for the related problem of Decision Tree Minimization (Zantema-Bodlaender 2000; Sieling 2003). On a technical level, we introduce the notion of hardness distillation, which we study for decision tree complexity but can be considered for any complexity measure: for a function that requires large decision trees, we give a general method for identifying a small set of inputs that is responsible for its complexity. Our technique even rules out query learners that are allowed constant error. This contrasts with existing lower bounds for the setting of random examples, which only hold for inverse-polynomial error. Our result, taken together with a recent almost-polynomial time query algorithm for properly learning decision trees under the uniform distribution (Blanc-Lange-Qiao-Tan 2022), demonstrates the dramatic impact of distributional assumptions on the problem. Comment: 41 pages, 10 figures, FOCS 2023.
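
    For concreteness, here is a brute-force Python sketch (ours; exponential time, small n only) of the Decision Tree Minimization problem referenced above: it computes the minimum number of leaves of a decision tree for a full truth table by recursing on variable restrictions. It illustrates the minimization problem itself, not the paper's hardness-distillation technique.

        from functools import lru_cache
        from itertools import product

        def min_decision_tree_size(f, n):
            """Minimum number of leaves of a decision tree computing f,
            given as a full truth table (dict: n-bit tuple -> 0/1).
            Brute force, exponential time; illustrates the minimization
            problem only."""

            @lru_cache(maxsize=None)
            def solve(restriction):
                # restriction: length-n tuple with entries 0, 1 or None (free)
                points = [x for x in product((0, 1), repeat=n)
                          if all(r is None or x[i] == r
                                 for i, r in enumerate(restriction))]
                values = {f[x] for x in points}
                if len(values) == 1:       # constant sub-function -> one leaf
                    return 1
                best = float('inf')
                for i, r in enumerate(restriction):
                    if r is not None:
                        continue           # variable already queried above
                    zero = restriction[:i] + (0,) + restriction[i + 1:]
                    one = restriction[:i] + (1,) + restriction[i + 1:]
                    best = min(best, solve(zero) + solve(one))
                return best

            return solve((None,) * n)

        # Toy usage: 3-bit majority needs 6 leaves.
        maj = {x: int(sum(x) >= 2) for x in product((0, 1), repeat=3)}
        print(min_decision_tree_size(maj, 3))   # -> 6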

    NP-hardness of circuit minimization for multi-output functions

    Can we design efficient algorithms for finding fast algorithms? This question is captured by various circuit minimization problems, and algorithms for the corresponding tasks have significant practical applications. Following the work of Cook and Levin in the early 1970s, a central question is whether minimizing the circuit size of an explicitly given function is NP-complete. While this is known to hold in restricted models such as DNFs, making progress with respect to more expressive classes of circuits has been elusive. In this work, we establish the first NP-hardness result for circuit minimization of total functions in the setting of general (unrestricted) Boolean circuits. More precisely, we show that computing the minimum circuit size of a given multi-output Boolean function f : {0,1}^n → {0,1}^m is NP-hard under many-one polynomial-time randomized reductions. Our argument builds on a simpler NP-hardness proof for the circuit minimization problem for (single-output) Boolean functions under an extended set of generators. Complementing these results, we investigate the computational hardness of minimizing communication. We establish that several variants of this problem are NP-hard under deterministic reductions. In particular, unless P = NP, no polynomial-time computable function can approximate the deterministic two-party communication complexity of a partial Boolean function up to a polynomial. This has consequences for the class of structural results that one might hope to show about the communication complexity of partial functions.
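
    To make the underlying decision problem concrete, here is a brute-force Python sketch (ours; single-output case, tiny n and k only) that decides whether a truth table has a circuit with at most k AND/OR/NOT gates; the multi-output variant studied in the paper additionally designates m output wires. It illustrates the MCSP problem, not the paper's reduction.

        from itertools import product

        OPS = {'AND': lambda a, b: a & b,
               'OR':  lambda a, b: a | b,
               'NOT': lambda a, b: 1 - a}     # NOT is unary; b is ignored

        def has_circuit_of_size(f, n, k):
            """Does the truth table f (dict: n-bit tuple -> 0/1) have a
            circuit with at most k fan-in-2 AND/OR plus NOT gates?
            Exponential-time illustration of the decision problem only."""
            inputs = list(product((0, 1), repeat=n))
            target = tuple(f[x] for x in inputs)

            def search(gates):
                # gates: list of (op, i, j); i, j index wires 0..n-1 (the
                # inputs) or n, n+1, ... (earlier gates in the list).
                wires = [list(col) for col in zip(*inputs)]
                for op, i, j in gates:
                    wires.append([OPS[op](a, b)
                                  for a, b in zip(wires[i], wires[j])])
                if any(tuple(w) == target for w in wires):
                    return True            # some wire already computes f
                if len(gates) == k:
                    return False
                m = n + len(gates)         # wires available to the next gate
                for op in OPS:
                    for i in range(m):
                        for j in range(m):
                            if op == 'NOT' and j != i:
                                continue   # skip redundant NOT operand pairs
                            if search(gates + [(op, i, j)]):
                                return True
                return False

            return search([])

        def min_circuit_size(f, n, max_k=6):
            return next(k for k in range(max_k + 1)
                        if has_circuit_of_size(f, n, k))

        # Toy usage: XOR of two bits over the AND/OR/NOT basis.
        xor = {(a, b): a ^ b for a in (0, 1) for b in (0, 1)}
        print(min_circuit_size(xor, 2))   # -> 4, e.g. AND(OR(a,b), NOT(AND(a,b)))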

    Active classification with comparison queries

    We study an extension of active learning in which the learning algorithm may ask the annotator to compare the distances of two examples from the boundary of their label-class. For example, in a recommendation system application (say for restaurants), the annotator may be asked whether she liked or disliked a specific restaurant (a label query); or which one of two restaurants did she like more (a comparison query). We focus on the class of half spaces, and show that under natural assumptions, such as large margin or bounded bit-description of the input examples, it is possible to reveal all the labels of a sample of size n using approximately O(log n) queries. This implies an exponential improvement over classical active learning, where only label queries are allowed. We complement these results by showing that if any of these assumptions is removed then, in the worst case, Ω(n) queries are required. Our results follow from a new general framework of active learning with additional queries. We identify a combinatorial dimension, called the inference dimension, that captures the query complexity when each additional query is determined by O(1) examples (such as comparison queries, each of which is determined by the two compared examples). Our results for half spaces follow by bounding the inference dimension in the cases discussed above. Comment: 23 pages (not including references), 1 figure. The new version contains a minor fix in the proof of Lemma 4.
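
    As a concrete illustration of how comparison queries can help (ours, not the paper's algorithm), the Python sketch below sorts a sample by signed margin using comparison queries and then locates the label crossover with a single binary search of label queries. The paper's recursive procedure based on the inference dimension is what brings the total query count down to roughly O(log n) under the stated assumptions.

        from functools import cmp_to_key

        def infer_labels(points, label_query, compare_query):
            """Infer the labels of every point using comparison and label queries.

            label_query(x)      -> +1 or -1 (e.g. "liked" / "disliked").
            compare_query(x, y) -> negative/zero/positive according to which of
                                   the two points lies further on the positive
                                   side of the (unknown) separating hyperplane.

            Minimal sketch: sorting costs O(n log n) comparison queries and the
            binary search costs O(log n) label queries.
            """
            order = sorted(points, key=cmp_to_key(compare_query))

            # In the sorted order all negative points precede all positive
            # points, so one binary search finds the crossover index.
            lo, hi = 0, len(order)
            while lo < hi:
                mid = (lo + hi) // 2
                if label_query(order[mid]) == +1:
                    hi = mid
                else:
                    lo = mid + 1

            return {x: (-1 if i < lo else +1) for i, x in enumerate(order)}

        # Toy usage with a hidden halfspace sign(w . x - b) in the plane.
        w, b = (2.0, -1.0), 0.5
        margin = lambda x: w[0] * x[0] + w[1] * x[1] - b
        label_query = lambda x: +1 if margin(x) > 0 else -1
        compare_query = lambda x, y: (margin(x) > margin(y)) - (margin(x) < margin(y))

        pts = [(0.1, 0.9), (1.0, 0.2), (0.5, 0.5), (-0.3, 0.4), (0.9, -0.8)]
        print(infer_labels(pts, label_query, compare_query))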

    Learning Horn Envelopes via Queries from Large Language Models

    We investigate an approach for extracting knowledge from trained neural networks based on Angluin's exact learning model with membership and equivalence queries to an oracle. In this approach, the oracle is a trained neural network. We consider Angluin's classical algorithm for learning Horn theories and study the necessary changes to make it applicable to learn from neural networks. In particular, we have to consider that trained neural networks may not behave as Horn oracles, meaning that their underlying target theory may not be Horn. We propose a new algorithm that aims at extracting the "tightest Horn approximation" of the target theory and that is guaranteed to terminate in exponential time (in the worst case) and in polynomial time if the target has polynomially many non-Horn examples. To showcase the applicability of the approach, we perform experiments on pre-trained language models and extract rules that expose occupation-based gender biases. Comment: 35 pages, 2 figures; manuscript accepted for publication in the International Journal of Approximate Reasoning (IJAR).
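
    One standard fact behind the idea of non-Horn examples is that a propositional theory is equivalent to a Horn theory exactly when its set of models is closed under componentwise intersection. The toy Python sketch below (ours; the oracle and variable names are made up, and this is not the paper's extraction algorithm) uses membership queries to search for an intersection-closure violation, i.e. a witness that the oracle's target theory is not Horn.

        from itertools import product, combinations

        def find_horn_violation(membership_query, n):
            """Search for a witness that the oracle's target theory is not Horn.

            A Boolean function is expressible by a Horn theory exactly when its
            set of satisfying assignments is closed under componentwise AND.
            This check queries all pairs of positive assignments and reports a
            pair whose intersection the oracle rejects.  Exponential in n;
            illustration only."""
            positives = [x for x in product((0, 1), repeat=n)
                         if membership_query(x)]
            for a, b in combinations(positives, 2):
                meet = tuple(u & v for u, v in zip(a, b))
                if not membership_query(meet):
                    return a, b, meet       # witness of non-Horn behaviour
            return None                     # consistent with some Horn theory

        # Toy oracle standing in for a trained network: accepts assignments of
        # (rain, sprinkler, wet) satisfying "rain -> wet" and "sprinkler -> wet",
        # plus one extra exception that breaks Horn-ness.
        def oracle(x):
            rain, sprinkler, wet = x
            horn_ok = (not rain or wet) and (not sprinkler or wet)
            return horn_ok or x == (1, 1, 0)

        print(find_horn_violation(oracle, 3))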

    Conspiracies Between Learning Algorithms, Circuit Lower Bounds, and Pseudorandomness

    We prove several results giving new and stronger connections between learning theory, circuit complexity and pseudorandomness. Let C be any typical class of Boolean circuits, and C[s(n)] denote n-variable C-circuits of size <= s(n). We show:
    Learning Speedups: If C[s(n)] admits a randomized weak learning algorithm under the uniform distribution with membership queries that runs in time 2^n/n^{omega(1)}, then for every k >= 1 and epsilon > 0 the class C[n^k] can be learned to high accuracy in time O(2^{n^epsilon}). There is epsilon > 0 such that C[2^{n^{epsilon}}] can be learned in time 2^n/n^{omega(1)} if and only if C[poly(n)] can be learned in time 2^{(log(n))^{O(1)}}.
    Equivalences between Learning Models: We use learning speedups to obtain equivalences between various randomized learning and compression models, including sub-exponential time learning with membership queries, sub-exponential time learning with membership and equivalence queries, probabilistic function compression and probabilistic average-case function compression.
    A Dichotomy between Learnability and Pseudorandomness: In the non-uniform setting, there is non-trivial learning for C[poly(n)] if and only if there are no exponentially secure pseudorandom functions computable in C[poly(n)].
    Lower Bounds from Nontrivial Learning: If for each k >= 1, (depth-d)-C[n^k] admits a randomized weak learning algorithm with membership queries under the uniform distribution that runs in time 2^n/n^{omega(1)}, then for each k >= 1, BPE is not contained in (depth-d)-C[n^k]. If for some epsilon > 0 there are P-natural proofs useful against C[2^{n^{epsilon}}], then ZPEXP is not contained in C[poly(n)].
    Karp-Lipton Theorems for Probabilistic Classes: If there is a k > 0 such that BPE is contained in i.o.Circuit[n^k], then BPEXP is contained in i.o.EXP/O(log(n)). If ZPEXP is contained in i.o.Circuit[2^{n/3}], then ZPEXP is contained in i.o.ESUBEXP.
    Hardness Results for MCSP: All functions in non-uniform NC^1 reduce to the Minimum Circuit Size Problem via truth-table reductions computable by TC^0 circuits. In particular, if MCSP is in TC^0 then NC^1 = TC^0.