371 research outputs found

    Learning DNFs under product distributions via {\mu}-biased quantum Fourier sampling

    Full text link
    We show that DNF formulae can be quantum PAC-learned in polynomial time under product distributions using a quantum example oracle. The best classical algorithm (without access to membership queries) runs in superpolynomial time. Our result extends the work by Bshouty and Jackson (1998) that proved that DNF formulae are efficiently learnable under the uniform distribution using a quantum example oracle. Our proof is based on a new quantum algorithm that efficiently samples the coefficients of a {\mu}-biased Fourier transform.Comment: 17 pages; v3 based on journal version; minor corrections and clarification

    Learning DNF Expressions from Fourier Spectrum

    Full text link
    Since its introduction by Valiant in 1984, PAC learning of DNF expressions remains one of the central problems in learning theory. We consider this problem in the setting where the underlying distribution is uniform, or more generally, a product distribution. Kalai, Samorodnitsky and Teng (2009) showed that in this setting a DNF expression can be efficiently approximated from its "heavy" low-degree Fourier coefficients alone. This is in contrast to previous approaches where boosting was used and thus Fourier coefficients of the target function modified by various distributions were needed. This property is crucial for learning of DNF expressions over smoothed product distributions, a learning model introduced by Kalai et al. (2009) and inspired by the seminal smoothed analysis model of Spielman and Teng (2001). We introduce a new approach to learning (or approximating) a polynomial threshold functions which is based on creating a function with range [-1,1] that approximately agrees with the unknown function on low-degree Fourier coefficients. We then describe conditions under which this is sufficient for learning polynomial threshold functions. Our approach yields a new, simple algorithm for approximating any polynomial-size DNF expression from its "heavy" low-degree Fourier coefficients alone. Our algorithm greatly simplifies the proof of learnability of DNF expressions over smoothed product distributions. We also describe an application of our algorithm to learning monotone DNF expressions over product distributions. Building on the work of Servedio (2001), we give an algorithm that runs in time \poly((s \cdot \log{(s/\eps)})^{\log{(s/\eps)}}, n), where ss is the size of the target DNF expression and \eps is the accuracy. This improves on \poly((s \cdot \log{(ns/\eps)})^{\log{(s/\eps)} \cdot \log{(1/\eps)}}, n) bound of Servedio (2001).Comment: Appears in Conference on Learning Theory (COLT) 201

    DNF Sparsification and a Faster Deterministic Counting Algorithm

    Full text link
    Given a DNF formula on n variables, the two natural size measures are the number of terms or size s(f), and the maximum width of a term w(f). It is folklore that short DNF formulas can be made narrow. We prove a converse, showing that narrow formulas can be sparsified. More precisely, any width w DNF irrespective of its size can be ϵ\epsilon-approximated by a width ww DNF with at most (wlog(1/ϵ))O(w)(w\log(1/\epsilon))^{O(w)} terms. We combine our sparsification result with the work of Luby and Velikovic to give a faster deterministic algorithm for approximately counting the number of satisfying solutions to a DNF. Given a formula on n variables with poly(n) terms, we give a deterministic nO~(loglog(n))n^{\tilde{O}(\log \log(n))} time algorithm that computes an additive ϵ\epsilon approximation to the fraction of satisfying assignments of f for \epsilon = 1/\poly(\log n). The previous best result due to Luby and Velickovic from nearly two decades ago had a run-time of nexp(O(loglogn))n^{\exp(O(\sqrt{\log \log n}))}.Comment: To appear in the IEEE Conference on Computational Complexity, 201

    Agnostically Learning Boolean Functions with Finite Polynomial Representation

    Get PDF
    Agnostic learning is an extremely hard task in computational learning theory. In this paper we revisit the results in [Kalai et al. SIAM J. Comput. 2008] on agnostically learning boolean functions with finite polynomial representation and those that can be approximated by the former. An example of the former is the class of all boolean low-degree polynomials. For the former, [Kalai et al. SIAM J. Comput. 2008] introduces the l_1-polynomial regression method to learn them to error opt+epsilon. We present a simple instantiation for one step in the method and accordingly give the analysis. Moreover, we show that even ignoring this step can bring a learning result of error 2opt+epsilon as well. Then we consider applying the result for learning concept classes that can be approximated by the former to learn richer specific classes. Our result is that the class of s-term DNF formulae can be agnostically learned to error opt+epsilon with respect to arbitrary distributions for any epsilon in time poly(n^d, 1/epsilon), where d=O(sqrt{n}cdot scdot log slog^2(1/epsilon))

    A Survey of Quantum Learning Theory

    Get PDF
    This paper surveys quantum learning theory: the theoretical aspects of machine learning using quantum computers. We describe the main results known for three models of learning: exact learning from membership queries, and Probably Approximately Correct (PAC) and agnostic learning from classical or quantum examples.Comment: 26 pages LaTeX. v2: many small changes to improve the presentation. This version will appear as Complexity Theory Column in SIGACT News in June 2017. v3: fixed a small ambiguity in the definition of gamma(C) and updated a referenc

    A composition theorem for the Fourier Entropy-Influence conjecture

    Full text link
    The Fourier Entropy-Influence (FEI) conjecture of Friedgut and Kalai [FK96] seeks to relate two fundamental measures of Boolean function complexity: it states that H[f]CInf[f]H[f] \leq C Inf[f] holds for every Boolean function ff, where H[f]H[f] denotes the spectral entropy of ff, Inf[f]Inf[f] is its total influence, and C>0C > 0 is a universal constant. Despite significant interest in the conjecture it has only been shown to hold for a few classes of Boolean functions. Our main result is a composition theorem for the FEI conjecture. We show that if g1,...,gkg_1,...,g_k are functions over disjoint sets of variables satisfying the conjecture, and if the Fourier transform of FF taken with respect to the product distribution with biases E[g1],...,E[gk]E[g_1],...,E[g_k] satisfies the conjecture, then their composition F(g1(x1),...,gk(xk))F(g_1(x^1),...,g_k(x^k)) satisfies the conjecture. As an application we show that the FEI conjecture holds for read-once formulas over arbitrary gates of bounded arity, extending a recent result [OWZ11] which proved it for read-once decision trees. Our techniques also yield an explicit function with the largest known ratio of C6.278C \geq 6.278 between H[f]H[f] and Inf[f]Inf[f], improving on the previous lower bound of 4.615

    From average case complexity to improper learning complexity

    Full text link
    The basic problem in the PAC model of computational learning theory is to determine which hypothesis classes are efficiently learnable. There is presently a dearth of results showing hardness of learning problems. Moreover, the existing lower bounds fall short of the best known algorithms. The biggest challenge in proving complexity results is to establish hardness of {\em improper learning} (a.k.a. representation independent learning).The difficulty in proving lower bounds for improper learning is that the standard reductions from NP\mathbf{NP}-hard problems do not seem to apply in this context. There is essentially only one known approach to proving lower bounds on improper learning. It was initiated in (Kearns and Valiant 89) and relies on cryptographic assumptions. We introduce a new technique for proving hardness of improper learning, based on reductions from problems that are hard on average. We put forward a (fairly strong) generalization of Feige's assumption (Feige 02) about the complexity of refuting random constraint satisfaction problems. Combining this assumption with our new technique yields far reaching implications. In particular, 1. Learning DNF\mathrm{DNF}'s is hard. 2. Agnostically learning halfspaces with a constant approximation ratio is hard. 3. Learning an intersection of ω(1)\omega(1) halfspaces is hard.Comment: 34 page
    corecore