
    A polynomial time approximation scheme for computing the supremum of Gaussian processes

    We give a polynomial time approximation scheme (PTAS) for computing the supremum of a Gaussian process. That is, given a finite set of vectors $V \subseteq \mathbb{R}^d$, we compute a $(1+\varepsilon)$-factor approximation to $\mathbb{E}_{X \leftarrow \mathcal{N}^d}[\sup_{v \in V} |\langle v, X \rangle|]$ deterministically in time $\operatorname{poly}(d) \cdot |V|^{O_{\varepsilon}(1)}$. Previously, only a constant-factor deterministic polynomial time approximation algorithm was known, due to the work of Ding, Lee and Peres [Ann. of Math. (2) 175 (2012) 1409-1471]. This answers an open question of Lee (2010) and Ding [Ann. Probab. 42 (2014) 464-496]. The study of suprema of Gaussian processes is of considerable importance in probability, with applications in functional analysis, convex geometry, and, in light of the recent breakthrough work of Ding, Lee and Peres [Ann. of Math. (2) 175 (2012) 1409-1471], random walks on finite graphs. As such, our result could be of use elsewhere. In particular, combined with the work of Ding [Ann. Probab. 42 (2014) 464-496], our result yields a PTAS for computing the cover time of bounded-degree graphs. Previously, such algorithms were known only for trees. Along the way, we also give an explicit oblivious estimator for semi-norms in Gaussian space with optimal query complexity. Our algorithm and its analysis are elementary in nature, using two classical comparison inequalities, Slepian's lemma and Kanter's lemma.
    Published in the Annals of Applied Probability (http://dx.doi.org/10.1214/13-AAP997) by the Institute of Mathematical Statistics (http://www.imstat.org/aap/).
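
    The quantity being approximated is simple to estimate by Monte Carlo sampling; the paper's contribution is matching this deterministically, whereas the sketch below is randomized and needs $\operatorname{poly}(1/\varepsilon)$ samples. A minimal illustration (our code, not the paper's algorithm), with the vectors of $V$ as rows of a matrix:

```python
import numpy as np

def gaussian_sup_mc(V, n_samples=100_000, seed=0):
    """Monte Carlo estimate of E_{X ~ N(0, I_d)}[ sup_{v in V} |<v, X>| ].

    V: (m, d) array whose rows are the vectors in the finite set V.
    Randomized estimator for illustration; NOT the deterministic PTAS.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_samples, V.shape[1]))  # X ~ N(0, I_d)
    sups = np.abs(X @ V.T).max(axis=1)                # sup_v |<v, X>| per sample
    return sups.mean()

# Example: V = standard basis of R^d, so the supremum is max_i |X_i|,
# whose expectation grows like sqrt(2 ln d).
print(gaussian_sup_mc(np.eye(100)))
```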

    Learning Graphical Models Using Multiplicative Weights

    We give a simple multiplicative-weight update algorithm for learning undirected graphical models, or Markov random fields (MRFs). The approach is new, and for the well-studied case of Ising models or Boltzmann machines we obtain an algorithm that uses a nearly optimal number of samples and has quadratic running time (up to logarithmic factors), subsuming and improving on all prior work. Additionally, we give the first efficient algorithm for learning Ising models over general alphabets. Our main application is an algorithm for learning the structure of $t$-wise MRFs with nearly optimal sample complexity (up to polynomial losses in necessary terms that depend on the weights) and running time that is $n^{O(t)}$. In addition, given $n^{O(t)}$ samples, we can also learn the parameters of the model and generate a hypothesis that is close in statistical distance to the true MRF. All prior work runs in time $n^{\Omega(d)}$ for graphs of bounded degree $d$ and does not generate a hypothesis close in statistical distance even for $t = 3$. We observe that our runtime has the correct dependence on $n$ and $t$, assuming the hardness of learning sparse parities with noise. Our algorithm, the Sparsitron, is easy to implement (it has only one parameter) and works in the online setting. Its analysis applies a regret bound from Freund and Schapire's classic Hedge algorithm. It also gives the first solution to the problem of learning sparse Generalized Linear Models (GLMs).
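
    Since the abstract says the Sparsitron is a one-parameter, online, Hedge-style learner for sparse GLMs $\mathbb{E}[y \mid x] = \sigma(\langle w, x \rangle)$, a hedged sketch of that update pattern may help; the exact form (and the feature-doubling reduction that handles negative weights, omitted here) should be checked against the paper:

```python
import numpy as np

def sparsitron(xs, ys, lam, beta=0.9):
    """Sketch of a Sparsitron-style online learner for a sparse GLM
    E[y | x] = sigma(<w, x>) with sigma the logistic function.

    xs: (T, n) array with entries in [-1, 1]; ys: (T,) labels in [0, 1].
    lam: assumed bound on the l1-norm of the true weight vector.
    Assumes nonnegative weights; signs can be handled by the standard
    trick of appending -x to the features. beta is the single
    Hedge-style parameter the abstract refers to.
    """
    sigma = lambda z: 1.0 / (1.0 + np.exp(-z))
    T, n = xs.shape
    w = np.ones(n)                               # multiplicative weights
    for x, y in zip(xs, ys):
        p = w / w.sum()                          # distribution over coordinates
        y_hat = sigma(lam * p.dot(x))            # current prediction
        loss = 0.5 * (1.0 + (y_hat - y) * x)     # per-coordinate loss in [0, 1]
        w = w * beta ** loss                     # Hedge update
    return lam * w / w.sum()                     # candidate weight vector
```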

    Moment-Matching Polynomials

    We give a new framework for proving the existence of low-degree polynomial approximators for Boolean functions with respect to broad classes of non-product distributions. Our proofs use techniques related to the classical moment problem and deviate significantly from known Fourier-based methods, which require the underlying distribution to have some product structure. Our main application is the first polynomial-time algorithm for agnostically learning any function of a constant number of halfspaces with respect to any log-concave distribution (for any constant accuracy parameter). This result was not known even for the case of learning the intersection of two halfspaces without noise. Additionally, we show that in the "smoothed-analysis" setting, the above results hold with respect to distributions that have sub-exponential tails, a property satisfied by many natural and well-studied distributions in machine learning. Given that our algorithms can be implemented using Support Vector Machines (SVMs) with a polynomial kernel, these results give a rigorous theoretical explanation as to why many kernel methods work so well in practice.
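
    The closing claim is easy to try in practice. A toy scikit-learn sketch (our synthetic data and parameter choices, not from the paper) of learning an intersection of two halfspaces under Gaussian, hence log-concave, inputs with a polynomial-kernel SVM:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
d = 5
X = rng.standard_normal((2000, d))             # Gaussian (log-concave) inputs
a, b = rng.standard_normal(d), rng.standard_normal(d)
y = ((X @ a > 0) & (X @ b > 0)).astype(int)    # intersection of two halfspaces

# Low-degree polynomial kernel, mirroring the low-degree approximator view.
clf = SVC(kernel="poly", degree=4, coef0=1.0)
clf.fit(X[:1500], y[:1500])
print("held-out accuracy:", clf.score(X[1500:], y[1500:]))
```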

    A PRG for Lipschitz Functions of Polynomials with Applications to Sparsest Cut

    We give improved pseudorandom generators (PRGs) for Lipschitz functions of low-degree polynomials over the hypercube. These are functions of the form $\psi(P(x))$, where $P$ is a low-degree polynomial and $\psi$ is a function with a small Lipschitz constant. PRGs for smooth functions of low-degree polynomials have received a lot of attention recently and play an important role in constructing PRGs for the natural class of polynomial threshold functions. In spite of this recent progress, no nontrivial PRGs were known for fooling Lipschitz functions of degree-$O(\log n)$ polynomials, even for constant error rate. In this work, we give the first such generator, obtaining a seed length of $(\log n) \cdot \tilde{O}(d^2/\varepsilon^2)$ for fooling degree-$d$ polynomials with error $\varepsilon$. Previous generators had an exponential dependence on the degree. We use our PRG to obtain better integrality gap instances for sparsest cut, a fundamental problem in graph theory with many applications in graph optimization. We give an instance of uniform sparsest cut for which the powerful semi-definite programming (SDP) relaxation first introduced by Goemans and Linial, and studied in the seminal work of Arora, Rao and Vazirani, has an integrality gap of $\exp(\Omega((\log\log n)^{1/2}))$. Understanding the performance of the Goemans-Linial SDP for uniform sparsest cut is an important open problem in approximation algorithms and metric embeddings, and our work gives a near-exponential improvement over previous lower bounds, which achieved a gap of $\Omega(\log\log n)$.
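
    For concreteness, a generator $G$ with seed length $s$ fools this class to error $\varepsilon$ if $|\mathbb{E}_{x \sim U_n}[\psi(P(x))] - \mathbb{E}_{z \sim U_s}[\psi(P(G(z)))]| \le \varepsilon$. The toy check below computes this gap exactly by enumeration for a tiny stand-in generator (ours, not the paper's construction):

```python
import itertools
import numpy as np

def fooling_error(f, n, gen, seed_len):
    """Exact |E_{x~U_n}[f(x)] - E_{z~U_s}[f(gen(z))]| by enumeration
    (only feasible for tiny n and seed_len)."""
    cube = lambda k: itertools.product([-1, 1], repeat=k)
    true_mean = np.mean([f(np.array(x)) for x in cube(n)])
    prg_mean = np.mean([f(gen(np.array(z))) for z in cube(seed_len)])
    return abs(true_mean - prg_mean)

# f = psi(P(x)): P a degree-2 polynomial, psi 1-Lipschitz (clipped identity).
P = lambda x: x[0] * x[1] + x[1] * x[2] - x[2] * x[3]
psi = lambda t: float(min(max(t, -1.0), 1.0))
f = lambda x: psi(P(x))

# Stand-in generator: a linear map over GF(2) stretching 3 bits to 4.
# NOT the paper's PRG; it only illustrates the fooling criterion.
A = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]])
gen = lambda z: 1 - 2 * ((A @ ((1 - z) // 2)) % 2)   # {-1,1}^3 -> {-1,1}^4

print(fooling_error(f, n=4, gen=gen, seed_len=3))
```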

    DNF Sparsification and a Faster Deterministic Counting Algorithm

    Given a DNF formula on $n$ variables, the two natural size measures are the number of terms, or size $s(f)$, and the maximum width of a term, $w(f)$. It is folklore that short DNF formulas can be made narrow. We prove a converse, showing that narrow formulas can be sparsified. More precisely, any width-$w$ DNF, irrespective of its size, can be $\epsilon$-approximated by a width-$w$ DNF with at most $(w \log(1/\epsilon))^{O(w)}$ terms. We combine our sparsification result with the work of Luby and Velickovic to give a faster deterministic algorithm for approximately counting the number of satisfying solutions to a DNF. Given a formula on $n$ variables with $\operatorname{poly}(n)$ terms, we give a deterministic $n^{\tilde{O}(\log\log n)}$-time algorithm that computes an additive $\epsilon$-approximation to the fraction of satisfying assignments of $f$ for $\epsilon = 1/\operatorname{poly}(\log n)$. The previous best result, due to Luby and Velickovic from nearly two decades ago, had a run-time of $n^{\exp(O(\sqrt{\log\log n}))}$.
    To appear in the IEEE Conference on Computational Complexity, 2012.
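
    The quantity being approximated, the fraction of satisfying assignments of a DNF, has a simple brute-force reference implementation (exponential in $n$ and usable only for small formulas; the point of the paper is the $n^{\tilde{O}(\log\log n)}$-time deterministic algorithm):

```python
from itertools import product

def dnf_satisfying_fraction(terms, n):
    """Fraction of assignments in {0,1}^n satisfying a DNF, by brute force.

    terms: list of terms, each a list of signed 1-indexed literals,
    e.g. [1, -3] means (x1 AND NOT x3). Runs in O(2^n * size).
    """
    def term_sat(term, x):
        return all(x[abs(lit) - 1] == (lit > 0) for lit in term)
    count = sum(any(term_sat(t, x) for t in terms)
                for x in product([False, True], repeat=n))
    return count / 2 ** n

# f = (x1 AND x2) OR (NOT x2 AND x3): a width-2 DNF with 2 terms
print(dnf_satisfying_fraction([[1, 2], [-2, 3]], n=3))  # 0.5
```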

    Pseudorandomness via the discrete Fourier transform

    We present a new approach to constructing unconditional pseudorandom generators against classes of functions that involve computing a linear function of the inputs. We give an explicit construction of a pseudorandom generator that fools the discrete Fourier transforms of linear functions with seed length nearly logarithmic (up to polyloglog factors) in the input size and the desired error parameter. Our result gives a single pseudorandom generator that fools several important classes of tests computable in logspace that have been considered in the literature, including halfspaces (over general domains), modular tests and combinatorial shapes. For all these classes, our generator is the first to achieve a nearly logarithmic seed length in both the input length and the error parameter. Getting such a seed length is a natural challenge in its own right, one that needs to be overcome in order to derandomize RL, a central question in complexity theory. Our construction combines ideas from a large body of prior work, ranging from a classical construction of [NN93] to the recent gradually-increasing-independence paradigm of [KMN11, CRSW13, GMRTV12], while also introducing some novel analytic machinery that might find other applications.
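
    The tests being fooled are concrete: characters applied to a linear form, i.e. maps $x \mapsto e^{2\pi i \langle a, x \rangle / m}$, which underlie halfspaces, modular tests and combinatorial shapes. A toy sketch (with a placeholder linear-map "generator" of ours, not the paper's construction) of measuring how well a candidate generator fools one such test:

```python
import numpy as np

def fourier_test_gap(a, m, gen_outputs, n_uniform=200_000, seed=0):
    """Empirical |E_uniform[chi(x)] - E_gen[chi(x)]| for the Fourier test
    chi(x) = exp(2*pi*i*<a, x>/m) over x in {0,1}^n (a modular test)."""
    rng = np.random.default_rng(seed)
    chi = lambda X: np.exp(2j * np.pi * (X @ a) / m).mean()
    U = rng.integers(0, 2, size=(n_uniform, len(a)))   # truly uniform bits
    return abs(chi(U) - chi(gen_outputs))

# Placeholder "generator": all outputs of a random linear map over GF(2)
# stretching s seed bits to n bits. Illustrative only, not the paper's PRG.
n, s = 16, 8
rng = np.random.default_rng(1)
A = rng.integers(0, 2, size=(n, s))
seeds = (np.arange(2 ** s)[:, None] >> np.arange(s)) & 1   # all 2^s seeds
outputs = (seeds @ A.T) % 2                                # their n-bit outputs
a = rng.integers(0, 5, size=n)                             # a random test
print(fourier_test_gap(a, m=5, gen_outputs=outputs))
```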