954 research outputs found

    A polynomial time approximation scheme for computing the supremum of Gaussian processes

    Full text link
    We give a polynomial time approximation scheme (PTAS) for computing the supremum of a Gaussian process. That is, given a finite set of vectors VRdV\subseteq\mathbb{R}^d, we compute a (1+ε)(1+\varepsilon)-factor approximation to EXNd[supvVv,X]\mathop {\mathbb{E}}_{X\leftarrow\mathcal{N}^d}[\sup_{v\in V}|\langle v,X\rangle|] deterministically in time poly(d)VOε(1)\operatorname {poly}(d)\cdot|V|^{O_{\varepsilon}(1)}. Previously, only a constant factor deterministic polynomial time approximation algorithm was known due to the work of Ding, Lee and Peres [Ann. of Math. (2) 175 (2012) 1409-1471]. This answers an open question of Lee (2010) and Ding [Ann. Probab. 42 (2014) 464-496]. The study of supremum of Gaussian processes is of considerable importance in probability with applications in functional analysis, convex geometry, and in light of the recent breakthrough work of Ding, Lee and Peres [Ann. of Math. (2) 175 (2012) 1409-1471], to random walks on finite graphs. As such our result could be of use elsewhere. In particular, combining with the work of Ding [Ann. Probab. 42 (2014) 464-496], our result yields a PTAS for computing the cover time of bounded-degree graphs. Previously, such algorithms were known only for trees. Along the way, we also give an explicit oblivious estimator for semi-norms in Gaussian space with optimal query complexity. Our algorithm and its analysis are elementary in nature, using two classical comparison inequalities, Slepian's lemma and Kanter's lemma.Comment: Published in at http://dx.doi.org/10.1214/13-AAP997 the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Distributed Data Summarization in Well-Connected Networks

    Get PDF
    We study distributed algorithms for some fundamental problems in data summarization. Given a communication graph G of n nodes each of which may hold a value initially, we focus on computing sum_{i=1}^N g(f_i), where f_i is the number of occurrences of value i and g is some fixed function. This includes important statistics such as the number of distinct elements, frequency moments, and the empirical entropy of the data. In the CONGEST~ model, a simple adaptation from streaming lower bounds shows that it requires Omega~(D+ n) rounds, where D is the diameter of the graph, to compute some of these statistics exactly. However, these lower bounds do not hold for graphs that are well-connected. We give an algorithm that computes sum_{i=1}^{N} g(f_i) exactly in {tau_{G}} * 2^{O(sqrt{log n})} rounds where {tau_{G}} is the mixing time of G. This also has applications in computing the top k most frequent elements. We demonstrate that there is a high similarity between the GOSSIP~ model and the CONGEST~ model in well-connected graphs. In particular, we show that each round of the GOSSIP~ model can be simulated almost perfectly in O~({tau_{G}}) rounds of the CONGEST~ model. To this end, we develop a new algorithm for the GOSSIP~ model that 1 +/- epsilon approximates the p-th frequency moment F_p = sum_{i=1}^N f_i^p in O~(epsilon^{-2} n^{1-k/p}) roundsfor p >= 2, when the number of distinct elements F_0 is at most O(n^{1/(k-1)}). This result can be translated back to the CONGEST~ model with a factor O~({tau_{G}}) blow-up in the number of rounds
    corecore