    Testing symmetric properties of distributions

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references (p. 65-66). We introduce the notion of a Canonical Tester for a class of properties on distributions, that is, a tester strong and general enough that "a distribution property in the class is testable if and only if the Canonical Tester tests it". We construct a Canonical Tester for the class of symmetric properties of one or two distributions, satisfying a certain weak continuity condition. Analyzing the performance of the Canonical Tester on specific properties resolves several open problems, establishing lower bounds that match known upper bounds: we show that distinguishing between entropy $<\alpha$ or $>\beta$ on distributions over $[n]$ requires $n^{\alpha/\beta - o(1)}$ samples, and distinguishing whether a pair of distributions has statistical distance $<\alpha$ or $>\beta$ requires $n^{1-o(1)}$ samples. Our techniques also resolve a conjecture about a property that our Canonical Tester does not apply to: distinguishing identical distributions from those with statistical distance $>\beta$ requires $\Omega(n^{2/3})$ samples. By Paul Valiant. Ph.D.

    Quantum algorithms for testing properties of distributions

    Suppose one has access to oracles generating samples from two unknown probability distributions P and Q on some N-element set. How many samples does one need to test whether the two distributions are close to or far from each other in the L_1-norm? This and related questions have been extensively studied in recent years in the field of property testing. In the present paper we study quantum algorithms for testing properties of distributions. It is shown that the L_1-distance between P and Q can be estimated with constant precision using approximately N^{1/2} queries in the quantum setting, whereas classical computers need \Omega(N) queries. We also describe quantum algorithms for testing Uniformity and Orthogonality with query complexity O(N^{1/3}). The classical query complexity of these problems is known to be \Omega(N^{1/2}). Comment: 20 pages
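For orientation, the quantity being estimated in the abstract above can be illustrated with a naive classical plug-in estimator. This is only a sketch under stated assumptions: the function name `empirical_l1` and the list-of-samples interface are hypothetical, and this baseline is neither the quantum algorithm from the paper nor a sample-optimal classical tester.

```python
from collections import Counter


def empirical_l1(samples_p, samples_q):
    """Plug-in estimate of the L_1 distance between two distributions.

    Hypothetical helper for illustration: it compares empirical
    frequencies, so it is only accurate when each sample list is
    large relative to the domain size.
    """
    counts_p = Counter(samples_p)
    counts_q = Counter(samples_q)
    m_p = len(samples_p)
    m_q = len(samples_q)
    # Sum |P_hat(x) - Q_hat(x)| over every symbol seen in either sample.
    support = set(counts_p) | set(counts_q)
    return sum(abs(counts_p[x] / m_p - counts_q[x] / m_q) for x in support)


# Identical samples give distance 0; disjoint supports give the maximum, 2.
same = empirical_l1([0, 0, 1, 1], [0, 0, 1, 1])
disjoint = empirical_l1([0, 0], [1, 1])
```

The L_1 distance ranges from 0 (identical distributions) to 2 (disjoint supports), which is what the two calls at the end exercise.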

    Testing Properties of Multiple Distributions with Few Samples

    We propose a new setting for testing properties of distributions while receiving samples from several distributions, but few samples per distribution. Given samples from $s$ distributions, $p_1, p_2, \ldots, p_s$, we design testers for the following problems: (1) Uniformity Testing: Testing whether all the $p_i$'s are uniform or $\epsilon$-far from being uniform in $\ell_1$-distance, (2) Identity Testing: Testing whether all the $p_i$'s are equal to an explicitly given distribution $q$ or $\epsilon$-far from $q$ in $\ell_1$-distance, and (3) Closeness Testing: Testing whether all the $p_i$'s are equal to a distribution $q$ which we have sample access to, or $\epsilon$-far from $q$ in $\ell_1$-distance. By assuming an additional natural condition about the source distributions, we provide sample optimal testers for all of these problems. Comment: ITCS 202

    Sharp Bounds for Generalized Uniformity Testing

    We study the problem of generalized uniformity testing [BC17] of a discrete probability distribution: Given samples from a probability distribution $p$ over an unknown discrete domain $\mathbf{\Omega}$, we want to distinguish, with probability at least $2/3$, between the case that $p$ is uniform on some subset of $\mathbf{\Omega}$ versus $\epsilon$-far, in total variation distance, from any such uniform distribution. We establish tight bounds on the sample complexity of generalized uniformity testing. In more detail, we present a computationally efficient tester whose sample complexity is optimal, up to constant factors, and a matching information-theoretic lower bound. Specifically, we show that the sample complexity of generalized uniformity testing is $\Theta\left(1/(\epsilon^{4/3}\|p\|_3) + 1/(\epsilon^{2} \|p\|_2) \right)$.
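To make the bound stated in the abstract above concrete, the following sketch evaluates the expression inside the $\Theta(\cdot)$ for a given distribution. The function names are hypothetical and the constant hidden by $\Theta$ is ignored, so the result indicates only how the bound scales, not an actual sample count.

```python
def lp_norm(p, k):
    """Return the l_k norm of a probability vector p."""
    return sum(x ** k for x in p) ** (1.0 / k)


def generalized_uniformity_rate(p, eps):
    """Evaluate 1/(eps^(4/3) * ||p||_3) + 1/(eps^2 * ||p||_2).

    Hypothetical helper: this is the expression inside the Theta(.)
    from the abstract, with the hidden constant factor dropped.
    """
    term_l3 = 1.0 / (eps ** (4.0 / 3.0) * lp_norm(p, 3))
    term_l2 = 1.0 / (eps ** 2 * lp_norm(p, 2))
    return term_l3 + term_l2


# For p uniform on n elements, ||p||_3 = n^(-2/3) and ||p||_2 = n^(-1/2),
# so the expression reduces to n^(2/3)/eps^(4/3) + n^(1/2)/eps^2.
n = 1000
eps = 0.1
uniform = [1.0 / n] * n
rate = generalized_uniformity_rate(uniform, eps)
closed_form = n ** (2.0 / 3.0) / eps ** (4.0 / 3.0) + n ** 0.5 / eps ** 2
```

For a uniform distribution the two quantities `rate` and `closed_form` agree up to floating-point error, which is a quick sanity check on the norms.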

    Generalized uniformity testing

    In this work, we revisit the problem of uniformity testing of discrete probability distributions. A fundamental problem in distribution testing, testing uniformity over a known domain has been addressed over a significant line of work, and is by now fully understood. The complexity of deciding whether an unknown distribution is uniform over its unknown (and arbitrary) support, however, is much less clear. Yet, this task arises as soon as no prior knowledge on the domain is available, or whenever the samples originate from an unknown and unstructured universe. In this work, we introduce and study this generalized uniformity testing question, and establish nearly tight upper and lower bounds showing that – quite surprisingly – its sample complexity significantly differs from the known-domain case. Moreover, our algorithm is intrinsically adaptive, in contrast to the overwhelming majority of known distribution testing algorithms.