94 research outputs found
Testing symmetric properties of distributions
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references (p. 65-66). We introduce the notion of a Canonical Tester for a class of properties on distributions, that is, a tester strong and general enough that "a distribution property in the class is testable if and only if the Canonical Tester tests it". We construct a Canonical Tester for the class of symmetric properties of one or two distributions, satisfying a certain weak continuity condition. Analyzing the performance of the Canonical Tester on specific properties resolves several open problems, establishing lower bounds that match known upper bounds: we show that distinguishing between entropy < \alpha or > \beta on distributions over [n] requires n^{\alpha/\beta - o(1)} samples, and distinguishing whether a pair of distributions has statistical distance < \alpha or > \beta requires n^{1-o(1)} samples. Our techniques also resolve a conjecture about a property that our Canonical Tester does not apply to: distinguishing identical distributions from those with statistical distance > \beta requires \Omega(n^{2/3}) samples. by Paul Valiant. Ph.D.
Quantum algorithms for testing properties of distributions
Suppose one has access to oracles generating samples from two unknown
probability distributions P and Q on some N-element set. How many samples does
one need to test whether the two distributions are close or far from each other
in the L_1-norm? This and related questions have been extensively studied
in recent years in the field of property testing. In the present paper we
study quantum algorithms for testing properties of distributions. It is shown
that the L_1-distance between P and Q can be estimated with a constant
precision using approximately N^{1/2} queries in the quantum setting, whereas
classical computers need \Omega(N) queries. We also describe quantum algorithms
for testing Uniformity and Orthogonality with query complexity O(N^{1/3}). The
classical query complexity of these problems is known to be \Omega(N^{1/2}). Comment: 20 pages
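As a point of reference for the classical side of this question, the L_1 distance between P and Q can be estimated naively by comparing empirical frequencies. The sketch below is only an illustration of that plug-in baseline, not the algorithm of the paper; the function name `empirical_l1_distance` and its inputs are hypothetical.

```python
from collections import Counter


def empirical_l1_distance(samples_p, samples_q):
    """Plug-in estimate of the L_1 distance between two distributions.

    Assumption: samples_p and samples_q are non-empty lists of hashable
    outcomes drawn from P and Q. This is a naive classical baseline for
    illustration only.
    """
    if not samples_p or not samples_q:
        raise ValueError("both sample lists must be non-empty")
    freq_p = Counter(samples_p)
    freq_q = Counter(samples_q)
    n_p, n_q = len(samples_p), len(samples_q)
    # Sum |P_hat(x) - Q_hat(x)| over every outcome seen in either sample.
    # Counter returns 0 for outcomes it has not seen.
    support = set(freq_p) | set(freq_q)
    return sum(abs(freq_p[x] / n_p - freq_q[x] / n_q) for x in support)
```

The estimate ranges from 0 (identical empirical frequencies) to 2 (disjoint observed supports). It becomes reliable only once the number of samples is comparable to the domain size N.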
Testing Properties of Multiple Distributions with Few Samples
We propose a new setting for testing properties of distributions while
receiving samples from several distributions, but few samples per distribution.
Given samples from s distributions, p_1, p_2, \ldots, p_s, we design
testers for the following problems: (1) Uniformity Testing: Testing whether all
the p_i's are uniform or \epsilon-far from being uniform in
\ell_1-distance, (2) Identity Testing: Testing whether all the p_i's are
equal to an explicitly given distribution q or \epsilon-far from q in
\ell_1-distance, and (3) Closeness Testing: Testing whether all the p_i's
are equal to a distribution q which we have sample access to, or
\epsilon-far from q in \ell_1-distance. By assuming an additional natural
condition about the source distributions, we provide sample optimal testers for
all of these problems. Comment: ITCS 202
Sharp Bounds for Generalized Uniformity Testing
We study the problem of generalized uniformity testing [BC17] of a
discrete probability distribution: Given samples from a probability
distribution p over an unknown discrete domain \Omega, we
want to distinguish, with probability at least 2/3, between the case that
p is uniform on some subset of \Omega versus \epsilon-far, in
total variation distance, from any such uniform distribution.
We establish tight bounds on the sample complexity of generalized uniformity
testing. In more detail, we present a computationally efficient tester whose
sample complexity is optimal, up to constant factors, and a matching
information-theoretic lower bound. Specifically, we show that the sample
complexity of generalized uniformity testing is
Generalized uniformity testing
In this work, we revisit the problem of uniformity testing of discrete probability distributions. A fundamental problem in distribution testing, testing uniformity over a known domain has been addressed in a significant line of work, and is by now fully understood. The complexity of deciding whether an unknown distribution is uniform over its unknown (and arbitrary) support, however, is much less clear. Yet this task arises as soon as no prior knowledge of the domain is available, or whenever the samples originate from an unknown and unstructured universe. In this work, we introduce and study this generalized uniformity testing question, and establish nearly tight upper and lower bounds showing that – quite surprisingly – its sample complexity significantly differs from the known-domain case. Moreover, our algorithm is intrinsically adaptive, in contrast to the overwhelming majority of known distribution testing algorithms.
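One quantity that is natural to look at when the support is unknown is the collision probability: a distribution that is uniform over a subset of size k has collision probability exactly 1/k. The sketch below estimates it from samples; it is a hedged illustration of that statistic only, not the adaptive algorithm described in the abstract, and the name `collision_rate` is hypothetical.

```python
from collections import Counter


def collision_rate(samples):
    """Fraction of unordered sample pairs that land on the same outcome.

    Assumption: samples is a list of at least two hashable outcomes.
    This is an unbiased estimate of the collision probability
    sum_x p(x)^2, which equals 1/k when p is uniform over k elements.
    """
    m = len(samples)
    if m < 2:
        raise ValueError("need at least two samples")
    counts = Counter(samples)
    # An outcome seen c times contributes c*(c-1)/2 colliding pairs.
    colliding_pairs = sum(c * (c - 1) // 2 for c in counts.values())
    total_pairs = m * (m - 1) // 2
    return colliding_pairs / total_pairs
```

The statistic uses only which samples coincide, so it needs no prior knowledge of the domain.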
- …