Testing Properties of Multiple Distributions with Few Samples
We propose a new setting for testing properties of distributions while
receiving samples from several distributions, but few samples per distribution.
Given samples from $s$ distributions, $p_1, p_2, \ldots, p_s$, we design
testers for the following problems: (1) Uniformity Testing: Testing whether all
the $p_i$'s are uniform or $\epsilon$-far from being uniform in
$\ell_1$-distance, (2) Identity Testing: Testing whether all the $p_i$'s are
equal to an explicitly given distribution $q$ or $\epsilon$-far from $q$ in
$\ell_1$-distance, and (3) Closeness Testing: Testing whether all the $p_i$'s
are equal to a distribution $q$ which we have sample access to, or
$\epsilon$-far from $q$ in $\ell_1$-distance. By assuming an additional natural
condition about the source distributions, we provide sample optimal testers for
all of these problems.
Comment: ITCS 202
New Results on Quantum Property Testing
We present several new examples of speed-ups obtainable by quantum algorithms
in the context of property testing. First, motivated by sampling algorithms, we
consider probability distributions given in the form of an oracle
$f:[n]\to[m]$. Here the probability $\PP_f(j)$ of an outcome $j\in[m]$ is the
fraction of its domain that $f$ maps to $j$. We give quantum algorithms for
testing whether two such distributions are identical or $\epsilon$-far in
$L_1$-norm. Recently, Bravyi, Hassidim, and Harrow \cite{BHH10} showed that if
$\PP_f$ and $\PP_g$ are both unknown (i.e., given by oracles $f$ and $g$), then
this testing can be done in roughly $\sqrt{m}$ quantum queries to the
functions. We consider the case where the second distribution is known, and
show that testing can be done with roughly $m^{1/3}$ quantum queries, which we
prove to be essentially optimal. In contrast, it is known that classical
testing algorithms need about $m^{2/3}$ queries in the unknown-unknown case and
about $\sqrt{m}$ queries in the known-unknown case. Based on this result, we
also reduce the query complexity of graph isomorphism testers with quantum
oracle access. While those examples provide polynomial quantum speed-ups, our
third example gives a much larger improvement (constant quantum queries vs
polynomial classical queries) for the problem of testing periodicity, based on
Shor's algorithm and a modification of a classical lower bound by Lachish and
Newman \cite{lachish&newman:periodicity}. This provides an alternative to a
recent constant-vs-polynomial speed-up due to Aaronson \cite{aaronson:bqpph}.
Comment: 2nd version: updated some references, in particular to Aaronson's
Fourier checking proble
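To make the oracle model above concrete, the following Python sketch computes the distribution induced by a function classically, by reading the whole function table. It only illustrates the definition of $\PP_f$ and the $L_1$ distance between two induced distributions; it is not one of the quantum algorithms from the paper. The names `induced_distribution` and `l1_distance` and the 0-indexed domain are assumptions made here.

```python
from collections import Counter
from fractions import Fraction


def induced_distribution(f, n):
    """Return P_f as a dict: P_f[j] = |{x in range(n) : f(x) == j}| / n.

    Assumption: the domain [n] is represented as range(n) (0-indexed).
    """
    counts = Counter(f(x) for x in range(n))
    return {j: Fraction(c, n) for j, c in counts.items()}


def l1_distance(p, q):
    """L1 distance between two distributions given as outcome -> probability dicts."""
    support = set(p) | set(q)
    return sum(abs(p.get(j, 0) - q.get(j, 0)) for j in support)


# Two toy oracles on a domain of size 6 (hypothetical examples).
f = lambda x: x % 3
g = lambda x: x % 2

p_f = induced_distribution(f, 6)
p_g = induced_distribution(g, 6)
print(p_f, p_g, l1_distance(p_f, p_g))
```

This brute-force version makes $n$ queries to each oracle; the point of the results quoted above is that a tester can decide between "identical" and "$\epsilon$-far" with far fewer queries.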
Succinct quantum testers for closeness and $k$-wise uniformity of probability distributions
We explore potential quantum speedups for the fundamental problem of testing
the properties of closeness and $k$-wise uniformity of probability
distributions.
\textit{Closeness testing} is the problem of distinguishing whether two
$n$-dimensional distributions are identical or at least $\varepsilon$-far in
$\ell^1$- or $\ell^2$-distance. We show that the quantum query complexities for
$\ell^1$- and $\ell^2$-closeness testing are $O(\sqrt{n}/\varepsilon)$ and
$O(1/\varepsilon)$, respectively, both of which achieve optimal dependence
on $\varepsilon$, improving the prior best results of
\hyperlink{cite.gilyen2019distributional}{Gily{\'e}n and Li~(2019)}.
\textit{$k$-wise uniformity testing} is the problem of distinguishing whether
a distribution over $\{0, 1\}^n$ is uniform when restricted to any $k$
coordinates or $\varepsilon$-far from any such distribution. We propose the
first quantum algorithm for this problem with query complexity
$O(\sqrt{n^k}/\varepsilon)$, achieving a quadratic speedup over the
state-of-the-art classical algorithm with sample complexity
$O(n^k/\varepsilon^2)$ by \hyperlink{cite.o2018closeness}{O'Donnell and
Zhao (2018)}. Moreover, when $k = 2$ our quantum algorithm outperforms any
classical one because of the classical lower bound
$\Omega(n/\varepsilon^2)$.
All our quantum algorithms are fairly simple and time-efficient, using only
basic quantum subroutines such as amplitude estimation.
Comment: We have added the proof of lower bounds and have polished the
languag
Comparison Graphs: A Unified Method for Uniformity Testing
Distribution testing can be described as follows: samples are being drawn
from some unknown distribution $P$ over a known domain $[n]$. After the
sampling process, a decision must be made about whether $P$ holds some
property, or is far from it. The most studied problem in the field is arguably
uniformity testing, where one needs to distinguish the case that $P$ is uniform
over $[n]$ from the case that $P$ is $\epsilon$-far from being uniform (in
$\ell_1$). In the classic model, it is known that
$\Theta\left(\sqrt{n}/\epsilon^2\right)$ samples are necessary and sufficient
for this task. This problem was recently considered in various restricted
models that pose, for example, communication or memory constraints. On more
than one occasion, the known optimal solution boils down to counting collisions
among the drawn samples (each pair of samples with the same value adds one to
the count), an idea that dates back to the first uniformity tester and is
known as the "collision-based tester".
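The collision-counting idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the tester analyzed in the paper. The acceptance threshold is an assumption chosen here as the midpoint between the expected collision rate under uniformity, $1/n$, and the lower bound $(1+\epsilon^2)/n$ that holds for any distribution that is $\epsilon$-far from uniform in $\ell_1$ (by Cauchy-Schwarz).

```python
from collections import Counter


def collision_count(samples):
    """Number of unordered pairs of samples that have the same value."""
    counts = Counter(samples)
    return sum(c * (c - 1) // 2 for c in counts.values())


def collision_tester(samples, n, eps):
    """Accept (return True) if the collision rate looks consistent with uniform.

    Assumption: the threshold (1 + eps^2 / 2) / n is the midpoint between the
    uniform collision probability 1/n and the eps-far lower bound (1 + eps^2)/n.
    """
    m = len(samples)
    pairs = m * (m - 1) // 2
    if pairs == 0:
        raise ValueError("need at least two samples")
    rate = collision_count(samples) / pairs
    return rate <= (1 + eps ** 2 / 2) / n


print(collision_count([1, 1, 2, 3, 3, 3]))
print(collision_tester(list(range(10)), n=10, eps=0.5))
print(collision_tester([0] * 10, n=10, eps=0.5))
```

Counting via a histogram takes time linear in the number of samples, even though the statistic is defined over all pairs.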
In this paper, we introduce the notion of comparison graphs and use it to
formally define a generalized collision-based tester. Roughly speaking, the
edges of the graph tell the tester which pairs of samples should be
compared (that is, the original tester is induced by a clique, where all pairs
are compared). We prove a structural theorem that gives a sufficient
condition for a comparison graph to induce a good uniformity tester. As an
application, we develop a generic method to test uniformity, and devise
nearly-optimal uniformity testers under various computational constraints. We
improve and simplify a few known results, and introduce a new constrained model
in which the method also produces an efficient tester.
The idea behind our method is to translate computational constraints of a
certain model to ones on the comparison graph, which paves the way to finding a
good graph
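As a rough illustration of the comparison-graph idea, the Python sketch below counts collisions only along the edges of a given graph over sample indices. The clique recovers the classic collision count. The second graph, a union of disjoint cliques over consecutive blocks, is a hypothetical example chosen here to suggest how a constraint (such as holding only a few samples at a time) might shape the graph; it is not taken from the paper.

```python
from itertools import combinations


def graph_collisions(samples, edges):
    """Count edges (i, j) of the comparison graph with samples[i] == samples[j]."""
    return sum(1 for i, j in edges if samples[i] == samples[j])


def clique_edges(m):
    """All pairs of sample indices: induces the classic collision-based tester."""
    return list(combinations(range(m), 2))


def disjoint_clique_edges(m, block):
    """Hypothetical constrained graph: compare samples only within each block."""
    edges = []
    for start in range(0, m, block):
        indices = range(start, min(start + block, m))
        edges.extend(combinations(indices, 2))
    return edges


samples = [1, 3, 1, 3, 2, 3]
print(graph_collisions(samples, clique_edges(len(samples))))
print(graph_collisions(samples, disjoint_clique_edges(len(samples), block=3)))
```

A sparser graph compares fewer pairs, so it sees fewer collisions from the same samples. Which graphs still yield a good tester is what the structural theorem mentioned above addresses.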
Sample-Optimal Identity Testing with High Probability
We study the problem of testing identity against a given distribution with a focus on the high confidence regime. More precisely, given samples from an unknown distribution $p$ over $n$ elements, an explicitly given distribution $q$, and parameters $0 < \epsilon, \delta < 1$, we wish to distinguish, with probability at least $1-\delta$, whether the distributions are identical versus $\epsilon$-far in total variation distance. Most prior work focused on the case that $\delta = \Omega(1)$, for which the sample complexity of identity testing is known to be $\Theta(\sqrt{n}/\epsilon^2)$. Given such an algorithm, one can achieve arbitrarily small values of $\delta$ via black-box amplification, which multiplies the required number of samples by $\Theta(\log(1/\delta))$.
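The black-box amplification mentioned above is usually realized by repeating a constant-confidence tester on fresh samples and taking a majority vote. The Python sketch below shows that pattern. The function names, the callback interface, and the constant `c` in the number of repetitions are assumptions made for illustration and are not taken from the paper.

```python
import math


def amplified_tester(base_tester, draw_samples, delta, c=18.0):
    """Majority vote over O(log(1/delta)) independent runs of base_tester.

    base_tester(samples) -> bool is assumed to be correct with probability
    at least 2/3; draw_samples() must return a fresh batch on every call.
    The constant c is an illustrative assumption, not a tuned value.
    """
    if not 0 < delta < 1:
        raise ValueError("delta must lie in (0, 1)")
    reps = max(1, math.ceil(c * math.log(1.0 / delta)))
    if reps % 2 == 0:
        reps += 1  # an odd number of runs avoids ties
    accepts = sum(1 for _ in range(reps) if base_tester(draw_samples()))
    return 2 * accepts > reps


# Deterministic stubs, only to show the calling convention.
print(amplified_tester(lambda s: True, lambda: [], delta=0.01))
print(amplified_tester(lambda s: False, lambda: [], delta=0.01))
```

Each repetition consumes a full batch of samples, which is where the multiplicative $\log(1/\delta)$ factor in the sample complexity comes from.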
We show that black-box amplification is suboptimal for any $\delta = o(1)$, and give a new identity tester that achieves the optimal sample complexity. Our new upper and lower bounds show that the optimal sample complexity of identity testing is $\Theta\left(\frac{1}{\epsilon^2}\left(\sqrt{n \log(1/\delta)} + \log(1/\delta)\right)\right)$ for any $n$, $\epsilon$, and $\delta$. For the special case of uniformity testing, where the given distribution is the uniform distribution $U_n$ over the domain, our new tester is surprisingly simple: to test whether $p = U_n$ versus $d_{TV}(p, U_n) \geq \epsilon$, we simply threshold $d_{TV}(\hat{p}, U_n)$, where $\hat{p}$ is the empirical probability distribution. The fact that this simple "plug-in" estimator is sample-optimal is surprising, even in the constant $\delta$ case. Indeed, it was believed that such a tester would not attain sublinear sample complexity even for constant values of $\epsilon$ and $\delta$.
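The plug-in statistic described above is easy to compute. The following Python sketch evaluates the total variation distance between the empirical distribution and the uniform distribution, then compares it against a threshold. The abstract does not state the threshold, so it is left here as a caller-supplied parameter; the function names and the assumption that samples lie in `range(n)` are also choices made for this illustration.

```python
from collections import Counter


def empirical_tv_to_uniform(samples, n):
    """Total variation distance between the empirical distribution and U_n.

    Assumption: every sample is an integer in range(n).
    """
    m = len(samples)
    if m == 0:
        raise ValueError("need at least one sample")
    counts = Counter(samples)
    # Elements that appear contribute |count/m - 1/n| each.
    seen = sum(abs(c / m - 1 / n) for c in counts.values())
    # Elements that never appear contribute 1/n each.
    unseen = (n - len(counts)) / n
    return 0.5 * (seen + unseen)


def plug_in_tester(samples, n, threshold):
    """Accept (return True) if the plug-in statistic is at most threshold."""
    return empirical_tv_to_uniform(samples, n) <= threshold


print(empirical_tv_to_uniform([0, 1, 2, 3], n=4))
print(empirical_tv_to_uniform([0, 0, 0, 0], n=4))
```

With fewer samples than domain elements the statistic is far from zero even when $p = U_n$, because most elements are never observed. The threshold therefore has to be calibrated against the statistic's value under uniformity rather than set directly to $\epsilon$.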
An important contribution of this work lies in the analysis techniques that we introduce in this context. First, we exploit an underlying strong convexity property to bound from below the expectation gap in the completeness and soundness cases. Second, we give a new, fast method for obtaining provably correct empirical estimates of the true worst-case failure probability for a broad class of uniformity testing statistics over all possible input distributions, including all previously studied statistics for this problem. We believe that our novel analysis techniques will be useful for other distribution testing problems as well.