On active and passive testing
Given a property of Boolean functions, what is the minimum number of queries
required to determine with high probability if an input function satisfies this
property or is "far" from satisfying it? This is a fundamental question in
Property Testing, where traditionally the testing algorithm is allowed to pick
its queries among the entire set of inputs. Balcan, Blais, Blum and Yang
recently suggested restricting the tester to queries drawn from a smaller
random subset of the inputs, of polynomial size. This model is called active
testing, and in the extreme case when the size of the set we can query from is
exactly the number of queries performed it is known as passive testing.
We prove that passive or active testing of k-linear functions (that is, sums
of k variables among n over Z_2) requires Theta(k*log n) queries, assuming k is
not too large. This extends the case k=1 (that is, dictator functions)
analyzed by Balcan et al.
We also consider other classes of functions including low degree polynomials,
juntas, and partially symmetric functions. Our methods combine algebraic,
combinatorial, and probabilistic techniques, including the Talagrand
concentration inequality and the Erdos--Rado theorem on Delta-systems.
Comment: 16 pages
Testing Conditional Independence of Discrete Distributions
We study the problem of testing \emph{conditional independence} for discrete
distributions. Specifically, given samples from a discrete random variable
$(X, Y, Z)$ on domain $[\ell_1]\times[\ell_2]\times[n]$, we want to distinguish,
with probability at least $2/3$, between the case that $X$ and $Y$ are
conditionally independent given $Z$ from the case that $(X, Y, Z)$ is
$\epsilon$-far, in $\ell_1$-distance, from every distribution that has this
property. Conditional independence is a concept of central importance in
probability and statistics with a range of applications in various scientific
domains. As such, the statistical task of testing conditional independence has
been extensively studied in various forms within the statistics and
econometrics communities for nearly a century. Perhaps surprisingly, this
problem has not been previously considered in the framework of distribution
property testing and in particular no tester with sublinear sample complexity
is known, even for the important special case that the domains of $X$ and $Y$
are binary.
The main algorithmic result of this work is the first conditional
independence tester with {\em sublinear} sample complexity for discrete
distributions over $[\ell_1]\times[\ell_2]\times[n]$. To complement our upper
bounds, we prove information-theoretic lower bounds establishing that the
sample complexity of our algorithm is optimal, up to constant factors, for a
number of settings. Specifically, for the prototypical setting when $\ell_1, \ell_2 = O(1)$, we show that the sample complexity of testing conditional
independence (upper bound and matching lower bound) is
\[
\Theta\left(\max\left(n^{1/2}/\epsilon^2,\ \min\left(n^{7/8}/\epsilon,\ n^{6/7}/\epsilon^{8/7}\right)\right)\right).
\]
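To get a feel for which term of the bound above dominates in a given regime, the expression can be evaluated numerically. The following is only a sketch: the function name is hypothetical, and the constant factors hidden by the Theta notation are ignored, so the output is an order of magnitude rather than an actual sample count.

```python
def ci_sample_complexity_order(n, eps):
    """Evaluate max(n^(1/2)/eps^2, min(n^(7/8)/eps, n^(6/7)/eps^(8/7))).

    Hypothetical helper: constants hidden by Theta(.) are omitted.
    """
    term_a = n ** 0.5 / eps ** 2              # n^{1/2} / eps^2
    term_b = n ** (7 / 8) / eps               # n^{7/8} / eps
    term_c = n ** (6 / 7) / eps ** (8 / 7)    # n^{6/7} / eps^{8/7}
    return max(term_a, min(term_b, term_c))
```

For constant epsilon and large n the n^{6/7} term governs the bound, while for very small epsilon the n^{1/2}/epsilon^2 term takes over.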
Low-degree tests at large distances
We define tests of boolean functions which distinguish between linear (or
quadratic) polynomials, and functions which are very far, in an appropriate
sense, from these polynomials. The tests have optimal or nearly optimal
trade-offs between soundness and the number of queries.
In particular, we show that functions with small Gowers uniformity norms
behave ``randomly'' with respect to hypergraph linearity tests.
A central step in our analysis of quadraticity tests is the proof of an
inverse theorem for the third Gowers uniformity norm of boolean functions.
The last result also has a coding-theory application: it is possible to
estimate efficiently the distance from the second-order Reed-Muller code on
inputs lying far beyond its list-decoding radius.
Quantum query complexity of entropy estimation
Estimation of Shannon and R\'enyi entropies of unknown discrete distributions
is a fundamental problem in statistical property testing and an active research
topic in both theoretical computer science and information theory. Tight bounds
on the number of samples to estimate these entropies have been established in
the classical setting, while little is known about their quantum counterparts.
In this paper, we give the first quantum algorithms for estimating
$\alpha$-R\'enyi entropies (Shannon entropy being 1-R\'enyi entropy). In
particular, we demonstrate a quadratic quantum speedup for Shannon entropy
estimation and a generic quantum speedup for $\alpha$-R\'enyi entropy
estimation for all $\alpha\geq 0$, including a tight bound for the
collision-entropy (2-R\'enyi entropy). We also provide quantum upper bounds for
extreme cases such as the Hartley entropy (i.e., the logarithm of the support
size of a distribution, corresponding to $\alpha=0$) and the min-entropy case
(i.e., $\alpha=+\infty$), as well as the Kullback-Leibler divergence between
two distributions. Moreover, we complement our results with quantum lower
bounds on $\alpha$-R\'enyi entropy estimation for all $\alpha\geq 0$.
Comment: 43 pages, 1 figure
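For reference, the entropies named in this abstract can be computed directly when the distribution is known explicitly. The sketch below uses the standard classical definition of the Rényi entropy of order alpha, with the usual limiting cases for Shannon, Hartley and min-entropy. It is not the quantum estimation algorithm of the paper; the function name and the choice of base-2 logarithms are assumptions made for illustration.

```python
import math

def renyi_entropy(p, alpha):
    """Renyi entropy (in bits) of a known probability vector p.

    Hypothetical helper using the standard definition
    H_alpha(p) = log2(sum_i p_i^alpha) / (1 - alpha),
    with the limiting cases handled explicitly.
    """
    support = [x for x in p if x > 0]
    if alpha == 0:                      # Hartley entropy: log of support size
        return math.log2(len(support))
    if alpha == 1:                      # Shannon entropy (limit alpha -> 1)
        return -sum(x * math.log2(x) for x in support)
    if math.isinf(alpha):               # min-entropy (limit alpha -> infinity)
        return -math.log2(max(support))
    return math.log2(sum(x ** alpha for x in support)) / (1 - alpha)
```

On a uniform distribution every order gives the same value, the logarithm of the support size; on a non-uniform one the entropy is non-increasing in alpha.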
Lower bounds for adaptive linearity tests
Linearity tests are randomized algorithms which have oracle access to the
truth table of some function f, and are supposed to distinguish between linear
functions and functions which are far from linear. Linearity tests were first
introduced by Blum, Luby and Rubinfeld (1993), and were later used in the PCP
theorem, among other applications. The quality of a linearity test is described
by its correctness c (the probability that it accepts linear functions), its
soundness s (the probability that it accepts functions far from linear), and its
query complexity q (the number of queries it makes). Linearity tests have been
studied with the aim of decreasing the soundness while keeping
the query complexity small (for one reason, to improve PCP constructions).
Samorodnitsky and Trevisan (2000) constructed the
Complete Graph Test and proved that no Hyper Graph Test can perform better than
the Complete Graph Test. Later, Samorodnitsky and Trevisan (2006) proved,
among other results, that no non-adaptive linearity test can perform better
than the Complete Graph Test. Their proof uses the algebraic machinery of the
Gowers Norm. A result of Ben-Sasson, Harsha and Raskhodnikova (2005) allows
this lower bound to be generalized to adaptive linearity tests as well. We also
prove the same optimal lower bound for adaptive linearity tests, but our proof
technique is arguably simpler and more direct than the one used by
Samorodnitsky and Trevisan (2006). Like them, we also study the
behavior of linearity tests on quadratic functions. However, instead of
analyzing the Gowers Norm of certain functions, we provide a more direct
combinatorial proof, studying the behavior of linearity tests on random
quadratic functions.
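As a concrete illustration of what a linearity test looks like, here is a minimal sketch of the classical three-query test of Blum, Luby and Rubinfeld, repeated independently. It is not the Complete Graph Test, nor the lower-bound argument of this abstract. Inputs in {0,1}^n are encoded as Python integers, and the names `blr_test`, `parity_of` and `and_of_two_bits` are hypothetical.

```python
import random

def blr_test(f, n, trials, rng):
    """Accept iff f(x) + f(y) = f(x + y) over Z_2 on `trials` random pairs.

    f maps an n-bit integer to 0 or 1; x + y over Z_2^n is bitwise XOR.
    """
    for _ in range(trials):
        x = rng.getrandbits(n)
        y = rng.getrandbits(n)
        if f(x) ^ f(y) != f(x ^ y):
            return False  # found a violated linearity constraint
    return True

def parity_of(mask):
    """Linear function: sum over Z_2 of the variables selected by mask."""
    return lambda x: bin(x & mask).count("1") % 2

def and_of_two_bits(x):
    """Quadratic (non-linear) function x_1 * x_2."""
    return (x & 1) & ((x >> 1) & 1)
```

A linear function passes every trial, so the correctness of this test is 1. For the product of two variables a single trial rejects with probability 3/8, so repeating the test drives the acceptance probability down at the cost of three more queries per repetition.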