12,764 research outputs found

    On active and passive testing

    Given a property of Boolean functions, what is the minimum number of queries required to determine with high probability whether an input function satisfies this property or is "far" from satisfying it? This is a fundamental question in property testing, where traditionally the testing algorithm is allowed to pick its queries from the entire set of inputs. Balcan, Blais, Blum and Yang recently suggested restricting the tester to take its queries from a smaller random subset of the inputs, of polynomial size. This model is called active testing; in the extreme case where the size of the set we can query from is exactly the number of queries performed, it is known as passive testing. We prove that passive or active testing of k-linear functions (that is, sums of k variables among n over Z_2) requires Theta(k*log n) queries, assuming k is not too large. This extends the case k=1 (that is, dictator functions) analyzed by Balcan et al. We also consider other classes of functions, including low-degree polynomials, juntas, and partially symmetric functions. Our methods combine algebraic, combinatorial, and probabilistic techniques, including the Talagrand concentration inequality and the Erdős-Rado theorem on Delta-systems. Comment: 16 pages

    Testing Conditional Independence of Discrete Distributions

    We study the problem of testing \emph{conditional independence} for discrete distributions. Specifically, given samples from a discrete random variable $(X, Y, Z)$ on domain $[\ell_1]\times[\ell_2] \times [n]$, we want to distinguish, with probability at least $2/3$, between the case that $X$ and $Y$ are conditionally independent given $Z$ and the case that $(X, Y, Z)$ is $\epsilon$-far, in $\ell_1$-distance, from every distribution that has this property. Conditional independence is a concept of central importance in probability and statistics, with a range of applications in various scientific domains. As such, the statistical task of testing conditional independence has been extensively studied in various forms within the statistics and econometrics communities for nearly a century. Perhaps surprisingly, this problem has not been previously considered in the framework of distribution property testing, and in particular no tester with sublinear sample complexity is known, even for the important special case in which the domains of $X$ and $Y$ are binary. The main algorithmic result of this work is the first conditional independence tester with \emph{sublinear} sample complexity for discrete distributions over $[\ell_1]\times[\ell_2] \times [n]$. To complement our upper bounds, we prove information-theoretic lower bounds establishing that the sample complexity of our algorithm is optimal, up to constant factors, for a number of settings. Specifically, for the prototypical setting when $\ell_1, \ell_2 = O(1)$, we show that the sample complexity of testing conditional independence (upper bound and matching lower bound) is \[ \Theta\left(\max\left(n^{1/2}/\epsilon^2,\ \min\left(n^{7/8}/\epsilon,\ n^{6/7}/\epsilon^{8/7}\right)\right)\right). \]

    Low-degree tests at large distances

    We define tests of Boolean functions that distinguish between linear (or quadratic) polynomials and functions that are very far, in an appropriate sense, from these polynomials. The tests have optimal or nearly optimal trade-offs between soundness and the number of queries. In particular, we show that functions with small Gowers uniformity norms behave "randomly" with respect to hypergraph linearity tests. A central step in our analysis of quadraticity tests is the proof of an inverse theorem for the third Gowers uniformity norm of Boolean functions. This last result also has an application in coding theory: it is possible to efficiently estimate the distance from the second-order Reed-Muller code on inputs lying far beyond its list-decoding radius.

    Quantum query complexity of entropy estimation

    Estimation of Shannon and Rényi entropies of unknown discrete distributions is a fundamental problem in statistical property testing and an active research topic in both theoretical computer science and information theory. Tight bounds on the number of samples needed to estimate these entropies have been established in the classical setting, while little is known about their quantum counterparts. In this paper, we give the first quantum algorithms for estimating $\alpha$-Rényi entropies (Shannon entropy being the 1-Rényi entropy). In particular, we demonstrate a quadratic quantum speedup for Shannon entropy estimation and a generic quantum speedup for $\alpha$-Rényi entropy estimation for all $\alpha \geq 0$, including a tight bound for the collision entropy (2-Rényi entropy). We also provide quantum upper bounds for extreme cases such as the Hartley entropy (i.e., the logarithm of the support size of a distribution, corresponding to $\alpha = 0$) and the min-entropy (i.e., $\alpha = +\infty$), as well as the Kullback-Leibler divergence between two distributions. Moreover, we complement our results with quantum lower bounds on $\alpha$-Rényi entropy estimation for all $\alpha \geq 0$. Comment: 43 pages, 1 figure
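To make the entropy family named in this abstract concrete, here is a minimal classical sketch of the standard definition of the α-Rényi entropy and its special cases (Hartley at α = 0, Shannon at α = 1, min-entropy at α = +∞). It is not the quantum algorithm of the paper. The function names `renyi_entropy` and `empirical_distribution`, the base-2 logarithm, and the naive plug-in use of sample frequencies are all assumptions made for illustration.

```python
import math
from collections import Counter


def renyi_entropy(probs, alpha):
    """Alpha-Renyi entropy (in bits) of a discrete distribution.

    `probs` is an iterable of probabilities summing to 1 (assumed, not checked).
    Special cases: alpha=0 (Hartley), alpha=1 (Shannon), alpha=inf (min-entropy).
    """
    support = [p for p in probs if p > 0]
    if alpha == 0:
        # Hartley entropy: logarithm of the support size.
        return math.log2(len(support))
    if alpha == 1:
        # Shannon entropy, the limit of the Renyi entropy as alpha -> 1.
        return -sum(p * math.log2(p) for p in support)
    if alpha == math.inf:
        # Min-entropy: determined by the most likely outcome.
        return -math.log2(max(support))
    # General case: (1 / (1 - alpha)) * log2(sum_i p_i^alpha).
    return math.log2(sum(p ** alpha for p in support)) / (1 - alpha)


def empirical_distribution(samples):
    """Plug-in estimate of the distribution: observed relative frequencies."""
    counts = Counter(samples)
    total = len(samples)
    return [c / total for c in counts.values()]
```

For example, `renyi_entropy(empirical_distribution(samples), 2)` gives a naive plug-in estimate of the collision entropy from a list of samples. The sample-complexity bounds discussed in the abstract concern how many samples (or quantum queries) an estimator needs, which this sketch does not address.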

    Lower bounds for adaptive linearity tests

    Linearity tests are randomized algorithms that have oracle access to the truth table of some function f and are supposed to distinguish between linear functions and functions that are far from linear. Linearity tests were first introduced by (Blum, Luby and Rubinfeld, 1993), and were later used in the PCP theorem, among other applications. The quality of a linearity test is described by its correctness c (the probability that it accepts linear functions), its soundness s (the probability that it accepts functions far from linear), and its query complexity q (the number of queries it makes). Linearity tests have been studied with the goal of decreasing the soundness while keeping the query complexity small (for one reason, to improve PCP constructions). Samorodnitsky and Trevisan (Samorodnitsky and Trevisan 2000) constructed the Complete Graph Test and proved that no Hyper Graph Test can perform better than the Complete Graph Test. Later, in (Samorodnitsky and Trevisan 2006), they proved, among other results, that no non-adaptive linearity test can perform better than the Complete Graph Test. Their proof uses the algebraic machinery of the Gowers Norm. A result by (Ben-Sasson, Harsha and Raskhodnikova 2005) allows this lower bound to be generalized to adaptive linearity tests as well. We also prove the same optimal lower bound for adaptive linearity tests, but our proof technique is arguably simpler and more direct than the one used in (Samorodnitsky and Trevisan 2006). We also study, like (Samorodnitsky and Trevisan 2006), the behavior of linearity tests on quadratic functions. However, instead of analyzing the Gowers Norm of certain functions, we provide a more direct combinatorial proof, studying the behavior of linearity tests on random quadratic functions.
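As an illustration of what a linearity test looks like, the following is a minimal sketch of the basic three-query check of Blum, Luby and Rubinfeld, repeated several times. It is not the Complete Graph Test or the lower-bound argument of the abstract. Inputs in Z_2^n are encoded as n-bit integers, so addition over Z_2 is bitwise XOR; the names `blr_linearity_test` and `parity_on`, the `trials` parameter, and this encoding are assumptions made for the example.

```python
import random


def blr_linearity_test(f, n, trials=100, rng=None):
    """Repeat the three-query BLR check f(x) + f(y) = f(x + y) over Z_2^n.

    `f` maps an n-bit integer to 0 or 1. Returns True if every trial passes.
    Linear functions are always accepted; functions far from linear are
    rejected with probability that grows with `trials`.
    """
    rng = rng or random.Random()
    for _ in range(trials):
        x = rng.getrandbits(n)  # uniformly random point of Z_2^n
        y = rng.getrandbits(n)
        # Addition in Z_2^n is XOR on the integer encoding.
        if f(x) ^ f(y) != f(x ^ y):
            return False
    return True


def parity_on(mask):
    """Linear function: the sum over Z_2 of the input bits selected by `mask`."""
    return lambda x: bin(x & mask).count("1") % 2
```

A parity such as `parity_on(0b1011)` passes every trial, whereas a quadratic function such as the AND of two input bits violates the check on a constant fraction of pairs (x, y) and so is rejected after a few trials with high probability.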