Testing Conditional Independence of Discrete Distributions
We study the problem of testing \emph{conditional independence} for discrete
distributions. Specifically, given samples from a discrete random variable
$(X, Y, Z)$ on domain $[\ell_1]\times[\ell_2]\times[n]$, we want to distinguish,
with probability at least $2/3$, between the case that $X$ and $Y$ are
conditionally independent given $Z$ from the case that $(X, Y, Z)$ is
$\epsilon$-far, in $\ell_1$-distance, from every distribution that has this
property. Conditional independence is a concept of central importance in
probability and statistics with a range of applications in various scientific
domains. As such, the statistical task of testing conditional independence has
been extensively studied in various forms within the statistics and
econometrics communities for nearly a century. Perhaps surprisingly, this
problem has not been previously considered in the framework of distribution
property testing and in particular no tester with sublinear sample complexity
is known, even for the important special case that the domains of $X$ and $Y$
are binary.
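For reference, the testing problem described above can be written out as follows. This is only a sketch in notation assumed here: $p$ and $q$ for joint probability mass functions of $(X, Y, Z)$ and $\mathcal{P}$ for the set of conditionally independent distributions are not symbols taken from the abstract, and conventions differ on whether "far" uses a strict inequality.

```latex
% Sketch only: p, q, and \mathcal{P} are notation assumed for illustration.
% Conditional independence of X and Y given Z:
\[
  \Pr[X = i,\, Y = j \mid Z = z]
    = \Pr[X = i \mid Z = z]\,\Pr[Y = j \mid Z = z]
  \quad \text{for all } i, j \text{ and all } z \text{ with } \Pr[Z = z] > 0 .
\]
% \ell_1-distance between joint pmfs, and the "far" condition:
\[
  \|p - q\|_1 = \sum_{i, j, z} \bigl| p(i, j, z) - q(i, j, z) \bigr| ,
  \qquad
  p \text{ is } \epsilon\text{-far if }
  \inf_{q \in \mathcal{P}} \|p - q\|_1 > \epsilon .
\]
```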
The main algorithmic result of this work is the first conditional
independence tester with {\em sublinear} sample complexity for discrete
distributions over $[\ell_1]\times[\ell_2]\times[n]$. To complement our upper
bounds, we prove information-theoretic lower bounds establishing that the
sample complexity of our algorithm is optimal, up to constant factors, for a
number of settings. Specifically, for the prototypical setting when
$\ell_1, \ell_2 = O(1)$, we show that the sample complexity of testing conditional
independence (upper bound and matching lower bound) is
\[
\Theta\left({\max\left(n^{1/2}/\epsilon^2,\min\left(n^{7/8}/\epsilon,n^{6/7}/\epsilon^{8/7}\right)\right)}\right)\,.
\]
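To get a feel for which term of the bound above dominates, the expression inside $\Theta(\cdot)$ can be evaluated numerically. The following is a minimal sketch, not part of the paper: the function name `ci_rate` is hypothetical, and the hidden constant factors are ignored, so the returned value indicates only the growth rate, not an actual sample count.

```python
def ci_rate(n: int, eps: float) -> tuple[str, float]:
    """Evaluate max(n^(1/2)/eps^2, min(n^(7/8)/eps, n^(6/7)/eps^(8/7))).

    Hypothetical helper for illustration: constants hidden by Theta are
    dropped. Returns the label of the dominating term and its value.
    """
    if n < 1 or not (0.0 < eps <= 1.0):
        raise ValueError("expected n >= 1 and 0 < eps <= 1")

    term_a = ("n^(1/2)/eps^2", n ** 0.5 / eps ** 2)
    term_b = ("n^(7/8)/eps", n ** (7 / 8) / eps)
    term_c = ("n^(6/7)/eps^(8/7)", n ** (6 / 7) / eps ** (8 / 7))

    # Inner min first, then the outer max, exactly as in the formula.
    inner = min(term_b, term_c, key=lambda t: t[1])
    return max(term_a, inner, key=lambda t: t[1])


if __name__ == "__main__":
    for n, eps in [(10**6, 0.5), (100, 0.01)]:
        label, value = ci_rate(n, eps)
        print(f"n={n}, eps={eps}: dominated by {label}, rate ~ {value:.3g}")
```

For large $n$ and constant $\epsilon$ the $n^{6/7}/\epsilon^{8/7}$ term dominates and stays below $n$, while for very small $\epsilon$ relative to $n$ the $n^{1/2}/\epsilon^2$ term takes over.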