
13,514 research outputs found

Almost Optimal Distribution-Free Junta Testing

Author: Bshouty Nader H.
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 34th Computational Complexity Conference (CCC 2019)
Publication date: 01/01/2019

We consider the problem of testing whether an unknown $n$-variable Boolean function is a $k$-junta in the distribution-free property testing model, where the distance between functions is measured with respect to an arbitrary and unknown probability distribution over $\{0,1\}^n$. Chen, Liu, Servedio, Sheng and Xie [Zhengyang Liu et al., 2018] showed that distribution-free $k$-junta testing can be performed, with one-sided error, by an adaptive algorithm that makes $\tilde O(k^2)/\epsilon$ queries. In this paper, we give a simple two-sided error adaptive algorithm that makes $\tilde O(k/\epsilon)$ queries.
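To make the distribution-free notion of distance concrete, the following minimal sketch computes $\Pr_{x \sim D}[f(x) \neq g(x)]$ exactly for a tiny $n$. The functions `f` and `g`, the helper `dist`, and the two distributions are hypothetical illustrations, not taken from the paper. Here `g` differs from the 2-junta `f` only at the point (1,1,1), so the distance between them depends heavily on how much mass $D$ places there.

```python
from itertools import product


def dist(f, g, weights):
    """Return Pr_{x ~ D}[f(x) != g(x)], with D given as {point: probability}."""
    return sum(p for x, p in weights.items() if f(x) != g(x))


n = 3
points = list(product((0, 1), repeat=n))


def f(x):
    # A 2-junta: depends only on coordinates 0 and 1.
    return x[0] ^ x[1]


def g(x):
    # Agrees with f everywhere except at (1, 1, 1), so it depends on all 3 variables.
    return x[0] ^ x[1] ^ (x[0] & x[1] & x[2])


# Uniform distribution versus a distribution with half its mass on (1, 1, 1).
uniform = {x: 1 / len(points) for x in points}
skewed = {x: (0.5 if x == (1, 1, 1) else 0.5 / 7) for x in points}

print(dist(f, g, uniform))
print(dist(f, g, skewed))
```

A tester in this model only sees samples from $D$ and query answers, so it cannot assume that a disagreement region is small merely because it contains few points.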

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Near-Optimal Algorithm for Distribution-Free Junta Testing

Author: Zhang Xiaojin
Publication date: 13/07/2021

We present an adaptive algorithm with one-sided error for the problem of junta testing for Boolean functions under the challenging distribution-free setting, the query complexity of which is $\tilde O(k)/\epsilon$. This improves the upper bound of $\tilde O(k^2)/\epsilon$ by \cite{liu2019distribution}. From the $\Omega(k\log k)$ lower bound for junta testing under the uniform distribution by \cite{sauglam2018near}, our algorithm is nearly optimal. In the standard uniform distribution, the optimal junta testing algorithm is mainly designed by bridging between relevant variables and relevant blocks. At the heart of the analysis is the Efron-Stein orthogonal decomposition. However, it is not clear how to generalize this tool to the general setting. Surprisingly, we find that juntas can be tested in a very simple and efficient way even in the distribution-free setting. It is interesting that the analysis does not rely directly on the Fourier tools that are commonly used in junta testing. Further, we present a simpler algorithm with the same query complexity.
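As an illustration of why adaptivity helps without any Fourier machinery, here is a sketch of a standard building block used by many adaptive junta testers: given two inputs on which $f$ disagrees, a binary search over hybrid inputs locates a relevant variable with $O(\log n)$ queries. This is a generic textbook routine shown under that assumption; it is not claimed to be the algorithm of this paper, and the name `find_relevant_variable` is hypothetical.

```python
def find_relevant_variable(f, x, y):
    """Given f(x) != f(y), return an index i such that f depends on variable i.

    Invariant: f(x) == fx, f(y) != fx, and `diff` lists exactly the
    coordinates on which x and y differ. Each round halves `diff`.
    """
    x, y = list(x), list(y)
    fx = f(tuple(x))
    if fx == f(tuple(y)):
        raise ValueError("need f(x) != f(y)")
    diff = [i for i in range(len(x)) if x[i] != y[i]]
    while len(diff) > 1:
        mid = len(diff) // 2
        first, second = diff[:mid], diff[mid:]
        # Hybrid input: x with the first half of the differing coordinates taken from y.
        z = x[:]
        for i in first:
            z[i] = y[i]
        if f(tuple(z)) != fx:
            # The change already happens between x and z.
            y, diff = z, first
        else:
            # The change must happen between z and y.
            x, diff = z, second
    return diff[0]


def example(x):
    # A hypothetical 2-junta on 8 variables: depends on coordinates 2 and 5.
    return x[2] ^ x[5]


x0 = (0, 0, 0, 0, 0, 0, 0, 0)
y0 = (1, 1, 1, 1, 1, 0, 1, 1)
print(find_relevant_variable(example, x0, y0))
```

Since `x0` and `y0` agree on coordinate 5, the only relevant variable the search can return here is coordinate 2.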

arXiv.org e-Print Archive

Property Testing of Boolean Function

Author: Xie Jinyu
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2018

The field of property testing has been studied for decades, and Boolean functions are among the most classical subjects to study in this area. In this thesis we consider the property testing of Boolean functions: distinguishing whether an unknown Boolean function has a certain property (or equivalently, belongs to a certain class of functions), or is far from having this property. We study this problem under both the standard setting, where the distance between functions is measured with respect to the uniform distribution, and the distribution-free setting, where the distance is measured with respect to a fixed but unknown distribution. We obtain both new upper bounds and lower bounds for the query complexity of testing various properties of Boolean functions:
- Under the standard model of property testing, we prove a lower bound of $\Omega(n^{1/3})$ for the query complexity of any adaptive algorithm that tests whether an $n$-variable Boolean function is monotone, improving the previous best lower bound of $\Omega(n^{1/4})$ by Belovs and Blais in 2015. We also prove a lower bound of $\Omega(n^{2/3})$ for adaptive algorithms, and a lower bound of $\Omega(n)$ for non-adaptive algorithms with one-sided error, that test unateness, a natural generalization of monotonicity. The latter lower bound matches the previous upper bound proved by Chakrabarty and Seshadhri in 2016, up to poly-logarithmic factors of $n$.
- We also study the distribution-free testing of $k$-juntas, where a function is a $k$-junta if it depends on at most $k$ out of its $n$ input variables. The standard property testing of $k$-juntas under the uniform distribution is well understood: for adaptive testing of $k$-juntas the optimal query complexity is $\Theta(k)$, and for non-adaptive testing of $k$-juntas it is $\Theta(k^{3/2})$. Both bounds are tight up to poly-logarithmic factors of $k$. However, this problem is far from clear under the more general setting of distribution-free testing. Previous results only imply an $O(2^k)$-query algorithm for distribution-free testing of $k$-juntas, and besides lower bounds under the uniform distribution setting that naturally extend to this more general setting, no other results were known from the lower bound side. We significantly improve these results with an $O(k^2)$-query adaptive distribution-free tester for $k$-juntas, as well as an exponential lower bound of $\Omega(2^{k/3})$ for the query complexity of non-adaptive distribution-free testers for this problem. These results illustrate the hardness of distribution-free testing and also the significant role of adaptivity under this setting.
- Finally, we study distribution-free testing of other basic Boolean functions. Under the distribution-free setting, a lower bound of $\Omega(n^{1/5})$ was proved for testing of conjunctions, decision lists, and linear threshold functions by Glasner and Servedio in 2009, and an $O(n^{1/3})$-query algorithm for testing monotone conjunctions was shown by Dolev and Ron in 2011. Building on techniques developed in these two papers, we improve these lower bounds to $\Omega(n^{1/3})$, and specifically for the class of conjunctions we present an adaptive algorithm with query complexity $O(n^{1/3})$. Our lower and upper bounds are tight for testing conjunctions, up to poly-logarithmic factors of $n$.
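The definition of a $k$-junta used above can be checked by brute force when $n$ is tiny. The sketch below is only meant to pin down the definition: it reads all $2^n$ inputs, which is exactly the cost that property testers avoid. The helper names `relevant_variables` and `is_k_junta` and the example function are hypothetical.

```python
from itertools import product


def relevant_variables(f, n):
    """Return the set of indices i such that flipping x_i changes f(x) for some x."""
    relevant = set()
    for x in product((0, 1), repeat=n):
        for i in range(n):
            if i in relevant:
                continue
            flipped = x[:i] + (1 - x[i],) + x[i + 1:]
            if f(x) != f(flipped):
                relevant.add(i)
    return relevant


def is_k_junta(f, n, k):
    """f is a k-junta if it depends on at most k of its n input variables."""
    return len(relevant_variables(f, n)) <= k


def majority_of_three(x):
    # Hypothetical example on 5 variables: majority of coordinates 0, 2 and 4.
    return int(x[0] + x[2] + x[4] >= 2)


print(relevant_variables(majority_of_three, 5))
print(is_k_junta(majority_of_three, 5, 3), is_k_junta(majority_of_three, 5, 2))
```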

Optimal Bounds on Approximation of Submodular and XOS Functions by Juntas

Authors: Feldman Vitaly, Vondrak Jan
Publication date: 30/03/2015

We investigate the approximability of several classes of real-valued functions by functions of a small number of variables (juntas). Our main results are tight bounds on the number of variables required to approximate a function $f:\{0,1\}^n \rightarrow [0,1]$ within $\ell_2$-error $\epsilon$ over the uniform distribution:

1. If $f$ is submodular, then it is $\epsilon$-close to a function of $O(\frac{1}{\epsilon^2} \log \frac{1}{\epsilon})$ variables. This is an exponential improvement over previously known results. We note that $\Omega(\frac{1}{\epsilon^2})$ variables are necessary even for linear functions.

2. If $f$ is fractionally subadditive (XOS), it is $\epsilon$-close to a function of $2^{O(1/\epsilon^2)}$ variables. This result holds for all functions with low total $\ell_1$-influence and is a real-valued analogue of Friedgut's theorem for boolean functions. We show that $2^{\Omega(1/\epsilon)}$ variables are necessary even for XOS functions.

As applications of these results, we provide learning algorithms over the uniform distribution. For XOS functions, we give a PAC learning algorithm that runs in time $2^{poly(1/\epsilon)} poly(n)$. For submodular functions we give an algorithm in the more demanding PMAC learning model (Balcan and Harvey, 2011), which requires a multiplicative $1+\gamma$ factor approximation with probability at least $1-\epsilon$ over the target distribution. Our uniform distribution algorithm runs in time $2^{poly(1/(\gamma\epsilon))} poly(n)$. This is the first algorithm in the PMAC model that over the uniform distribution can achieve a constant approximation factor arbitrarily close to 1 for all submodular functions. As follows from the lower bounds in (Feldman et al., 2013), both of these algorithms are close to optimal. We also give applications for proper learning, testing and agnostic learning with value queries of these classes.

Comment: Extended abstract appears in proceedings of FOCS 201
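To illustrate what $\ell_2$-closeness to a junta means, the sketch below computes, for a fixed set $J$ of variables, the error of the best $\ell_2$ approximation of $f$ by a function of the variables in $J$ under the uniform distribution. That best approximation is the conditional average of $f$ given the coordinates in $J$, a standard fact. The helper `junta_projection_error` and the linear example are hypothetical, and the code enumerates all $2^n$ points, so it is only usable for tiny $n$; it is not the construction from the paper.

```python
from itertools import product
from math import sqrt


def junta_projection_error(f, n, J):
    """l2 distance (uniform distribution) from f to its average over variables outside J."""
    J = sorted(J)
    points = list(product((0, 1), repeat=n))

    # Group the values of f by the restriction of x to the coordinates in J.
    groups = {}
    for x in points:
        key = tuple(x[i] for i in J)
        groups.setdefault(key, []).append(f(x))
    means = {key: sum(vals) / len(vals) for key, vals in groups.items()}

    # Mean squared error between f and the conditional average.
    mse = sum((f(x) - means[tuple(x[i] for i in J)]) ** 2 for x in points) / len(points)
    return sqrt(mse)


n = 4


def linear(x):
    # A linear (hence submodular) function with values in [0, 1].
    return sum(x) / n


print(junta_projection_error(linear, n, {0, 1}))
print(junta_projection_error(linear, n, {0, 1, 2, 3}))
```

For this linear example, dropping two of the four variables leaves an error equal to the standard deviation of $(x_2 + x_3)/4$, which is $\sqrt{1/32}$.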

arXiv.org e-Print Archive

Learning and Testing Variable Partitions

Authors: Bogdanov Andrej, Wang Baoxiang
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 11th Innovations in Theoretical Computer Science Conference (ITCS 2020)
Publication date: 01/01/2020

Let $F$ be a multivariate function from a product set $\Sigma^n$ to an Abelian group $G$. A $k$-partition of $F$ with cost $\delta$ is a partition of the set of variables $\mathbf{V}$ into $k$ non-empty subsets $(\mathbf{X}_1, \dots, \mathbf{X}_k)$ such that $F(\mathbf{V})$ is $\delta$-close to $F_1(\mathbf{X}_1)+\dots+F_k(\mathbf{X}_k)$ for some $F_1, \dots, F_k$ with respect to a given error metric. We study algorithms for agnostically learning $k$-partitions and testing $k$-partitionability over various groups and error metrics given query access to $F$. In particular we show that:

1. Given a function that has a $k$-partition of cost $\delta$, a partition of cost $\mathcal{O}(k n^2)(\delta + \epsilon)$ can be learned in time $\tilde{\mathcal{O}}(n^2 \mathrm{poly} (1/\epsilon))$ for any $\epsilon > 0$. In contrast, for $k = 2$ and $n = 3$ learning a partition of cost $\delta + \epsilon$ is NP-hard.

2. When $F$ is real-valued and the error metric is the 2-norm, a 2-partition of cost $\sqrt{\delta^2 + \epsilon}$ can be learned in time $\tilde{\mathcal{O}}(n^5/\epsilon^2)$.

3. When $F$ is $\mathbb{Z}_q$-valued and the error metric is Hamming weight, $k$-partitionability is testable with one-sided error and $\mathcal{O}(kn^3/\epsilon)$ non-adaptive queries. We also show that even two-sided testers require $\Omega(n)$ queries when $k = 2$.

This work was motivated by reinforcement learning control tasks in which the set of control variables can be partitioned. The partitioning reduces the task into multiple lower-dimensional ones that are relatively easier to learn. Our second algorithm empirically increases the scores attained over previous heuristic partitioning methods applied in this context.

Comment: Innovations in Theoretical Computer Science (ITCS) 202
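The cost-0 case of a 2-partition can be checked exhaustively for small inputs. For a $\mathbb{Z}_q$-valued $F$ and a fixed reference point $r$, $F$ decomposes as $F_1(\mathbf{X}_A) + F_2(\mathbf{X}_B)$ exactly when $F(x) + F(r) = F(x_A, r_B) + F(r_A, x_B)$ for every $x$. The sketch below checks this identity by enumeration. It is an illustration of the definition only, not one of the learning or testing algorithms from the paper, and the name `is_additively_separable` and the example function are hypothetical.

```python
from itertools import product


def is_additively_separable(F, alphabet, n, A, q):
    """Check whether F: alphabet^n -> Z_q splits exactly as F1(x_A) + F2(x_B), B = complement of A."""
    A = set(A)
    r = (alphabet[0],) * n  # fixed reference point
    for x in product(alphabet, repeat=n):
        # Hybrids: x on A and r elsewhere, and r on A and x elsewhere.
        xa_rb = tuple(x[i] if i in A else r[i] for i in range(n))
        ra_xb = tuple(r[i] if i in A else x[i] for i in range(n))
        if (F(x) + F(r) - F(xa_rb) - F(ra_xb)) % q != 0:
            return False
    return True


q = 5
alphabet = (0, 1, 2)


def F(x):
    # Hypothetical example: variables 0 and 1 interact, variable 2 is additive.
    return (x[0] * x[1] + 3 * x[2]) % q


print(is_additively_separable(F, alphabet, 3, {0, 1}, q))
print(is_additively_separable(F, alphabet, 3, {0}, q))
```

In this example the partition $(\{0,1\}, \{2\})$ has cost 0, while $(\{0\}, \{1,2\})$ does not, because the product term couples variables 0 and 1.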

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server