    Almost Optimal Distribution-Free Junta Testing

    We consider the problem of testing whether an unknown n-variable Boolean function is a k-junta in the distribution-free property testing model, where the distance between functions is measured with respect to an arbitrary and unknown probability distribution over {0,1}^n. Chen, Liu, Servedio, Sheng and Xie [Zhengyang Liu et al., 2018] showed that distribution-free k-junta testing can be performed, with one-sided error, by an adaptive algorithm that makes O~(k^2)/epsilon queries. In this paper, we give a simple two-sided error adaptive algorithm that makes O~(k/epsilon) queries.
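
To make the distribution-free notion of distance concrete, here is a minimal Python sketch that estimates Pr_{x ~ D}[f(x) != g(x)] by sampling. It is an illustration only and not the tester from any of the papers listed here; the names `estimate_distance`, `sample_biased`, `sample_uniform`, and the example functions are hypothetical.

```python
import random

def estimate_distance(f, g, sample_from_d, num_samples=10_000, seed=0):
    """Monte Carlo estimate of Pr_{x ~ D}[f(x) != g(x)].

    `sample_from_d(rng)` is an assumed helper that draws one point of
    {0,1}^n from the distribution D using the given random generator.
    """
    rng = random.Random(seed)
    disagreements = 0
    for _ in range(num_samples):
        x = sample_from_d(rng)
        if f(x) != g(x):
            disagreements += 1
    return disagreements / num_samples

# Example on n = 4 variables: f depends on x0 and x1, g only on x0.
f = lambda x: x[0] ^ x[1]
g = lambda x: x[0]

def sample_uniform(rng):
    # Uniform distribution over {0,1}^4.
    return tuple(rng.randint(0, 1) for _ in range(4))

def sample_biased(rng):
    # A distribution supported only on points with x1 = 0.
    return (rng.randint(0, 1), 0, rng.randint(0, 1), rng.randint(0, 1))

dist_uniform = estimate_distance(f, g, sample_uniform)
dist_biased = estimate_distance(f, g, sample_biased)
print(dist_uniform, dist_biased)
```

Under the uniform distribution the two functions disagree on about half of the inputs, while under the biased distribution they never disagree. The same pair of functions can therefore be far apart in the standard model and at distance zero in the distribution-free model.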

    Near-Optimal Algorithm for Distribution-Free Junta Testing

    We present an adaptive algorithm with one-sided error for junta testing of Boolean functions in the challenging distribution-free setting, with query complexity $\tilde O(k)/\epsilon$. This improves the upper bound of $\tilde O(k^2)/\epsilon$ by \cite{liu2019distribution}. Given the $\Omega(k\log k)$ lower bound for junta testing under the uniform distribution by \cite{sauglam2018near}, our algorithm is nearly optimal. In the standard uniform-distribution setting, the optimal junta testing algorithm is mainly designed by bridging between relevant variables and relevant blocks. At the heart of the analysis is the Efron-Stein orthogonal decomposition. However, it is not clear how to generalize this tool to the general setting. Surprisingly, we find that juntas can be tested in a very simple and efficient way even in the distribution-free setting. Interestingly, the analysis does not rely directly on Fourier tools, which are commonly used in junta testing. Further, we present a simpler algorithm with the same query complexity.

    Property Testing of Boolean Function

    The field of property testing has been studied for decades, and Boolean functions are among the most classical subjects to study in this area. In this thesis we consider the property testing of Boolean functions: distinguishing whether an unknown Boolean function has some certain property (or equivalently, belongs to a certain class of functions), or is far from having this property. We study this problem under both the standard setting, where the distance between functions is measured with respect to the uniform distribution, as well as the distribution-free setting, where the distance is measured with respect to a fixed but unknown distribution. We obtain both new upper bounds and lower bounds for the query complexity of testing various properties of Boolean functions:
    - Under the standard model of property testing, we prove a lower bound of \Omega(n^{1/3}) for the query complexity of any adaptive algorithm that tests whether an n-variable Boolean function is monotone, improving the previous best lower bound of \Omega(n^{1/4}) by Belov and Blais in 2015. We also prove a lower bound of \Omega(n^{2/3}) for adaptive algorithms, and a lower bound of \Omega(n) for non-adaptive algorithms with one-sided errors that test unateness, a natural generalization of monotonicity. The latter lower bound matches the previous upper bound proved by Chakrabarty and Seshadhri in 2016, up to poly-logarithmic factors of n.
    - We also study the distribution-free testing of k-juntas, where a function is a k-junta if it depends on at most k out of its n input variables. The standard property testing of k-juntas under the uniform distribution has been well understood: it has been shown that, for adaptive testing of k-juntas the optimal query complexity is \Theta(k); and for non-adaptive testing of k-juntas it is \Theta(k^{3/2}). Both bounds are tight up to poly-logarithmic factors of k. However, this problem is far from clear under the more general setting of distribution-free testing. Previous results only imply an O(2^k)-query algorithm for distribution-free testing of k-juntas, and besides lower bounds under the uniform distribution setting that naturally extend to this more general setting, no other results were known from the lower bound side. We significantly improve these results with an O(k^2)-query adaptive distribution-free tester for k-juntas, as well as an exponential lower bound of \Omega(2^{k/3}) for the query complexity of non-adaptive distribution-free testers for this problem. These results illustrate the hardness of distribution-free testing and also the significant role of adaptivity under this setting.
    - In the end we also study distribution-free testing of other basic Boolean functions. Under the distribution-free setting, a lower bound of \Omega(n^{1/5}) was proved for testing of conjunctions, decision lists, and linear threshold functions by Glasner and Servedio in 2009, and an O(n^{1/3})-query algorithm for testing monotone conjunctions was shown by Dolev and Ron in 2011. Building on techniques developed in these two papers, we improve these lower bounds to \Omega(n^{1/3}), and specifically for the class of conjunctions we present an adaptive algorithm with query complexity O(n^{1/3}). Our lower and upper bounds are tight for testing conjunctions, up to poly-logarithmic factors of n.
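
The definition of a k-junta (a function depending on at most k of its n input variables) can be checked exactly for small n by brute force. The Python sketch below is an illustration of the definition only; it reads the whole truth table, so it uses exponentially many queries in n and is unrelated to the query-efficient testers described in these abstracts. The names `relevant_variables` and `is_k_junta` are hypothetical.

```python
from itertools import product

def relevant_variables(f, n):
    """Return the indices i for which flipping x_i changes f on some input."""
    relevant = set()
    for x in product((0, 1), repeat=n):
        for i in range(n):
            if i in relevant:
                continue  # already known to be relevant
            # y equals x with coordinate i flipped.
            y = x[:i] + (1 - x[i],) + x[i + 1:]
            if f(x) != f(y):
                relevant.add(i)
    return relevant

def is_k_junta(f, n, k):
    """True if f depends on at most k of its n variables."""
    return len(relevant_variables(f, n)) <= k

# Example: the AND of x0 and x2, viewed as a function of 5 variables.
and_02 = lambda x: x[0] & x[2]
print(relevant_variables(and_02, 5))
print(is_k_junta(and_02, 5, 2), is_k_junta(and_02, 5, 1))
```

A property tester only has to distinguish k-juntas from functions that are far from every k-junta, which is why it can get away with far fewer queries than this exact check.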

    Optimal Bounds on Approximation of Submodular and XOS Functions by Juntas

    We investigate the approximability of several classes of real-valued functions by functions of a small number of variables ({\em juntas}). Our main results are tight bounds on the number of variables required to approximate a function $f:\{0,1\}^n \rightarrow [0,1]$ within $\ell_2$-error $\epsilon$ over the uniform distribution:
    1. If $f$ is submodular, then it is $\epsilon$-close to a function of $O(\frac{1}{\epsilon^2} \log \frac{1}{\epsilon})$ variables. This is an exponential improvement over previously known results. We note that $\Omega(\frac{1}{\epsilon^2})$ variables are necessary even for linear functions.
    2. If $f$ is fractionally subadditive (XOS) it is $\epsilon$-close to a function of $2^{O(1/\epsilon^2)}$ variables. This result holds for all functions with low total $\ell_1$-influence and is a real-valued analogue of Friedgut's theorem for boolean functions. We show that $2^{\Omega(1/\epsilon)}$ variables are necessary even for XOS functions.
    As applications of these results, we provide learning algorithms over the uniform distribution. For XOS functions, we give a PAC learning algorithm that runs in time $2^{poly(1/\epsilon)} poly(n)$. For submodular functions we give an algorithm in the more demanding PMAC learning model (Balcan and Harvey, 2011) which requires a multiplicative $1+\gamma$ factor approximation with probability at least $1-\epsilon$ over the target distribution. Our uniform distribution algorithm runs in time $2^{poly(1/(\gamma\epsilon))} poly(n)$. This is the first algorithm in the PMAC model that over the uniform distribution can achieve a constant approximation factor arbitrarily close to 1 for all submodular functions. As follows from the lower bounds in (Feldman et al., 2013) both of these algorithms are close to optimal. We also give applications for proper learning, testing and agnostic learning with value queries of these classes. Comment: Extended abstract appears in proceedings of FOCS 201

    Learning and Testing Variable Partitions

    Let $F$ be a multivariate function from a product set $\Sigma^n$ to an Abelian group $G$. A $k$-partition of $F$ with cost $\delta$ is a partition of the set of variables $\mathbf{V}$ into $k$ non-empty subsets $(\mathbf{X}_1, \dots, \mathbf{X}_k)$ such that $F(\mathbf{V})$ is $\delta$-close to $F_1(\mathbf{X}_1)+\dots+F_k(\mathbf{X}_k)$ for some $F_1, \dots, F_k$ with respect to a given error metric. We study algorithms for agnostically learning $k$ partitions and testing $k$-partitionability over various groups and error metrics given query access to $F$. In particular we show that
    1. Given a function that has a $k$-partition of cost $\delta$, a partition of cost $\mathcal{O}(k n^2)(\delta + \epsilon)$ can be learned in time $\tilde{\mathcal{O}}(n^2 \mathrm{poly} (1/\epsilon))$ for any $\epsilon > 0$. In contrast, for $k = 2$ and $n = 3$ learning a partition of cost $\delta + \epsilon$ is NP-hard.
    2. When $F$ is real-valued and the error metric is the 2-norm, a 2-partition of cost $\sqrt{\delta^2 + \epsilon}$ can be learned in time $\tilde{\mathcal{O}}(n^5/\epsilon^2)$.
    3. When $F$ is $\mathbb{Z}_q$-valued and the error metric is Hamming weight, $k$-partitionability is testable with one-sided error and $\mathcal{O}(kn^3/\epsilon)$ non-adaptive queries. We also show that even two-sided testers require $\Omega(n)$ queries when $k = 2$.
    This work was motivated by reinforcement learning control tasks in which the set of control variables can be partitioned. The partitioning reduces the task into multiple lower-dimensional ones that are relatively easier to learn. Our second algorithm empirically increases the scores attained over previous heuristic partitioning methods applied in this context. Comment: Innovations in Theoretical Computer Science (ITCS) 202
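
As an illustration of what a cost-0 2-partition means for a Z_q-valued function, the Python sketch below uses the standard fact that F splits exactly as F_1(X_1) + F_2(X_2) if and only if F(x_1, x_2) + F(y_1, y_2) = F(x_1, y_2) + F(y_1, x_2) for all inputs x and y. It samples random pairs and reports how often this identity fails for a given candidate partition. This identity check is an assumption of the sketch and is not claimed to be the tester or learner from the paper; the names `mix` and `violation_rate` and the example functions are hypothetical.

```python
import random

def mix(x, y, part):
    """Take the coordinates in `part` from x and all others from y."""
    return tuple(x[i] if i in part else y[i] for i in range(len(x)))

def violation_rate(F, n, alphabet, part, q, trials=2000, seed=0):
    """Fraction of random pairs (x, y) violating the additive identity mod q."""
    rng = random.Random(seed)
    violations = 0
    for _ in range(trials):
        x = tuple(rng.choice(alphabet) for _ in range(n))
        y = tuple(rng.choice(alphabet) for _ in range(n))
        lhs = (F(x) + F(y)) % q
        rhs = (F(mix(x, y, part)) + F(mix(y, x, part))) % q
        if lhs != rhs:
            violations += 1
    return violations / trials

q = 5
alphabet = list(range(q))
part = {0, 1}  # candidate partition: X_1 = {0, 1}, X_2 = {2, 3}

# Splits along the partition: a term in (x0, x1) plus a term in (x2, x3).
separable = lambda x: (x[0] * x[1] + 3 * x[2] + x[3] * x[3]) % q
# Couples x0 with x2, so it does not split along this partition.
coupled = lambda x: (x[0] * x[2]) % q

rate_separable = violation_rate(separable, 4, alphabet, part, q)
rate_coupled = violation_rate(coupled, 4, alphabet, part, q)
print(rate_separable, rate_coupled)
```

The separable function never violates the identity, while the coupled one violates it on a large fraction of sampled pairs. A check of this kind only rejects when it finds a witness, which is the sense in which a tester has one-sided error.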