142 research outputs found

Optimal Bounds on Approximation of Submodular and XOS Functions by Juntas

Author: Feldman Vitaly
Vondrak Jan
Publication date: 30/03/2015

We investigate the approximability of several classes of real-valued functions by functions of a small number of variables (juntas). Our main results are tight bounds on the number of variables required to approximate a function $f:\{0,1\}^n \rightarrow [0,1]$ within $\ell_2$-error $\epsilon$ over the uniform distribution:

1. If $f$ is submodular, then it is $\epsilon$-close to a function of $O(\frac{1}{\epsilon^2} \log \frac{1}{\epsilon})$ variables. This is an exponential improvement over previously known results. We note that $\Omega(\frac{1}{\epsilon^2})$ variables are necessary even for linear functions.

2. If $f$ is fractionally subadditive (XOS), it is $\epsilon$-close to a function of $2^{O(1/\epsilon^2)}$ variables. This result holds for all functions with low total $\ell_1$-influence and is a real-valued analogue of Friedgut's theorem for boolean functions. We show that $2^{\Omega(1/\epsilon)}$ variables are necessary even for XOS functions.

As applications of these results, we provide learning algorithms over the uniform distribution. For XOS functions, we give a PAC learning algorithm that runs in time $2^{poly(1/\epsilon)} poly(n)$. For submodular functions we give an algorithm in the more demanding PMAC learning model (Balcan and Harvey, 2011), which requires a multiplicative $1+\gamma$ factor approximation with probability at least $1-\epsilon$ over the target distribution. Our uniform distribution algorithm runs in time $2^{poly(1/(\gamma\epsilon))} poly(n)$. This is the first algorithm in the PMAC model that over the uniform distribution can achieve a constant approximation factor arbitrarily close to 1 for all submodular functions. As follows from the lower bounds in (Feldman et al., 2013), both of these algorithms are close to optimal. We also give applications for proper learning, testing and agnostic learning with value queries of these classes.

Comment: Extended abstract appears in proceedings of FOCS 201
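The notion of being $\epsilon$-close to a junta in $\ell_2$ can be made concrete by brute force on a tiny example. The sketch below is only an illustration of the definition, not the construction from the paper: the helper names `average_out` and `squared_l2_distance` are hypothetical, and the junta is obtained by the simple (assumed) choice of averaging the function over the variables that are dropped.

```python
from fractions import Fraction
from itertools import product


def average_out(f, n, keep):
    """Return the junta g(x) = E[f] over the variables NOT in `keep`.

    g depends only on the coordinates listed in `keep` (an illustrative
    choice of junta; not claimed to be the paper's construction).
    """
    rest = [i for i in range(n) if i not in keep]

    def g(x):
        total = 0
        for bits in product((0, 1), repeat=len(rest)):
            y = list(x)
            for i, b in zip(rest, bits):
                y[i] = b  # overwrite the dropped coordinates
            total += f(tuple(y))
        return total / 2 ** len(rest)

    return g


def squared_l2_distance(f, g, n):
    """E[(f(x) - g(x))^2] over the uniform distribution on {0,1}^n."""
    points = list(product((0, 1), repeat=n))
    return sum((f(x) - g(x)) ** 2 for x in points) / len(points)


def linear(x):
    """A linear function into [0,1]: the fraction of ones in x."""
    return Fraction(sum(x), len(x))


# Keep only the first two of four variables.
junta = average_out(linear, 4, keep={0, 1})
error_squared = squared_l2_distance(linear, junta, 4)
```

Here `error_squared` is the squared $\ell_2$ error, so the function is $\epsilon$-close to this 2-junta for any $\epsilon$ with $\epsilon^2$ at least that value. Exact rational arithmetic is used so the comparison does not depend on floating-point rounding.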

arXiv.org e-Print Archive

Crossref

Top-Down Induction of Decision Trees: Rigorous Guarantees and Inherent Limitations

Author: Blanc Guy
Lange Jane
Tan Li-Yang
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 11th Innovations in Theoretical Computer Science Conference (ITCS 2020)
Publication date: 17/11/2019

Consider the following heuristic for building a decision tree for a function $f : \{0,1\}^n \to \{\pm 1\}$. Place the most influential variable $x_i$ of $f$ at the root, and recurse on the subfunctions $f_{x_i=0}$ and $f_{x_i=1}$ on the left and right subtrees respectively; terminate once the tree is an $\varepsilon$-approximation of $f$. We analyze the quality of this heuristic, obtaining near-matching upper and lower bounds:

Upper bound: For every $f$ with decision tree size $s$ and every $\varepsilon \in (0,\frac1{2})$, this heuristic builds a decision tree of size at most $s^{O(\log(s/\varepsilon)\log(1/\varepsilon))}$.

Lower bound: For every $\varepsilon \in (0,\frac1{2})$ and $s \le 2^{\tilde{O}(\sqrt{n})}$, there is an $f$ with decision tree size $s$ such that this heuristic builds a decision tree of size $s^{\tilde{\Omega}(\log s)}$.

We also obtain upper and lower bounds for monotone functions: $s^{O(\sqrt{\log s}/\varepsilon)}$ and $s^{\tilde{\Omega}(\sqrt[4]{\log s})}$ respectively. The lower bound disproves conjectures of Fiat and Pechyony (2004) and Lee (2009). Our upper bounds yield new algorithms for properly learning decision trees under the uniform distribution. We show that these algorithms, which are motivated by widely employed and empirically successful top-down decision tree learning heuristics such as ID3, C4.5, and CART, achieve provable guarantees that compare favorably with those of the current fastest algorithm (Ehrenfeucht and Haussler, 1989). Our lower bounds shed new light on the limitations of these heuristics. Finally, we revisit the classic work of Ehrenfeucht and Haussler. We extend it to give the first uniform-distribution proper learning algorithm that achieves polynomial sample and memory complexity, while matching its state-of-the-art quasipolynomial runtime.
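The heuristic described in this abstract can be sketched in a few lines for small $n$. This is a minimal brute-force illustration under stated assumptions, not the authors' implementation: it uses the exact stopping rule (recurse until the subfunction is constant, i.e. the $\varepsilon = 0$ case) instead of stopping at an $\varepsilon$-approximation, breaks influence ties by lowest index, and all function names are hypothetical.

```python
from itertools import product


def consistent_inputs(n, restriction):
    """Yield every x in {0,1}^n that agrees with the partial assignment."""
    free = [i for i in range(n) if i not in restriction]
    for bits in product((0, 1), repeat=len(free)):
        x = [0] * n
        for i, v in restriction.items():
            x[i] = v
        for i, v in zip(free, bits):
            x[i] = v
        yield tuple(x)


def influence(f, n, i, restriction):
    """Fraction of consistent inputs on which flipping x_i changes f."""
    points = list(consistent_inputs(n, restriction))
    flips = sum(f(x) != f(x[:i] + (1 - x[i],) + x[i + 1:]) for x in points)
    return flips / len(points)


def build_top_down(f, n, restriction=None):
    """Return a leaf value, or a tuple (variable, left_subtree, right_subtree).

    Assumption: exact variant, so recursion stops only at constant subfunctions.
    """
    restriction = dict(restriction or {})
    values = {f(x) for x in consistent_inputs(n, restriction)}
    if len(values) == 1:
        return values.pop()  # constant subfunction: emit a leaf
    free = [i for i in range(n) if i not in restriction]
    # Most influential free variable; max() keeps the lowest index on ties.
    root = max(free, key=lambda i: influence(f, n, i, restriction))
    left = build_top_down(f, n, {**restriction, root: 0})
    right = build_top_down(f, n, {**restriction, root: 1})
    return (root, left, right)


def num_leaves(tree):
    """Size of the tree, measured as its number of leaves."""
    if not isinstance(tree, tuple):
        return 1
    return num_leaves(tree[1]) + num_leaves(tree[2])


def and_of_first_two(x):
    """Example target: AND of x_0 and x_1, with values in {-1, +1}."""
    return 1 if x[0] and x[1] else -1


tree = build_top_down(and_of_first_two, 3)
```

On this example the irrelevant variable $x_2$ has influence zero and is never queried, so the resulting tree has three leaves. The brute-force influence computation takes time exponential in $n$ and is meant only to make the recursion explicit.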

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Approximate resilience, monotonicity, and the complexity of agnostic learning

Author: Dachman-Soled Dana
Feldman Vitaly
Tan Li-Yang
Wan Andrew
Wimmer Karl
Publication date: 09/07/2014

A function $f$ is $d$-resilient if all its Fourier coefficients of degree at most $d$ are zero, i.e., $f$ is uncorrelated with all low-degree parities. We study the notion of approximate resilience of Boolean functions, where we say that $f$ is $\alpha$-approximately $d$-resilient if $f$ is $\alpha$-close to a $[-1,1]$-valued $d$-resilient function in $\ell_1$ distance. We show that approximate resilience essentially characterizes the complexity of agnostic learning of a concept class $C$ over the uniform distribution. Roughly speaking, if all functions in a class $C$ are far from being $d$-resilient then $C$ can be learned agnostically in time $n^{O(d)}$, and conversely, if $C$ contains a function close to being $d$-resilient then agnostic learning of $C$ in the statistical query (SQ) framework of Kearns has complexity of at least $n^{\Omega(d)}$. This characterization is based on the duality between $\ell_1$ approximation by degree-$d$ polynomials and approximate $d$-resilience that we establish. In particular, it implies that $\ell_1$ approximation by low-degree polynomials, known to be sufficient for agnostic learning over product distributions, is in fact necessary.

Focusing on monotone Boolean functions, we exhibit the existence of near-optimal $\alpha$-approximately $\widetilde{\Omega}(\alpha\sqrt{n})$-resilient monotone functions for all $\alpha>0$. Prior to our work, it was conceivable even that every monotone function is $\Omega(1)$-far from any $1$-resilient function. Furthermore, we construct simple, explicit monotone functions based on Tribes and CycleRun that are close to highly resilient functions. Our constructions are based on a fairly general resilience analysis and amplification. These structural results, together with the characterization, imply nearly optimal lower bounds for agnostic learning of monotone juntas.
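The definition of (exact) $d$-resilience at the start of this abstract can be checked directly for small $n$. The sketch below is a brute-force illustration with hypothetical helper names; it assumes inputs in $\{0,1\}^n$, outputs in $\{-1,+1\}$, and the standard parity characters $\chi_S(x) = (-1)^{\sum_{i \in S} x_i}$. It does not attempt the approximate notion, which would require an optimization over $[-1,1]$-valued functions.

```python
from fractions import Fraction
from itertools import combinations, product


def fourier_coefficient(f, n, S):
    """E[f(x) * chi_S(x)] over uniform x in {0,1}^n, as an exact fraction."""
    total = 0
    for x in product((0, 1), repeat=n):
        chi = -1 if sum(x[i] for i in S) % 2 else 1
        total += f(x) * chi
    return Fraction(total, 2 ** n)


def is_resilient(f, n, d):
    """True if every Fourier coefficient of degree at most d is zero."""
    return all(
        fourier_coefficient(f, n, S) == 0
        for k in range(d + 1)
        for S in combinations(range(n), k)
    )


def parity3(x):
    """Parity of three bits, with values in {-1, +1}."""
    return -1 if sum(x) % 2 else 1


def majority3(x):
    """Majority of three bits (a monotone function), values in {-1, +1}."""
    return 1 if sum(x) >= 2 else -1
```

Parity on three bits is 2-resilient, since its only nonzero coefficient has degree 3. Majority on three bits is balanced, hence 0-resilient, but it correlates with each single variable and so is not 1-resilient.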

arXiv.org e-Print Archive

CiteSeerX

Crossref

DNF Sparsification and a Faster Deterministic Counting Algorithm

Author: Gopalan Parikshit
Meka Raghu
Reingold Omer
Publication date: 01/01/2012

Given a DNF formula $f$ on $n$ variables, the two natural size measures are the number of terms, or size $s(f)$, and the maximum width of a term, $w(f)$. It is folklore that short DNF formulas can be made narrow. We prove a converse, showing that narrow formulas can be sparsified. More precisely, any width-$w$ DNF, irrespective of its size, can be $\epsilon$-approximated by a width-$w$ DNF with at most $(w\log(1/\epsilon))^{O(w)}$ terms. We combine our sparsification result with the work of Luby and Velickovic to give a faster deterministic algorithm for approximately counting the number of satisfying solutions to a DNF. Given a formula on $n$ variables with $poly(n)$ terms, we give a deterministic $n^{\tilde{O}(\log \log(n))}$ time algorithm that computes an additive $\epsilon$ approximation to the fraction of satisfying assignments of $f$ for $\epsilon = 1/poly(\log n)$. The previous best result, due to Luby and Velickovic from nearly two decades ago, had a run-time of $n^{\exp(O(\sqrt{\log \log n}))}$.

Comment: To appear in the IEEE Conference on Computational Complexity, 201

arXiv.org e-Print Archive

CiteSeerX

Threshold Phenomena and Influence

Author: Gil Kalai
Muli Safra

Research Papers in Economics