167 research outputs found

Learning using Local Membership Queries

Author: Awasthi Pranjal
Feldman Vitaly
Kanade Varun
Publication date: 17/04/2013

We introduce a new model of membership query (MQ) learning, where the learning algorithm is restricted to query points that are \emph{close} to random examples drawn from the underlying distribution. The learning model is intermediate between the PAC model (Valiant, 1984) and the PAC+MQ model (where the queries are allowed to be arbitrary points). Membership query algorithms are not popular among machine learning practitioners. Apart from the obvious difficulty of adaptively querying labelers, it has also been observed that querying \emph{unnatural} points leads to increased noise from human labelers (Lang and Baum, 1992). This motivates our study of learning algorithms that make queries that are close to examples generated from the data distribution. We restrict our attention to functions defined on the $n$-dimensional Boolean hypercube and say that a membership query is local if its Hamming distance from some example in the (random) training data is at most $O(\log(n))$. We show the following results in this model: (i) The class of sparse polynomials (with coefficients in R) over $\{0,1\}^n$ is polynomial time learnable under a large class of \emph{locally smooth} distributions using $O(\log(n))$-local queries. This class also includes the class of $O(\log(n))$-depth decision trees. (ii) The class of polynomial-sized decision trees is polynomial time learnable under product distributions using $O(\log(n))$-local queries. (iii) The class of polynomial size DNF formulas is learnable under the uniform distribution using $O(\log(n))$-local queries in time $n^{O(\log(\log(n)))}$. (iv) In addition we prove a number of results relating the proposed model to the traditional PAC model and the PAC+MQ model.
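The locality condition in this abstract can be illustrated with a small sketch. It assumes points are represented as tuples of bits and takes the radius `r` as an explicit parameter standing in for the $O(\log(n))$ bound; the function names are hypothetical and are not taken from the paper.

```python
from itertools import combinations

def hamming_distance(x, y):
    """Number of coordinates where the two bit tuples differ."""
    return sum(a != b for a, b in zip(x, y))

def local_queries(example, r):
    """Yield every point of {0,1}^n within Hamming distance r of `example`."""
    n = len(example)
    for k in range(r + 1):
        for flips in combinations(range(n), k):
            point = list(example)
            for i in flips:
                point[i] ^= 1  # flip the chosen coordinate
            yield tuple(point)

def is_local(query, sample, r):
    """A query is r-local if it lies within distance r of some sample point."""
    return any(hamming_distance(query, x) <= r for x in sample)

# Illustrative training sample (made up for this sketch).
sample = [(0, 1, 1, 0, 1), (1, 1, 0, 0, 0)]
ball = list(local_queries(sample[0], 2))
print(len(ball))                              # 1 + 5 + 10 = 16
print(is_local((0, 0, 0, 0, 1), sample, 2))   # True
print(is_local((1, 0, 1, 1, 1), sample, 2))   # False
```

The number of allowed queries around one example grows as $\sum_{k \le r} \binom{n}{k}$, which is quasi-polynomial in $n$ when $r = O(\log(n))$, so a learner in this model cannot simply enumerate the whole ball in polynomial time.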

arXiv.org e-Print Archive

CiteSeerX

Learning Coverage Functions and Private Release of Marginals

Author: Feldman Vitaly
Kothari Pravesh
Publication date: 27/05/2014

We study the problem of approximating and learning coverage functions. A function $c: 2^{[n]} \rightarrow \mathbf{R}^{+}$ is a coverage function, if there exists a universe $U$ with non-negative weights $w(u)$ for each $u \in U$ and subsets $A_1, A_2, \ldots, A_n$ of $U$ such that $c(S) = \sum_{u \in \cup_{i \in S} A_i} w(u)$. Alternatively, coverage functions can be described as non-negative linear combinations of monotone disjunctions. They are a natural subclass of submodular functions and arise in a number of applications. We give an algorithm that for any $\gamma,\delta>0$, given random and uniform examples of an unknown coverage function $c$, finds a function $h$ that approximates $c$ within factor $1+\gamma$ on all but $\delta$-fraction of the points in time $poly(n,1/\gamma,1/\delta)$. This is the first fully-polynomial algorithm for learning an interesting class of functions in the demanding PMAC model of Balcan and Harvey (2011). Our algorithms are based on several new structural properties of coverage functions. Using the results in (Feldman and Kothari, 2014), we also show that coverage functions are learnable agnostically with excess $\ell_1$-error $\epsilon$ over all product and symmetric distributions in time $n^{\log(1/\epsilon)}$. In contrast, we show that, without assumptions on the distribution, learning coverage functions is at least as hard as learning polynomial-size disjoint DNF formulas, a class of functions for which the best known algorithm runs in time $2^{\tilde{O}(n^{1/3})}$ (Klivans and Servedio, 2004). As an application of our learning results, we give simple differentially-private algorithms for releasing monotone conjunction counting queries with low average error. In particular, for any $k \leq n$, we obtain private release of $k$-way marginals with average error $\bar{\alpha}$ in time $n^{O(\log(1/\bar{\alpha}))}$.
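The definition $c(S) = \sum_{u \in \cup_{i \in S} A_i} w(u)$ can be evaluated directly. The following is a minimal sketch of that definition only, not of the learning algorithm; the universe, weights, and subsets are made up for illustration, and the names `coverage`, `sets`, and `weight` are hypothetical.

```python
def coverage(S, sets, weight):
    """Evaluate c(S): total weight of the union of the subsets A_i for i in S."""
    covered = set()
    for i in S:
        covered |= sets[i]
    return sum(weight[u] for u in covered)

# Illustrative universe U = {a, b, c} with non-negative weights w(u).
weight = {"a": 1.0, "b": 2.0, "c": 0.5}
# Illustrative subsets A_1, A_2, A_3 of U.
sets = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c"}}

print(coverage({1, 2}, sets, weight))   # 3.5
print(coverage({1}, sets, weight))      # 3.0
print(coverage({2, 3}, sets, weight))   # 2.5
print(coverage(set(), sets, weight))    # 0
```

On this instance the values are consistent with submodularity: for $S = \{1\}$ and $T = \{2\}$, $c(S \cup T) + c(S \cap T) = 3.5 \le c(S) + c(T) = 5.5$.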

arXiv.org e-Print Archive

CiteSeerX

Learning Unions of $\omega(1)$ -Dimensional Rectangles

Author: Alp Atıcı
Rocco A. Servedio
Publication venue: 'Elsevier BV'
Publication date: 26/06/2007

We consider the problem of learning unions of rectangles over the domain $[b]^n$, in the uniform distribution membership query learning setting, where both $b$ and $n$ are "large". We obtain poly$(n, \log b)$-time algorithms for the following classes:

- poly$(n \log b)$-way Majority of $O(\frac{\log(n \log b)} {\log \log(n \log b)})$-dimensional rectangles.
- Union of poly$(\log(n \log b))$ many $O(\frac{\log^2 (n \log b)} {(\log \log(n \log b) \log \log \log (n \log b))^2})$-dimensional rectangles.
- poly$(n \log b)$-way Majority of poly$(n \log b)$-Or of disjoint $O(\frac{\log(n \log b)} {\log \log(n \log b)})$-dimensional rectangles.

Our main algorithmic tool is an extension of Jackson's boosting- and Fourier-based Harmonic Sieve algorithm [Jackson 1997] to the domain $[b]^n$, building on work of [Akavia, Goldwasser, Safra 2003]. Other ingredients used to obtain the results stated above are techniques from exact learning [Beimel, Kushilevitz 1998] and ideas from recent work on learning augmented $AC^{0}$ circuits [Jackson, Klivans, Servedio 2002] and on representing Boolean functions as thresholds of parities [Klivans, Servedio 2001].

Comment: 25 pages. Some corrections. Recipient of E. M. Gold award ALT 2006. To appear in Journal of Theoretical Computer Science
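To make the concept class concrete, here is a minimal sketch of evaluating a union of low-dimensional rectangles over $[b]^n$. It assumes $[b] = \{0, \ldots, b-1\}$ and represents a $k$-dimensional rectangle as a dictionary that constrains $k$ coordinates to inclusive intervals and leaves the rest free; this representation and the function names are illustrative assumptions, and the sketch says nothing about the learning algorithms themselves.

```python
def in_rectangle(x, rect):
    """rect maps a coordinate index to an inclusive interval (lo, hi).

    Coordinates that do not appear in rect are unconstrained, so a rectangle
    with k entries is k-dimensional in the sense used in this sketch.
    """
    return all(lo <= x[i] <= hi for i, (lo, hi) in rect.items())

def in_union(x, rects):
    """Target concept: x is labeled positive if any rectangle contains it."""
    return any(in_rectangle(x, r) for r in rects)

# Illustrative instance with b = 10 and n = 4: two 2-dimensional rectangles.
rects = [
    {0: (2, 5), 3: (0, 1)},
    {1: (7, 9), 2: (4, 4)},
]

print(in_union((3, 0, 0, 1), rects))  # True: inside the first rectangle
print(in_union((0, 8, 4, 9), rects))  # True: inside the second rectangle
print(in_union((6, 8, 5, 0), rects))  # False: outside both
```

A membership query in this setting corresponds to calling `in_union` on a point of the learner's choosing.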

arXiv.org e-Print Archive

CiteSeerX

Elsevier - Publisher Connector

Crossref

Approximate resilience, monotonicity, and the complexity of agnostic learning

Author: Dachman-Soled Dana
Feldman Vitaly
Tan Li-Yang
Wan Andrew
Wimmer Karl
Publication date: 09/07/2014

A function $f$ is $d$-resilient if all its Fourier coefficients of degree at most $d$ are zero, i.e., $f$ is uncorrelated with all low-degree parities. We study the notion of \emph{approximate resilience} of Boolean functions, where we say that $f$ is $\alpha$-approximately $d$-resilient if $f$ is $\alpha$-close to a $[-1,1]$-valued $d$-resilient function in $\ell_1$ distance. We show that approximate resilience essentially characterizes the complexity of agnostic learning of a concept class $C$ over the uniform distribution. Roughly speaking, if all functions in a class $C$ are far from being $d$-resilient then $C$ can be learned agnostically in time $n^{O(d)}$ and conversely, if $C$ contains a function close to being $d$-resilient then agnostic learning of $C$ in the statistical query (SQ) framework of Kearns has complexity of at least $n^{\Omega(d)}$. This characterization is based on the duality between $\ell_1$ approximation by degree-$d$ polynomials and approximate $d$-resilience that we establish. In particular, it implies that $\ell_1$ approximation by low-degree polynomials, known to be sufficient for agnostic learning over product distributions, is in fact necessary. Focusing on monotone Boolean functions, we exhibit the existence of near-optimal $\alpha$-approximately $\widetilde{\Omega}(\alpha\sqrt{n})$-resilient monotone functions for all $\alpha>0$. Prior to our work, it was conceivable even that every monotone function is $\Omega(1)$-far from any $1$-resilient function. Furthermore, we construct simple, explicit monotone functions based on ${\sf Tribes}$ and ${\sf CycleRun}$ that are close to highly resilient functions. Our constructions are based on a fairly general resilience analysis and amplification. These structural results, together with the characterization, imply nearly optimal lower bounds for agnostic learning of monotone juntas.
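The exact (not approximate) notion of $d$-resilience defined at the start of this abstract can be checked by brute force for small $n$. The sketch below assumes the standard convention $f: \{-1,1\}^n \to \{-1,1\}$ with $\hat{f}(S) = \mathbf{E}_x[f(x) \prod_{i \in S} x_i]$ under the uniform distribution; the function names are hypothetical, and the enumeration takes time exponential in $n$.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod

def fourier_coefficient(f, n, S):
    """Exact Fourier coefficient of f on the index set S, uniform over {-1,1}^n."""
    total = sum(f(x) * prod(x[i] for i in S)
                for x in product((-1, 1), repeat=n))
    return Fraction(total, 2 ** n)

def is_resilient(f, n, d):
    """True if every Fourier coefficient of degree at most d is zero."""
    return all(fourier_coefficient(f, n, S) == 0
               for k in range(d + 1)
               for S in combinations(range(n), k))

def parity3(x):
    return x[0] * x[1] * x[2]

def majority3(x):
    return 1 if sum(x) > 0 else -1

print(is_resilient(parity3, 3, 2))               # True
print(is_resilient(majority3, 3, 0))             # True: majority is balanced
print(is_resilient(majority3, 3, 1))             # False
print(fourier_coefficient(majority3, 3, (0,)))   # 1/2
```

Parity on three bits has its only nonzero coefficient at degree 3, so it is 2-resilient, whereas the monotone majority function already has nonzero degree-1 coefficients.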

arXiv.org e-Print Archive

CiteSeerX

Crossref

Efficiently Learning Monotone Decision Trees with ID3

Author: Thompson Pamela
Publication venue: Duquesne Scholarship Collection
Publication date: 01/04/2015

Since the Probably Approximately Correct (PAC) learning model was introduced in 1984, there has been much effort in designing computationally efficient algorithms for learning Boolean functions from random examples drawn from a uniform distribution. In this paper, I take the ID3 information-gain-first classification algorithm and apply it to the task of learning monotone Boolean functions from examples that are uniformly distributed over $\{0,1\}^n$. I limit my scope to the class of monotone Boolean functions that can be represented as read-2 width-2 disjunctive normal form expressions. I model these functions as graphs and examine each type of connected component contained in these models, i.e. path graphs and cycle graphs. I determine the influence of the variables in the pieces of these graph models in order to understand how ID3 behaves when learning these functions. My findings show that ID3 will produce an optimal decision tree for this class of Boolean functions.
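The influence computation mentioned in this abstract can be sketched by brute force for small instances. The sketch assumes one plausible reading of the graph model, in which variables are vertices and each width-2 term is an edge, and uses the standard definition of influence as the probability, over a uniform input, that flipping variable $i$ changes the function value. The function names are hypothetical and the thesis may define its model differently.

```python
from fractions import Fraction
from itertools import product

def dnf_value(x, terms):
    """Evaluate a monotone DNF: an OR over terms, each an AND of variables."""
    return any(all(x[i] for i in term) for term in terms)

def influence(terms, n, i):
    """Pr over uniform x in {0,1}^n that flipping bit i changes the DNF value."""
    changes = 0
    for x in product((0, 1), repeat=n):
        y = list(x)
        y[i] ^= 1
        if dnf_value(x, terms) != dnf_value(y, terms):
            changes += 1
    return Fraction(changes, 2 ** n)

# Path graph on 3 vertices: x0 x1 OR x1 x2 (the middle variable is read twice).
path_terms = [(0, 1), (1, 2)]
# Cycle graph on 3 vertices: x0 x1 OR x1 x2 OR x0 x2.
cycle_terms = [(0, 1), (1, 2), (0, 2)]

print([str(influence(path_terms, 3, i)) for i in range(3)])   # ['1/4', '3/4', '1/4']
print([str(influence(cycle_terms, 3, i)) for i in range(3)])  # ['1/2', '1/2', '1/2']
```

In the path example the middle variable has the largest influence, while in the cycle example all variables are symmetric.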

Duquesne University: Digital Commons