Search CORE

327 research outputs found

Efficiency versus Convergence of Boolean Kernels for On-Line Learning Algorithms

Author: Khardon R.
Roth D.
Servedio R. A.
Publication venue: 'AI Access Foundation'
Publication date: 09/09/2011
Field of study

The paper studies machine learning problems where each example is described using a set of Boolean features and where hypotheses are represented by linear threshold elements. One method of increasing the expressiveness of learned hypotheses in this context is to expand the feature set to include conjunctions of basic features. This can be done explicitly or where possible by using a kernel function. Focusing on the well known Perceptron and Winnow algorithms, the paper demonstrates a tradeoff between the computational efficiency with which the algorithm can be run over the expanded feature space and the generalization ability of the corresponding learning algorithm. We first describe several kernel functions which capture either limited forms of conjunctions or all conjunctions. We show that these kernels can be used to efficiently run the Perceptron algorithm over a feature space of exponentially many conjunctions; however we also show that using such kernels, the Perceptron algorithm can provably make an exponential number of mistakes even when learning simple functions. We then consider the question of whether kernel functions can analogously be used to run the multiplicative-update Winnow algorithm over an expanded feature space of exponentially many conjunctions. Known upper bounds imply that the Winnow algorithm can learn Disjunctive Normal Form (DNF) formulae with a polynomial mistake bound in this setting. However, we prove that it is computationally hard to simulate Winnows behavior for learning DNF over such a feature set. This implies that the kernel functions which correspond to running Winnow for this problem are not efficiently computable, and that there is no general construction that can run Winnow with kernels

arXiv.org e-Print Archive

Crossref

Tight Bounds on Proper Equivalence Query Learning of DNF

Author: Hellerstein Lisa
Kletenik Devorah
Sellie Linda
Servedio Rocco
Publication venue
Publication date: 01/01/2011
Field of study

We prove a new structural lemma for partial Boolean functions

f

, which we call the seed lemma for DNF. Using the lemma, we give the first subexponential algorithm for proper learning of DNF in Angluin's Equivalence Query (EQ) model. The algorithm has time and query complexity

2^{(\tilde{O}{\sqrt{n}})}

, which is optimal. We also give a new result on certificates for DNF-size, a simple algorithm for properly PAC-learning DNF, and new results on EQ-learning

\log n

-term DNF and decision trees

arXiv.org e-Print Archive

CiteSeerX

From average case complexity to improper learning complexity

Author: Berthet Q.
Daniely A.
Feige U.
Vapnik V. N.
Publication venue
Publication date: 01/01/2014
Field of study

The basic problem in the PAC model of computational learning theory is to determine which hypothesis classes are efficiently learnable. There is presently a dearth of results showing hardness of learning problems. Moreover, the existing lower bounds fall short of the best known algorithms. The biggest challenge in proving complexity results is to establish hardness of {\em improper learning} (a.k.a. representation independent learning).The difficulty in proving lower bounds for improper learning is that the standard reductions from

\mathbf{NP}

-hard problems do not seem to apply in this context. There is essentially only one known approach to proving lower bounds on improper learning. It was initiated in (Kearns and Valiant 89) and relies on cryptographic assumptions. We introduce a new technique for proving hardness of improper learning, based on reductions from problems that are hard on average. We put forward a (fairly strong) generalization of Feige's assumption (Feige 02) about the complexity of refuting random constraint satisfaction problems. Combining this assumption with our new technique yields far reaching implications. In particular, 1. Learning

\mathrm{DNF}

's is hard. 2. Agnostically learning halfspaces with a constant approximation ratio is hard. 3. Learning an intersection of

\omega(1)

halfspaces is hard.Comment: 34 page

arXiv.org e-Print Archive

CiteSeerX

Crossref

The Consistency dimension and distribution-dependent learning from queries

Author: Balcázar Navarro José Luis
Castro Rabal Jorge
Guijarro Guillem David
Publication venue
Publication date: 01/01/2000
Field of study

We prove a new combinatorial characterization of polynomial learnability from equivalence queries, and state some of its consequences relating the learnability of a class with the learnability via equivalence and membership queries of its subclasses obtained by restricting the instance space. Then we propose and study two models of query learning in which there is a probability distribution on the instance space, both as an application of the tools developed from the combinatorial characterization and as models of independent interest.Postprint (published version

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Quantum machine learning: a classical perspective

Author: Ben-David S
Bishop CM
Bottou L
Breuer H-P
Chiang C-F
Getoor L
Golub GH
Grötschel M
Khardon R
Lanckriet GR
Li S
Messiah A
Murphy KP
Papadimitriou CH
Rasmussen CE
Vapnik VN
Publication venue: 'The Royal Society'
Publication date: 01/01/2018
Field of study

Recently, increased computational power and data availability, as well as algorithmic advances, have led machine learning techniques to impressive results in regression, classification, data-generation and reinforcement learning tasks. Despite these successes, the proximity to the physical limits of chip fabrication alongside the increasing size of datasets are motivating a growing number of researchers to explore the possibility of harnessing the power of quantum computation to speed-up classical machine learning algorithms. Here we review the literature in quantum machine learning and discuss perspectives for a mixed readership of classical machine learning and quantum computation experts. Particular emphasis will be placed on clarifying the limitations of quantum algorithms, how they compare with their best classical counterparts and why quantum resources are expected to provide advantages for learning problems. Learning in the presence of noise and certain computationally hard problems in machine learning are identified as promising directions for the field. Practical questions, like how to upload classical data into quantum form, will also be addressed.Comment: v3 33 pages; typos corrected and references adde

arXiv.org e-Print Archive

Repository for Publications and Research Data

Crossref

UCL Discovery

MPG.PuRe

Top-Down Induction of Decision Trees: Rigorous Guarantees and Inherent Limitations

Author: Blanc Guy
Lange Jane
Tan Li-Yang
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 11th Innovations in Theoretical Computer Science Conference (ITCS 2020)
Publication date: 17/11/2019
Field of study

Consider the following heuristic for building a decision tree for a function

f : \{0,1\}^n \to \{\pm 1\}

. Place the most influential variable

x_i

f

at the root, and recurse on the subfunctions

f_{x_i=0}

and

f_{x_i=1}

on the left and right subtrees respectively; terminate once the tree is an

\varepsilon

-approximation of

f

. We analyze the quality of this heuristic, obtaining near-matching upper and lower bounds:

\circ

Upper bound: For every

f

with decision tree size

s

and every

\varepsilon \in (0,\frac1{2})

, this heuristic builds a decision tree of size at most

s^{O(\log(s/\varepsilon)\log(1/\varepsilon))}

\circ

Lower bound: For every

\varepsilon \in (0,\frac1{2})

and

s \le 2^{\tilde{O}(\sqrt{n})}

, there is an

f

with decision tree size

s

such that this heuristic builds a decision tree of size

s^{\tilde{\Omega}(\log s)}

. We also obtain upper and lower bounds for monotone functions:

s^{O(\sqrt{\log s}/\varepsilon)}

and

s^{\tilde{\Omega}(\sqrt[4]{\log s } )}

respectively. The lower bound disproves conjectures of Fiat and Pechyony (2004) and Lee (2009). Our upper bounds yield new algorithms for properly learning decision trees under the uniform distribution. We show that these algorithms---which are motivated by widely employed and empirically successful top-down decision tree learning heuristics such as ID3, C4.5, and CART---achieve provable guarantees that compare favorably with those of the current fastest algorithm (Ehrenfeucht and Haussler, 1989). Our lower bounds shed new light on the limitations of these heuristics. Finally, we revisit the classic work of Ehrenfeucht and Haussler. We extend it to give the first uniform-distribution proper learning algorithm that achieves polynomial sample and memory complexity, while matching its state-of-the-art quasipolynomial runtime

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Decision lists and related Boolean functions

Author: Eiter Thomas
Ibaraki Toshihide
Makino Kazuhisa
Publication venue: Elsevier Science B.V.
Publication date: 06/01/2002
Field of study

AbstractWe consider Boolean functions represented by decision lists, and study their relationships to other classes of Boolean functions. It turns out that the elementary class of 1-decision lists has interesting relationships to independently defined classes such as disguised Horn functions, read-once functions, nested differences of concepts, threshold functions, and 2-monotonic functions. In particular, 1-decision lists coincide with fragments of the mentioned classes. We further investigate the recognition problem for this class, as well as the extension problem in the context of partially defined Boolean functions (pdBfs). We show that finding an extension of a given pdBf in the class of 1-decision lists is possible in linear time. This improves on previous results. Moreover, we present an algorithm for enumerating all such extensions with polynomial delay

Elsevier - Publisher Connector