18 research outputs found
A Formal Proof of PAC Learnability for Decision Stumps
We present a formal proof in Lean of probably approximately correct (PAC)
learnability of the concept class of decision stumps. This classic result in
machine learning theory derives a bound on error probabilities for a simple
type of classifier. Though such a proof appears simple on paper, analytic and
measure-theoretic subtleties arise when carrying it out fully formally. Our
proof is structured so as to separate reasoning about deterministic properties
of a learning function from proofs of measurability and analysis of
probabilities.
Comment: 13 pages; appeared in Certified Programs and Proofs (CPP) 2021
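For orientation, the guarantee being formalized is the standard PAC
condition (the textbook form, not text quoted from the paper): for a
learning function A mapping samples to hypotheses,

```latex
\forall\, \varepsilon, \delta \in (0,1)\ \ \exists\, m(\varepsilon,\delta):
\qquad
\Pr_{S \sim D^{m}}\!\bigl[\operatorname{err}_{D}(A(S)) > \varepsilon\bigr] \,\le\, \delta,
```

where D is the unknown data distribution and err_D the true error. The
measure-theoretic subtleties mentioned above arise because the event
inside the probability must first be shown measurable for the statement
to be well defined.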
Overview of AdaBoost : Reconciling its views to better understand its dynamics
Boosting methods were introduced in the late 1980s, growing out of the
theoretical framework of PAC learning. The main idea of boosting is to
combine weak learners to obtain a strong learner. The weak learners are
obtained iteratively, by a heuristic that tries to correct the mistakes
of the previous weak learner. In 1995, Freund and Schapire [18]
introduced AdaBoost, a boosting algorithm that is still widely used
today. Since then, many views of the algorithm have been proposed to
better explain its dynamics. In this paper, we try to cover the views
one can take of AdaBoost, starting with the original view of Freund and
Schapire before covering the others and unifying them in a single
formalism. We hope this paper will help the non-expert reader better
understand the dynamics of AdaBoost and how the different views are
equivalent and related to each other.
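To make the iterative reweighting concrete, here is a minimal sketch of
the standard discrete AdaBoost loop; `weak_learn` is a hypothetical
callback standing in for any weak learner that accepts example weights,
and the code shows the textbook algorithm rather than any one of the
views unified in the paper.

```python
import numpy as np

def adaboost(X, y, weak_learn, T):
    """Sketch of discrete AdaBoost; y holds labels in {-1, +1}.

    weak_learn(X, y, w) is a hypothetical callback: it returns a
    classifier h with h(X) in {-1, +1}, trained on example weights w.
    """
    n = len(y)
    w = np.full(n, 1.0 / n)               # start from uniform weights
    ensemble = []
    for _ in range(T):
        h = weak_learn(X, y, w)
        pred = h(X)
        eps = w[pred != y].sum()           # weighted training error
        if eps >= 0.5:                     # no better than random: stop
            break
        eps = max(eps, 1e-12)              # guard against log(0)
        alpha = 0.5 * np.log((1.0 - eps) / eps)
        ensemble.append((alpha, h))
        w *= np.exp(-alpha * y * pred)     # upweight the mistakes
        w /= w.sum()                       # renormalize to a distribution
    return ensemble

def predict(ensemble, X):
    """Weighted-majority vote of the weak learners."""
    return np.sign(sum(alpha * h(X) for alpha, h in ensemble))
```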
Quantum Boosting using Domain-Partitioning Hypotheses
Boosting is an ensemble learning method that converts a weak learner into a
strong learner in the PAC learning framework. Freund and Schapire gave the
first classical boosting algorithm for binary hypotheses, known as AdaBoost, and
this was recently adapted into a quantum boosting algorithm by Arunachalam et
al. Their quantum boosting algorithm (which we refer to as Q-AdaBoost) is
quadratically faster than the classical version in terms of the VC-dimension of
the hypothesis class of the weak learner but polynomially worse in the bias of
the weak learner.
In this work we design a different quantum boosting algorithm that uses
domain partitioning hypotheses that are significantly more flexible than those
used in prior quantum boosting algorithms in terms of margin calculations. Our
algorithm Q-RealBoost is inspired by the "Real AdaBoost" (a.k.a. RealBoost)
extension to the original AdaBoost algorithm. Further, we show that Q-RealBoost
provides a polynomial speedup over Q-AdaBoost in terms of both the bias of the
weak learner and the time taken by the weak learner to learn the target concept
class.
Comment: 24 pages, 3 figures, 1 table
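For context, the domain-partitioning rule behind "Real AdaBoost"
(classically, due to Schapire and Singer; stated here for orientation
rather than taken from the quantum analysis) assigns each block X_j of
the partition the confidence-rated prediction

```latex
c_j \;=\; \frac{1}{2}\,\ln\frac{W_{+}^{\,j}}{W_{-}^{\,j}},
\qquad
W_{b}^{\,j} \;=\; \sum_{i \,:\, x_i \in X_j,\ y_i = b} D(i),
```

so a block contributes a large margin exactly when the weighted labels
inside it are one-sided; in practice both weights are smoothed by a
small constant to keep c_j finite.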
Machine Learning Techniques Applied to Telecommunication Data
Abstract pending - Machine Learning Techniques Applied to Telecommunications Data
Boosting Boosting
Machine learning is becoming prevalent in all aspects of our lives. For some applications, there is a need for simple but accurate white-box systems that are able to train efficiently and with little data.
"Boosting" is an intuitive method, combining many simple (possibly inaccurate) predictors to form a powerful, accurate classifier. Boosted classifiers are intuitive, easy to use, and exhibit the fastest speeds at test-time when implemented as a cascade. However, they have a few drawbacks: training decision trees is a relatively slow procedure, and from a theoretical standpoint, no simple unified framework for cost-sensitive multi-class boosting exists. Furthermore, (axis-aligned) decision trees may be inadequate in some situations, thereby stalling training; and even in cases where they are sufficiently useful, they don't capture the intrinsic nature of the data, as they tend to form boundaries that overfit.
My thesis focuses on remedying these three drawbacks of boosting.
Ch.III outlines a method (called QuickBoost) that trains identical classifiers an order of magnitude faster than before, based on a proof of a bound. In Ch.IV, a unified framework for cost-sensitive multi-class boosting (called REBEL) is proposed, both advancing theory and demonstrating empirical gains. Finally, Ch.V describes a novel family of weak learners (called Localized Similarities) that guarantees theoretical bounds and outperforms decision trees and Neural Nets (as well as several other commonly used classification methods) on a range of datasets.
The culmination of my work is an easy-to-use, fast-training, cost-sensitive multi-class boosting framework whose functionality is interpretable (since each weak learner is a simple comparison of similarity), and whose performance is better than that of Neural Networks and other competing methods. It is the tool that everyone should have in their toolbox and the first one they try.
An efficient boosting algorithm for combining preferences
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1999. Includes bibliographical references (p. 79-84). By Raj Dharmarajan Iyer, Jr., S.M.