8 research outputs found

    Guided progressive sampling to stabilize the learning curve

    Get PDF
    National audience
    One of the challenges of machine learning is to cope with ever-growing volumes of data. Although it is generally accepted that the larger the training set, the better the results, there are limits to the amount of information a learning algorithm can handle. To address this problem, we propose to improve the progressive sampling method by guiding the construction of a reduced training set drawn from a large dataset. Learning from the reduced set should yield performance similar to learning from the full set. The guidance of the sampling relies on a priori knowledge that accelerates the convergence of the algorithm. This approach has three advantages: 1) the reduced training set consists of the most representative cases of the full set; 2) the learning curve is stabilized; 3) convergence detection is accelerated. Applying this method to standard datasets and to data from intensive care units shows that a training set can be reduced significantly without degrading learning performance.
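
    The guided progressive sampling described above can be illustrated schematically: grow the training set in increments, but add the highest-scoring (most representative) unused examples first rather than sampling uniformly, and stop once the learning curve flattens. A minimal sketch, assuming scikit-learn is available; the representativeness scores, the base learner, and the convergence tolerance are illustrative placeholders, not the authors' actual criteria.

        import numpy as np
        from sklearn.tree import DecisionTreeClassifier
        from sklearn.model_selection import cross_val_score

        def guided_progressive_sampling(X, y, score, step=500, tol=1e-3):
            """Grow a reduced training set, adding the most representative
            unused examples first, until the learning curve stops improving
            by more than `tol`. `score` holds a priori representativeness."""
            order = np.argsort(-np.asarray(score))  # highest score first
            accs, n = [], step
            while n <= len(order):
                idx = order[:n]
                acc = cross_val_score(DecisionTreeClassifier(),
                                      X[idx], y[idx], cv=3).mean()
                accs.append(acc)
                if len(accs) >= 2 and abs(accs[-1] - accs[-2]) < tol:
                    break  # learning curve has converged
                n += step
            return order[:min(n, len(order))], accs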

    Double Trouble in Double Descent: Bias and Variance(s) in the Lazy Regime

    Full text link
    Deep neural networks can achieve remarkable generalization performance while interpolating the training data perfectly. Rather than the U-curve emblematic of the bias-variance trade-off, their test error often follows a "double descent" - a mark of the beneficial role of overparametrization. In this work, we develop a quantitative theory for this phenomenon in the so-called lazy learning regime of neural networks, by considering the problem of learning a high-dimensional function with random features regression. We obtain a precise asymptotic expression for the bias-variance decomposition of the test error, and show that the bias displays a phase transition at the interpolation threshold, beyond which it remains constant. We disentangle the variances stemming from the sampling of the dataset, from the additive noise corrupting the labels, and from the initialization of the weights. Following up on Geiger et al. (2019), we first show that the latter two contributions are the crux of the double descent: they lead to the overfitting peak at the interpolation threshold and to the decay of the test error upon overparametrization. We then quantify how they are suppressed by ensemble averaging the outputs of K independently initialized estimators. When K is sent to infinity, the test error remains constant beyond the interpolation threshold. We further compare the effects of overparametrizing, ensembling and regularizing. Finally, we present numerical experiments on classic deep learning setups to show that our results hold qualitatively in realistic lazy learning scenarios.
    Comment: 29 pages, 12 figures
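
    The ensembling effect the abstract describes can be sketched numerically: average the predictions of K random-features regressors that differ only in their random feature initialization, and the variance due to initialization shrinks with K. A minimal sketch, assuming NumPy; the ReLU feature map, the dimensions, and the ridge penalty are illustrative choices, not the paper's exact setup.

        import numpy as np

        def random_features_predict(X_tr, y_tr, X_te, p, lam=1e-6, seed=0):
            """Ridge regression on p random ReLU features, fixed random init."""
            rng = np.random.default_rng(seed)
            d = X_tr.shape[1]
            W = rng.normal(size=(d, p)) / np.sqrt(d)  # random first-layer weights
            Z, Zt = np.maximum(X_tr @ W, 0), np.maximum(X_te @ W, 0)
            a = np.linalg.solve(Z.T @ Z + lam * np.eye(p), Z.T @ y_tr)
            return Zt @ a

        def ensemble_predict(X_tr, y_tr, X_te, p, K=10):
            """Average K independently initialized estimators: the variance
            from initialization decays with K, as the paper quantifies."""
            preds = [random_features_predict(X_tr, y_tr, X_te, p, seed=k)
                     for k in range(K)]
            return np.mean(preds, axis=0)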

    Decision tree learning for the online steering of detection algorithms on electrocardiograms

    Get PDF
    National audience
    The number of signal processing algorithms (compression, pattern recognition, etc.) keeps growing, which makes it increasingly difficult to choose the algorithm best suited to a particular task. This is especially true for the automatic analysis of electrocardiograms (ECG), notably for the detection of QRS complexes. Although every algorithm in the literature behaves satisfactorily in normal situations, there are contexts in which one algorithm is better suited than the others, in particular in the presence of noise. We propose a selection method that chooses, online, the algorithm best suited to the current context of the signal being processed. The selection rules are learned by decision tree from the performance results of 7 algorithms tested in 130 different contexts. The results show the superiority of the proposed approach over the algorithms used separately. Moreover, the performance of the learned selection rules is very close to that of rules acquired from human expertise, which supports our approach.
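
    The online selection scheme reduces to an ordinary supervised problem: each signal context becomes a feature vector, the label is whichever detector scored best in that context, and a decision tree learns the selection rules. A minimal sketch, assuming scikit-learn; the random data and the context features (noise level, heart rate, baseline drift) are hypothetical stand-ins for the paper's 130 contexts and their descriptors.

        import numpy as np
        from sklearn.tree import DecisionTreeClassifier

        rng = np.random.default_rng(0)
        # contexts[i]: feature vector describing context i
        # (e.g. noise level, heart rate, baseline drift -- hypothetical)
        contexts = rng.random((130, 3))
        # perf[i, j]: detection score of algorithm j in context i
        perf = rng.random((130, 7))

        best_algo = perf.argmax(axis=1)  # label: best of the 7 detectors
        selector = DecisionTreeClassifier(max_depth=4).fit(contexts, best_algo)

        # Online use: describe the current signal segment, pick a detector.
        current = contexts[0:1]
        chosen = selector.predict(current)[0]
        print(f"use detector {chosen} for this segment")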

    Machine learning ensemble method for discovering knowledge from big data

    Get PDF
    Big data, generated from various business, internet, and social media activities, poses a major challenge to researchers in machine learning and data mining, who must develop new methods and techniques for analysing it effectively and efficiently. Ensemble methods are an attractive approach to mining large datasets because of their accuracy and their ability to exploit the divide-and-conquer mechanism in parallel computing environments. This research proposes a machine learning ensemble framework and implements it in a high performance computing environment. It begins by identifying and categorising the effects of partitioned data subset size on ensemble accuracy when dealing with very large training datasets. An algorithm is then developed to ascertain the patterns of the relationship between ensemble accuracy and the size of partitioned data subsets. The research concludes with the development of a selective modelling algorithm, an efficient alternative to static model selection methods for big datasets. The results show that maximising the size of partitioned data subsets does not necessarily improve the performance of an ensemble of classifiers trained on large datasets. Identifying the patterns exhibited by the relationship between ensemble accuracy and partitioned data subset size makes it possible to determine the best subset size for partitioning huge training datasets. Finally, traditional model selection is inefficient when large datasets are involved.
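
    The core experiment, relating ensemble accuracy to partition size, is straightforward to reproduce in miniature: split the training data into disjoint subsets of a given size, train one classifier per subset, combine by majority vote, and sweep the subset size. A minimal sketch, assuming scikit-learn; the synthetic dataset, the base learner, and the sizes swept are illustrative, not the thesis's setup.

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.tree import DecisionTreeClassifier

        X, y = make_classification(n_samples=20000, random_state=0)
        X_tr, y_tr = X[:16000], y[:16000]
        X_te, y_te = X[16000:], y[16000:]

        def ensemble_accuracy(subset_size):
            """One tree per disjoint subset; majority vote (binary labels)."""
            n_parts = len(X_tr) // subset_size
            votes = []
            for i in range(n_parts):
                s = slice(i * subset_size, (i + 1) * subset_size)
                tree = DecisionTreeClassifier(random_state=i)
                votes.append(tree.fit(X_tr[s], y_tr[s]).predict(X_te))
            majority = (np.mean(votes, axis=0) > 0.5).astype(int)
            return (majority == y_te).mean()

        for size in (500, 1000, 2000, 4000, 8000):
            # Larger subsets do not necessarily mean a better ensemble.
            print(size, round(ensemble_accuracy(size), 3))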

    Cooperative Training in Multiple Classifier Systems

    Get PDF
    Multiple classifier systems have been shown to be an effective technique for classification. The success of multiple classifiers does not depend entirely on the base classifiers and/or the aggregation technique. Other parameters, such as the training data, the feature attributes, and the correlation among the base classifiers, may also contribute to the success of multiple classifiers, and the interaction of these parameters with one another may affect their performance. In the present study, we examine some of these interactions and investigate their effects on the performance of classifier ensembles. The proposed research introduces a different direction in the field of multiple classifier systems: we attempt to understand and compare ensemble methods from the cooperation perspective. In this thesis, we narrow our focus to cooperation at the training level. We first develop measures to estimate the degree and type of cooperation among training data partitions. These evaluation measures enable us to evaluate the diversity and correlation among a set of disjoint and overlapped partitions. With the aid of properly selected measures and training information, we propose two new data partitioning approaches: Cluster, De-cluster, and Selection (CDS) and Cooperative Cluster, De-cluster, and Selection (CO-CDS). Finally, a comprehensive comparative study compares our proposed training approaches with several others in terms of robustness of usage, resultant classification accuracy, and classification stability. Experimental assessment of the CDS and CO-CDS training approaches validates their robustness relative to other training approaches. In addition, this study suggests that: 1) cooperation is generally beneficial, and 2) classifier ensembles that cooperate by sharing information have higher generalization ability than those that do not share training information.
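
    A partition-level measure of the kind the thesis develops can be approximated simply: train a classifier on each partition and compare how differently the resulting classifiers behave, e.g. via pairwise disagreement on held-out data. A minimal sketch, assuming scikit-learn; this generic diversity statistic is a stand-in for illustration, not the thesis's CDS/CO-CDS measures.

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.tree import DecisionTreeClassifier

        X, y = make_classification(n_samples=3000, random_state=1)
        X_tr, y_tr, X_ho = X[:2000], y[:2000], X[2000:]

        # Two disjoint training partitions.
        p1, p2 = slice(0, 1000), slice(1000, 2000)
        c1 = DecisionTreeClassifier(random_state=0).fit(X_tr[p1], y_tr[p1])
        c2 = DecisionTreeClassifier(random_state=0).fit(X_tr[p2], y_tr[p2])

        # Pairwise disagreement on held-out data: one generic way to
        # quantify the diversity that partition-level cooperation controls.
        disagreement = np.mean(c1.predict(X_ho) != c2.predict(X_ho))
        print(f"disagreement = {disagreement:.3f}")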

    Distributed learning with bagging-like performance

    No full text
    Bagging forms a committee of classifiers by bootstrap aggregation of training sets from a pool of training data. A simple alternative to bagging is to partition the data into disjoint subsets. Experiments with decision tree and neural network classifiers on various datasets show that, given the same size partitions and bags, disjoint partitions result in performance equivalent to, or better than, bootstrap aggregates (bags). Many applications (e.g., protein structure prediction) involve datasets that are too large to handle in the memory of the typical computer. Hence, bagging with samples the size of the data is impractical. Our results indicate that, in such applications, the simple approach of creating a committee of n classifiers from disjoint partitions, each of size 1/n (which will be memory resident during learning), in a distributed way results in a classifier which has a bagging-like performance gain. The use of distributed disjoint partitions in learning is significantly less complex and faster than bagging.
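
    The comparison in this abstract is easy to state in code: a bagged committee draws n bootstrap samples the size of the full training set, while the disjoint alternative simply splits the data into n memory-sized pieces; both aggregate by majority vote. A minimal sketch, assuming scikit-learn; the synthetic dataset and committee size are illustrative.

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.tree import DecisionTreeClassifier

        X, y = make_classification(n_samples=12000, random_state=0)
        X_tr, y_tr = X[:9000], y[:9000]
        X_te, y_te = X[9000:], y[9000:]
        n = 9  # committee size

        def vote_accuracy(classifiers):
            """Majority vote of binary predictions on the test set."""
            votes = np.array([c.predict(X_te) for c in classifiers])
            return ((votes.mean(axis=0) > 0.5) == y_te).mean()

        rng = np.random.default_rng(0)
        # Bagging: n bootstrap samples, each the size of the full data.
        bags = [rng.integers(0, len(X_tr), len(X_tr)) for _ in range(n)]
        bag_acc = vote_accuracy(
            [DecisionTreeClassifier().fit(X_tr[b], y_tr[b]) for b in bags])

        # Disjoint partitions: n pieces of size 1/n, each memory resident.
        parts = np.array_split(np.arange(len(X_tr)), n)
        part_acc = vote_accuracy(
            [DecisionTreeClassifier().fit(X_tr[p], y_tr[p]) for p in parts])

        print(f"bagging {bag_acc:.3f} vs disjoint partitions {part_acc:.3f}")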