
    QuicK-means: Acceleration of K-means by learning a fast transform

    K-means -- and the celebrated Lloyd algorithm -- is more than the clustering method it was originally designed to be. It has indeed proven pivotal in speeding up many machine learning and data analysis techniques such as indexing, nearest-neighbor search and prediction, data compression, and Radial Basis Function networks; its beneficial use has been shown to carry over to the acceleration of kernel machines (when using the Nyström method). Here, we propose a fast extension of K-means, dubbed QuicK-means, that rests on the idea of expressing the matrix of the $K$ centroids as a product of sparse matrices, a feat made possible by recent results devoted to finding approximations of matrices as products of sparse factors. Such a decomposition squashes the complexity of the matrix-vector product between the factorized $K \times D$ centroid matrix $\mathbf{U}$ and any vector from $\mathcal{O}(KD)$ to $\mathcal{O}(A \log A + B)$, with $A = \min(K, D)$ and $B = \max(K, D)$, where $D$ is the dimension of the training data. This drastic computational saving has a direct impact on the assignment of a point to a cluster, meaning that it is tangible not only at prediction time but also at training time, provided the factorization procedure is performed during Lloyd's algorithm. We show that resorting to a factorization step at each iteration does not impair the convergence of the optimization scheme and that, depending on the context, it may entail a reduction of the training time. Finally, we provide discussions and numerical simulations that show the versatility of our computationally-efficient QuicK-means algorithm.
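
    To make the core idea concrete, here is a minimal numpy/scipy sketch (not the authors' implementation) of cluster assignment with a centroid matrix expressed as a product of sparse factors; the factors are drawn at random for illustration, whereas QuicK-means learns them during Lloyd's algorithm, and all sizes and densities are assumptions:

        import numpy as np
        import scipy.sparse as sp

        K = D = 64                     # number of centroids / data dimension
        rng = np.random.default_rng(0)

        # Toy stand-in for the learned factorization: the product of a few
        # sparse factors plays the role of the K x D centroid matrix U (in
        # the paper the factors are learned, not drawn at random).
        factors = [sp.random(K, D, density=0.05, format="csr", random_state=i)
                   for i in range(3)]
        U = factors[0] @ factors[1] @ factors[2]            # K x D product
        sq_norms = np.asarray(U.multiply(U).sum(axis=1)).ravel()  # ||u_k||^2

        def assign(x):
            # U @ x evaluated factor by factor: each product costs O(nnz) of
            # that factor instead of the O(K * D) of a dense matrix-vector
            # product, which is where the claimed speed-up comes from.
            y = x
            for S in reversed(factors):
                y = S @ y
            # standard K-means assignment: argmin_k ||u_k||^2 - 2 u_k.x
            return int(np.argmin(sq_norms - 2.0 * y))

        cluster = assign(rng.normal(size=D))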

    Compressive Clustering with an Optical Processing Unit

    We explore the use of Optical Processing Units (OPUs) to compute random Fourier features for sketching, and adapt the overall compressive clustering pipeline to this setting. We also propose tools to help tune a critical hyper-parameter of compressive clustering, namely the scale of the sketch, through a novel strategy for choosing it efficiently.
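
    For context, a minimal numpy sketch of the random Fourier feature sketching step that the OPU computes optically in this work; the dataset, sketch size, and the name sigma for the scale hyper-parameter are illustrative assumptions:

        import numpy as np

        rng = np.random.default_rng(0)
        n, d, m = 10_000, 8, 256           # points, dimension, sketch size
        X = rng.normal(size=(n, d))        # toy dataset

        sigma = 1.0                        # frequency scale: the critical
                                           # hyper-parameter tuned in the paper
        W = rng.normal(scale=1.0 / sigma, size=(d, m))   # random frequencies

        # Empirical sketch: average of complex exponentials of the projected
        # data. The whole dataset is compressed into one length-m complex
        # vector, from which cluster centers can later be recovered (e.g.
        # with a decoder such as CL-OMPR).
        sketch = np.exp(1j * (X @ W)).mean(axis=0)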

    Deepström: An emulsion of kernels and deep learning

    Models based on kernel methods and on deep learning have essentially been studied separately until now. Recent work has focused on combining these two approaches in order to draw on the best of each. With this in mind, we introduce a new neural network architecture that benefits from the low space and time cost of the Nyström approximation. We show that this architecture reaches state-of-the-art performance on image classification on the MNIST and CIFAR10 datasets while requiring only a reduced number of parameters.
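
    A minimal numpy sketch of the Nyström feature map such an architecture builds on (not the paper's code); the Gaussian kernel, the landmark choice, and all sizes are assumptions:

        import numpy as np

        def rbf(A, B, gamma=0.1):
            # Gaussian kernel matrix between the rows of A and B
            sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
            return np.exp(-gamma * sq)

        rng = np.random.default_rng(0)
        X = rng.normal(size=(100, 32))   # toy inputs (e.g. conv features)
        landmarks = X[:10]               # m landmarks: few parameters to learn

        K_mm = rbf(landmarks, landmarks)
        # Inverse square root of the landmark Gram matrix (jitter for safety)
        w, V = np.linalg.eigh(K_mm + 1e-6 * np.eye(len(landmarks)))
        K_inv_sqrt = V @ np.diag(1.0 / np.sqrt(w)) @ V.T

        # Nystrom features: phi(X) approximates the kernel feature map, so a
        # linear classifier on phi(X) approximates a kernel machine.
        phi = rbf(X, landmarks) @ K_inv_sqrt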

    Sparse approximations and kernel methods for machine learning model compression

    This thesis aims at studying and experimentally validating the benefits, in terms of the amount of computation and data needed, that kernel methods and sparse approximation methods can bring to existing machine learning algorithms. In the first part of this thesis, we propose a new type of neural architecture that uses a kernel function to reduce the number of learnable parameters, thus making it robust to overfitting in a regime where few labeled observations are available. In the second part of this thesis, we seek to reduce the complexity of existing machine learning models by including sparse approximations. First, we propose an alternative to the K-means algorithm that speeds up the inference phase by expressing the centroids as a product of sparse matrices. In addition to the convergence guarantees of the proposed algorithm, we provide an experimental validation of both the quality of the centroids thus expressed and their benefit in terms of computational cost. Then, we explore the compression of neural networks by replacing the matrices that constitute their layers with sparse matrix products. Finally, we hijack the Orthogonal Matching Pursuit (OMP) sparse approximation algorithm to make a weighted selection of decision trees from a random forest; we analyze the effect of the weights obtained and propose a non-negative alternative to the method that outperforms all other tree selection techniques considered on a large panel of data sets.
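
    As an illustration of the last contribution, a minimal scikit-learn sketch of weighting the trees of a random forest with OMP; the dataset and the number of selected trees are assumptions, and the thesis's non-negative variant is only hinted at in a comment:

        import numpy as np
        from sklearn.datasets import make_regression
        from sklearn.ensemble import RandomForestRegressor
        from sklearn.linear_model import OrthogonalMatchingPursuit

        X, y = make_regression(n_samples=500, n_features=20, random_state=0)
        forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

        # Dictionary: one column per tree, holding its predictions on the data
        D = np.column_stack([t.predict(X) for t in forest.estimators_])

        # Keep only 10 trees, with weights chosen by OMP; the non-negative
        # variant proposed in the thesis would constrain the weights to >= 0.
        omp = OrthogonalMatchingPursuit(n_nonzero_coefs=10).fit(D, y)
        weights = omp.coef_
        kept = np.flatnonzero(weights)          # indices of selected trees
        pred = D @ weights + omp.intercept_     # weighted-forest prediction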

    Deep Networks with Adaptive Nyström Approximation

    Recent work has focused on combining kernel methods and deep learning to exploit the best of the two approaches. Here, we introduce a new neural network architecture in which we replace the top dense layers of standard convolutional architectures with an approximation of a kernel function based on the Nyström approximation. Our approach is simple and highly flexible: it is compatible with any kernel function and allows exploiting multiple kernels. We show that our architecture reaches the same performance as standard architectures on datasets such as SVHN and CIFAR100. One benefit of the method lies in its limited number of learnable parameters, which makes it particularly suited to small training sets, e.g. from 5 to 20 samples per class.
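
    Since the abstract stresses compatibility with multiple kernels, here is a minimal self-contained numpy sketch of one plausible way to exploit them, concatenating one Nyström feature block per kernel; the concatenation scheme, the kernels, and all sizes are assumptions rather than the paper's exact design:

        import numpy as np

        rng = np.random.default_rng(0)
        X = rng.normal(size=(100, 32))        # toy convolutional features
        landmarks = X[:10]                    # landmark points shared by kernels

        def nystrom(X, L, kernel):
            # Nystrom feature map for one kernel: k(X, L) @ k(L, L)^(-1/2)
            K_mm = kernel(L, L)
            w, V = np.linalg.eigh(K_mm + 1e-6 * np.eye(len(L)))
            return kernel(X, L) @ (V @ np.diag(1.0 / np.sqrt(w)) @ V.T)

        rbf = lambda A, B: np.exp(-0.1 * ((A[:, None] - B[None]) ** 2).sum(-1))
        linear = lambda A, B: A @ B.T

        # One Nystrom block per kernel, concatenated before the final
        # classification layer -- one plausible reading of "multiple kernels".
        phi_multi = np.hstack([nystrom(X, landmarks, k) for k in (rbf, linear)])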