
    QuicK-means: Acceleration of K-means by learning a fast transform

    K-means -- and the celebrated Lloyd algorithm -- is more than the clustering method it was originally designed to be. It has indeed proven pivotal in speeding up many machine learning and data analysis techniques such as indexing, nearest-neighbor search and prediction, data compression, and Radial Basis Function networks; its beneficial use has been shown to carry over to the acceleration of kernel machines (when using the Nyström method). Here, we propose a fast extension of K-means, dubbed QuicK-means, that rests on the idea of expressing the matrix of the $K$ centroids as a product of sparse matrices, a feat made possible by recent results devoted to finding approximations of matrices as products of sparse factors. Such a decomposition squashes the complexity of the matrix-vector product between the factorized $K \times D$ centroid matrix $\mathbf{U}$ and any vector from $\mathcal{O}(KD)$ to $\mathcal{O}(A \log A + B)$, with $A = \min(K, D)$ and $B = \max(K, D)$, where $D$ is the dimension of the training data. This drastic computational saving has a direct impact on the assignment of a point to a cluster, meaning that it is tangible not only at prediction time but also at training time, provided the factorization procedure is performed during Lloyd's algorithm. We show that resorting to a factorization step at each iteration does not impair the convergence of the optimization scheme and that, depending on the context, it may entail a reduction of the training time. Finally, we provide discussions and numerical simulations that show the versatility of our computationally-efficient QuicK-means algorithm.
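
    To make the core idea concrete, here is a minimal numpy/scipy sketch (not the authors' implementation) of cluster assignment with a centroid matrix expressed as a product of sparse factors; the factors are drawn at random for illustration, whereas QuicK-means learns them during Lloyd's algorithm, and all sizes and densities are assumptions:

        import numpy as np
        import scipy.sparse as sp

        K = D = 64                     # number of centroids / data dimension
        rng = np.random.default_rng(0)

        # Toy stand-in for the learned factorization: the product of a few
        # sparse factors plays the role of the K x D centroid matrix U (in
        # the paper the factors are learned, not drawn at random).
        factors = [sp.random(K, D, density=0.05, format="csr", random_state=i)
                   for i in range(3)]
        U = factors[0] @ factors[1] @ factors[2]            # K x D product
        sq_norms = np.asarray(U.multiply(U).sum(axis=1)).ravel()  # ||u_k||^2

        def assign(x):
            # U @ x evaluated factor by factor: each product costs O(nnz) of
            # that factor instead of the O(K * D) of a dense matrix-vector
            # product, which is where the claimed speed-up comes from.
            y = x
            for S in reversed(factors):
                y = S @ y
            # standard K-means assignment: argmin_k ||u_k||^2 - 2 u_k.x
            return int(np.argmin(sq_norms - 2.0 * y))

        cluster = assign(rng.normal(size=D))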

    Compressive Clustering with an Optical Processing Unit

    We explore the use of Optical Processing Units (OPUs) to compute random Fourier features for sketching, and adapt the overall compressive clustering pipeline to this setting. We also propose tools to help tune a critical hyper-parameter of compressive clustering, namely the scale of the sketch, through a novel strategy for choosing it efficiently.
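
    For context, a minimal numpy sketch of the random Fourier feature sketching step that the OPU computes optically in this work; the dataset, sketch size, and the name sigma for the scale hyper-parameter are illustrative assumptions:

        import numpy as np

        rng = np.random.default_rng(0)
        n, d, m = 10_000, 8, 256           # points, dimension, sketch size
        X = rng.normal(size=(n, d))        # toy dataset

        sigma = 1.0                        # frequency scale: the critical
                                           # hyper-parameter tuned in the paper
        W = rng.normal(scale=1.0 / sigma, size=(d, m))   # random frequencies

        # Empirical sketch: average of complex exponentials of the projected
        # data. The whole dataset is compressed into one length-m complex
        # vector, from which cluster centers can later be recovered (e.g.
        # with a decoder such as CL-OMPR).
        sketch = np.exp(1j * (X @ W)).mean(axis=0)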

    Deepström: An emulsion of kernels and deep learning

    Models based on kernel methods and on deep learning have essentially been studied separately until now. Recent work has focused on combining these two approaches in order to draw on the best of each. With this in mind, we introduce a new neural network architecture that benefits from the low space and time cost of the Nyström approximation. We show that this architecture reaches state-of-the-art performance on image classification on the MNIST and CIFAR10 datasets while requiring only a reduced number of parameters.
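
    A minimal numpy sketch of the Nyström feature map such an architecture builds on (not the paper's code); the Gaussian kernel, the landmark choice, and all sizes are assumptions:

        import numpy as np

        def rbf(A, B, gamma=0.1):
            # Gaussian kernel matrix between the rows of A and B
            sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
            return np.exp(-gamma * sq)

        rng = np.random.default_rng(0)
        X = rng.normal(size=(100, 32))   # toy inputs (e.g. conv features)
        landmarks = X[:10]               # m landmarks: few parameters to learn

        K_mm = rbf(landmarks, landmarks)
        # Inverse square root of the landmark Gram matrix (jitter for safety)
        w, V = np.linalg.eigh(K_mm + 1e-6 * np.eye(len(landmarks)))
        K_inv_sqrt = V @ np.diag(1.0 / np.sqrt(w)) @ V.T

        # Nystrom features: phi(X) approximates the kernel feature map, so a
        # linear classifier on phi(X) approximates a kernel machine.
        phi = rbf(X, landmarks) @ K_inv_sqrt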

    Sparse approximations and kernel methods for machine learning model compression

    This thesis aims at studying and experimentally validating the benefits, in terms of the amount of computation and data needed, that kernel methods and sparse approximation methods can bring to existing machine learning algorithms. In the first part of this thesis, we propose a new type of neural architecture that uses a kernel function to reduce the number of learnable parameters, thus making it robust to overfitting in a regime where few labeled observations are available. In the second part of this thesis, we seek to reduce the complexity of existing machine learning models by including sparse approximations. First, we propose an alternative to the K-means algorithm that speeds up the inference phase by expressing the centroids as a product of sparse matrices. In addition to the convergence guarantees of the proposed algorithm, we provide an experimental validation of both the quality of the centroids thus expressed and their benefit in terms of computational cost. Then, we explore the compression of neural networks by replacing the matrices that constitute their layers with sparse matrix products. Finally, we hijack the Orthogonal Matching Pursuit (OMP) sparse approximation algorithm to make a weighted selection of decision trees from a random forest; we analyze the effect of the weights obtained and propose a non-negative alternative to the method that outperforms all other tree selection techniques considered on a large panel of data sets.
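
    As an illustration of the last contribution, a minimal scikit-learn sketch of weighting the trees of a random forest with OMP; the dataset and the number of selected trees are assumptions, and the thesis's non-negative variant is only hinted at in a comment:

        import numpy as np
        from sklearn.datasets import make_regression
        from sklearn.ensemble import RandomForestRegressor
        from sklearn.linear_model import OrthogonalMatchingPursuit

        X, y = make_regression(n_samples=500, n_features=20, random_state=0)
        forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

        # Dictionary: one column per tree, holding its predictions on the data
        D = np.column_stack([t.predict(X) for t in forest.estimators_])

        # Keep only 10 trees, with weights chosen by OMP; the non-negative
        # variant proposed in the thesis would constrain the weights to >= 0.
        omp = OrthogonalMatchingPursuit(n_nonzero_coefs=10).fit(D, y)
        weights = omp.coef_
        kept = np.flatnonzero(weights)          # indices of selected trees
        pred = D @ weights + omp.intercept_     # weighted-forest prediction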

    Deep Networks with Adaptive Nyström Approximation

    Recent work has focused on combining kernel methods and deep learning to exploit the best of the two approaches. Here, we introduce a new neural network architecture in which we replace the top dense layers of standard convolutional architectures with an approximation of a kernel function based on the Nyström approximation. Our approach is simple and highly flexible: it is compatible with any kernel function and allows exploiting multiple kernels. We show that our architecture reaches the same performance as standard architectures on datasets such as SVHN and CIFAR100. One benefit of the method lies in its limited number of learnable parameters, which makes it particularly suited to small training sets, e.g. from 5 to 20 samples per class.
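
    Since the abstract stresses compatibility with multiple kernels, here is a minimal self-contained numpy sketch of one plausible way to exploit them, concatenating one Nyström feature block per kernel; the concatenation scheme, the kernels, and all sizes are assumptions rather than the paper's exact design:

        import numpy as np

        rng = np.random.default_rng(0)
        X = rng.normal(size=(100, 32))        # toy convolutional features
        landmarks = X[:10]                    # landmark points shared by kernels

        def nystrom(X, L, kernel):
            # Nystrom feature map for one kernel: k(X, L) @ k(L, L)^(-1/2)
            K_mm = kernel(L, L)
            w, V = np.linalg.eigh(K_mm + 1e-6 * np.eye(len(L)))
            return kernel(X, L) @ (V @ np.diag(1.0 / np.sqrt(w)) @ V.T)

        rbf = lambda A, B: np.exp(-0.1 * ((A[:, None] - B[None]) ** 2).sum(-1))
        linear = lambda A, B: A @ B.T

        # One Nystrom block per kernel, concatenated before the final
        # classification layer -- one plausible reading of "multiple kernels".
        phi_multi = np.hstack([nystrom(X, landmarks, k) for k in (rbf, linear)])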