    A study of the classification of low-dimensional data with supervised manifold learning

    Supervised manifold learning methods learn data representations by preserving the geometric structure of data while enhancing the separation between data samples from different classes. In this work, we present a theoretical study of supervised manifold learning for classification. We consider nonlinear dimensionality reduction algorithms that yield linearly separable embeddings of training data and present generalization bounds for algorithms of this type. A necessary condition for satisfactory generalization performance is that the embedding allow the construction of an interpolation function that is sufficiently regular relative to the separation margin of the embedding. We show that for supervised embeddings satisfying this condition, the classification error decays at an exponential rate with the number of training samples. Finally, we examine the separability of supervised nonlinear embeddings that aim to preserve the low-dimensional geometric structure of data based on graph representations. The proposed analysis is supported by experiments on several real data sets.

    Spectral Dimensionality Reduction

    In this paper, we study and put under a common framework a number of non-linear dimensionality reduction methods, such as Locally Linear Embedding, Isomap, Laplacian Eigenmaps and kernel PCA, which are based on performing an eigen-decomposition (hence the name 'spectral'). That framework also includes classical methods such as PCA and metric multidimensional scaling (MDS), as well as the data transformation step used in spectral clustering. We show that in all of these cases the learning algorithm estimates the principal eigenfunctions of an operator that depends on the unknown data density and on a kernel that is not necessarily positive semi-definite. This helps to generalize some of these algorithms so as to predict an embedding for out-of-sample examples without having to retrain the model. It also makes more transparent what these algorithms are minimizing on the empirical data and gives a corresponding notion of generalization error.

    Keywords: non-parametric models, non-linear dimensionality reduction, kernel models
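The out-of-sample idea in the abstract above can be illustrated with a minimal sketch for one of the listed methods, kernel PCA. This is not the paper's own code or formulation: it is a standard Nyström-style extension, and the function names (`rbf_kernel`, `fit_kernel_pca`, `embed`), the RBF kernel, and the `gamma` parameter are assumptions chosen for illustration. The RBF kernel is positive semi-definite, so the sketch does not cover the more general kernels the abstract mentions.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise Gaussian kernel between rows of A and rows of B (assumed kernel choice).
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_kernel_pca(X, n_components=2, gamma=1.0):
    # Eigen-decompose the double-centered Gram matrix of the training data.
    K = rbf_kernel(X, X, gamma)
    col_mean = K.mean(axis=0)          # mean of each column over training points
    total_mean = K.mean()
    Kc = K - col_mean[None, :] - col_mean[:, None] + total_mean
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:n_components]   # largest eigenvalues first
    return {
        "X": X, "gamma": gamma,
        "col_mean": col_mean, "total_mean": total_mean,
        "lam": eigvals[order], "V": eigvecs[:, order],
    }

def embed(model, X_new):
    # Nystrom-style extension: center the new kernel rows with the training
    # statistics, then project onto the stored eigenvectors. No retraining.
    k = rbf_kernel(X_new, model["X"], model["gamma"])
    kc = (k - k.mean(axis=1, keepdims=True)
            - model["col_mean"][None, :] + model["total_mean"])
    # Assumes the retained eigenvalues are strictly positive.
    return kc @ model["V"] / np.sqrt(model["lam"])

rng = np.random.default_rng(0)
X_train = rng.normal(size=(30, 3))
model = fit_kernel_pca(X_train, n_components=2, gamma=0.5)
Y_new = embed(model, rng.normal(size=(5, 3)))   # embedding for unseen points
```

Applied to the training points themselves, `embed` reproduces the usual kernel PCA coordinates (each eigenvector scaled by the square root of its eigenvalue), which is a quick consistency check for the extension.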