Manifold Parzen Windows
The similarity between objects is a fundamental element of many learning algorithms. Most non-parametric methods take this similarity to be fixed, but much recent work has shown the advantages of learning it, in particular to exploit the local invariances in the data or to capture the possibly non-linear manifold on which most of the data lies. We propose a new non-parametric kernel density estimation method which captures the local structure of an underlying manifold through the leading eigenvectors of regularized local covariance matrices. Experiments in density estimation show significant improvements over Parzen density estimators. The density estimators can also be used within Bayes classifiers, yielding classification rates similar to those of SVMs and much superior to the Parzen classifier.
Keywords: density estimation, non-parametric models, manifold models, probabilistic classifiers
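As a rough illustration of the idea in this abstract, here is a minimal numpy sketch (our own, not the authors' code; the function name and parameter choices are assumptions) of a density estimator whose per-point Gaussian kernels keep the leading eigen-directions of a regularized local covariance:

```python
import numpy as np

def manifold_parzen_logpdf(X, x, k=10, d=1, sigma2=0.1):
    """Hypothetical sketch of a manifold-style Parzen estimator.

    Each training point contributes a Gaussian whose covariance keeps the
    leading d eigen-directions of a local (k-nearest-neighbour) covariance
    and regularises every other direction to sigma2, so the kernel is
    'flattened' along the estimated manifold.
    """
    n, D = X.shape
    log_comps = []
    for i in range(n):
        diffs = X - X[i]
        idx = np.argsort((diffs ** 2).sum(axis=1))[1:k + 1]  # k nearest neighbours (skip self)
        C = diffs[idx].T @ diffs[idx] / k                    # local covariance estimate
        w, V = np.linalg.eigh(C)                             # eigenvalues in ascending order
        lam = np.full(D, sigma2)                             # regularisation off the manifold
        lam[D - d:] = w[-d:] + sigma2                        # regularised leading directions
        z = V.T @ (x - X[i])                                 # rotate into the eigenbasis
        quad = np.sum(z ** 2 / lam)                          # Mahalanobis distance
        logdet = np.sum(np.log(lam))
        log_comps.append(-0.5 * (quad + logdet + D * np.log(2 * np.pi)))
    m = max(log_comps)                                       # log-sum-exp for stability
    return m + np.log(np.mean(np.exp(np.array(log_comps) - m)))
```

With data lying near a one-dimensional manifold, this estimator assigns much higher density to points on the manifold than to points the same kernel-width away from it, which is the behaviour a spherical Parzen estimator cannot capture.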
Local Component Analysis
Kernel density estimation, a.k.a. Parzen windows, is a popular density
estimation method, which can be used for outlier detection or clustering. With
multivariate data, its performance is heavily reliant on the metric used within
the kernel. Most earlier work has focused on learning only the bandwidth of the
kernel (i.e., a scalar multiplicative factor). In this paper, we propose to
learn a full Euclidean metric through an expectation-minimization (EM)
procedure, which can be seen as an unsupervised counterpart to neighbourhood
component analysis (NCA). In order to avoid overfitting with a fully
nonparametric density estimator in high dimensions, we also consider a
semi-parametric Gaussian-Parzen density model, where some of the variables are
modelled through a jointly Gaussian density, while others are modelled through
Parzen windows. For these two models, EM leads to simple closed-form updates
based on matrix inversions and eigenvalue decompositions. We show empirically
that our method leads to density estimators with higher test-likelihoods than
natural competing methods, and that the metrics may be used within most
unsupervised learning techniques that rely on such metrics, such as spectral
clustering or manifold learning methods. Finally, we present a stochastic
approximation scheme which allows for the use of this method in a large-scale
setting.
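The abstract's central object, a Gaussian Parzen estimator with a full learned covariance fitted by alternating closed-form updates, can be sketched as follows (a hedged illustration under our own simplifying assumptions, not the paper's implementation; the responsibility-weighted update is our reading of the EM idea described above):

```python
import numpy as np

def learn_parzen_metric(X, n_iter=20, reg=1e-6):
    """Sketch of an EM-style fit of a full kernel covariance M for a
    Gaussian Parzen estimator, driven by the leave-one-out likelihood.

    E-step: responsibilities p(j | i) of kernel j for held-out point i.
    M-step: responsibility-weighted covariance of pairwise differences.
    """
    n, D = X.shape
    diffs = X[:, None, :] - X[None, :, :]          # pairwise differences, shape (n, n, D)
    M = np.cov(X.T) + reg * np.eye(D)              # initialise with the data covariance
    for _ in range(n_iter):
        Minv = np.linalg.inv(M)
        quad = np.einsum('ijd,de,ije->ij', diffs, Minv, diffs)
        logk = -0.5 * quad                         # log Gaussian kernel (up to constants)
        np.fill_diagonal(logk, -np.inf)            # leave-one-out: a point never explains itself
        logk -= logk.max(axis=1, keepdims=True)    # stabilise before exponentiating
        R = np.exp(logk)
        R /= R.sum(axis=1, keepdims=True)          # responsibilities p(j | i)
        # closed-form update: weighted covariance of differences, lightly regularised
        M = np.einsum('ij,ijd,ije->de', R, diffs, diffs) / n + reg * np.eye(D)
    return M
```

On anisotropic data the learned M stretches along the directions where neighbours are far apart, which is exactly the adaptivity that a single scalar bandwidth cannot provide.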