Out-of-sample generalizations for supervised manifold learning for classification
Supervised manifold learning methods for data classification map data samples
residing in a high-dimensional ambient space to a lower-dimensional domain in a
structure-preserving way, while enhancing the separation between different
classes in the learned embedding. Most nonlinear supervised manifold learning
methods compute the embedding of the manifolds only at the initially available
training points, while the generalization of the embedding to novel points,
known as the out-of-sample extension problem in manifold learning, becomes
especially important in classification applications. In this work, we propose a
semi-supervised method for building an interpolation function that provides an
out-of-sample extension for general supervised manifold learning algorithms
studied in the context of classification. The proposed algorithm computes a
radial basis function (RBF) interpolator that minimizes an objective function
consisting of the total embedding error of unlabeled test samples, defined as
their distance to the embeddings of the manifolds of their own class, as well
as a regularization term that controls the smoothness of the interpolation
function in a direction-dependent way. The class labels of test data and the
interpolation function parameters are estimated jointly with a progressive
procedure. Experimental results on face and object images demonstrate the
potential of the proposed out-of-sample extension algorithm for the
classification of manifold-modeled data sets.
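As a rough illustration of the out-of-sample idea in the abstract above, the sketch below fits a plain Gaussian RBF interpolator to training points and their embedding coordinates, then evaluates it at a new point. It is not the proposed algorithm: the semi-supervised objective, the direction-dependent regularizer, and the joint label estimation are all omitted. The function names, the kernel width `sigma`, and the small `ridge` term are assumptions made for this example.

```python
import math

def gaussian_kernel(x, c, sigma):
    """Gaussian RBF value between a point x and a center c."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, c))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def solve(A, B):
    """Solve A X = B by Gaussian elimination with partial pivoting."""
    n, m = len(A), len(B[0])
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + m):
                M[r][k] -= f * M[col][k]
    X = [[0.0] * m for _ in range(n)]
    for i in reversed(range(n)):
        for j in range(m):
            s = M[i][n + j] - sum(M[i][k] * X[k][j] for k in range(i + 1, n))
            X[i][j] = s / M[i][i]
    return X

def fit_rbf(train_x, train_y, sigma=1.0, ridge=1e-8):
    """Return RBF coefficients mapping train_x to embedding coordinates train_y.

    ridge is a small diagonal term (an assumption here) that keeps the
    kernel matrix well conditioned.
    """
    n = len(train_x)
    K = [[gaussian_kernel(train_x[i], train_x[j], sigma) + (ridge if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(K, train_y)

def embed(x, train_x, coeffs, sigma=1.0):
    """Evaluate the fitted interpolator at a (possibly unseen) point x."""
    weights = [gaussian_kernel(x, c, sigma) for c in train_x]
    dim = len(coeffs[0])
    return [sum(w * c[j] for w, c in zip(weights, coeffs)) for j in range(dim)]

# Toy data: four 2-D points with a 1-D embedding coordinate each.
train_x = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
train_y = [[0.0], [1.0], [1.0], [2.0]]
coeffs = fit_rbf(train_x, train_y)
new_point_embedding = embed([0.5, 0.5], train_x, coeffs)
```

A new sample could then be assigned to the class whose training embeddings lie closest to `new_point_embedding`; that nearest-class rule is likewise only one possible choice.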
Are screening methods useful in feature selection? An empirical study
Filter or screening methods are often used as a preprocessing step for
reducing the number of variables used by a learning algorithm in obtaining a
classification or regression model. While there are many such filter methods,
there is a need for an objective evaluation of these methods. Such an
evaluation is needed to compare them with each other and also to determine whether
they are useful at all, or whether a learning algorithm could do a better job without
them. For this purpose, many popular screening methods are partnered in this
paper with three regression learners and five classification learners and
evaluated on ten real datasets to obtain accuracy criteria such as R-square and
area under the ROC curve (AUC). The obtained results are compared through curve
plots and comparison tables in order to find out whether screening methods help
improve the performance of learning algorithms and how they compare with one
another. Our findings revealed that the screening methods were useful in
improving the prediction of the best learner on two regression and two
classification datasets out of the ten datasets evaluated.
Comment: 29 pages, 4 figures, 21 tables
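To make the notion of a filter (screening) step concrete, the sketch below ranks features by the absolute Pearson correlation with the target and keeps the top `k`. The abstract does not say which screening methods were evaluated, so correlation ranking is an assumed stand-in, and the names `pearson` and `screen_features` are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def screen_features(rows, target, k):
    """Return the indices of the k features most correlated with the target."""
    n_features = len(rows[0])
    scores = [abs(pearson([r[j] for r in rows], target)) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

# Toy data: feature 0 tracks the target, feature 1 is constant, feature 2 alternates.
rows = [[2, 5, 1], [4, 5, -1], [6, 5, 1], [8, 5, -1]]
target = [1, 2, 3, 4]
kept = screen_features(rows, target, k=2)
```

The selected columns would then be passed to the downstream learner, and its accuracy compared against the same learner trained on all features.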
Nonlinear Supervised Dimensionality Reduction via Smooth Regular Embeddings
The recovery of the intrinsic geometric structures of data collections is an
important problem in data analysis. Supervised extensions of several manifold
learning approaches have been proposed in recent years. However, existing
methods primarily focus on the embedding of the training data, and the
generalization of the embedding to initially unseen test data is largely
ignored. In this work, we build on recent theoretical results on the
generalization performance of supervised manifold learning algorithms.
Motivated by these performance bounds, we propose a supervised manifold
learning method that computes a nonlinear embedding while constructing a smooth
and regular interpolation function that extends the embedding to the whole data
space in order to achieve satisfactory generalization. The embedding and the
interpolator are jointly learnt such that the Lipschitz regularity of the
interpolator is imposed while ensuring the separation between different
classes. Experimental results on several image data sets show that the proposed
method outperforms traditional classifiers and the compared supervised
dimensionality reduction algorithms in terms of classification accuracy in most
settings.
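The Lipschitz regularity mentioned in the abstract above bounds how fast the interpolator can change: ||f(x) - f(y)|| <= L ||x - y||. The sketch below is only a diagnostic, not the proposed learning method. It computes an empirical lower bound on L from sample pairs; the function name `empirical_lipschitz` is hypothetical.

```python
import math
from itertools import combinations

def empirical_lipschitz(points, images):
    """Largest ratio ||f(x) - f(y)|| / ||x - y|| over all sample pairs.

    points are inputs and images are their mapped values; the result is a
    lower bound on the true Lipschitz constant of the mapping.
    """
    best = 0.0
    for i, j in combinations(range(len(points)), 2):
        dx = math.dist(points[i], points[j])
        if dx == 0.0:
            continue  # skip duplicate inputs
        dy = math.dist(images[i], images[j])
        best = max(best, dy / dx)
    return best

# Toy check: scaling every point by 3 gives a ratio of 3 for every pair.
points = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
images = [[3.0 * a for a in p] for p in points]
estimate = empirical_lipschitz(points, images)
```

A small estimate suggests that nearby inputs are mapped to nearby embeddings, which is the kind of smoothness the abstract associates with good generalization.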