Convex Formulation for Kernel PCA and its Use in Semi-Supervised Learning
© 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
In this brief, kernel principal component analysis (KPCA) is reinterpreted as the solution to a convex optimization problem. Actually, there is a constrained convex problem for each principal component, so that the constraints guarantee that the principal component is indeed a solution, and not a mere saddle point. Although these insights do not imply any algorithmic improvement, they can be used to further understand the method, formulate possible extensions, and properly address them. As an example, a new convex optimization problem for semisupervised classification is proposed, which seems particularly well suited whenever the number of known labels is small. Our formulation resembles a least squares support vector machine problem with a regularization parameter multiplied by a negative sign, combined with a variational principle for KPCA. Our primal optimization principle for semisupervised learning is solved in terms of the Lagrange multipliers. Numerical experiments in several classification tasks illustrate the performance of the proposed model in problems with only a few labeled data.
The authors thank the following organizations. • EU: The research leading
to these results has received funding from the European Research Council
under the European Union’s Seventh Framework Programme (FP7/2007-2013)
/ ERC AdG A-DATADRIVE-B (290923). This paper reflects only the authors’
views, the Union is not liable for any use that may be made of the contained
information. • Research Council KUL: GOA/10/09 MaNet, CoE PFV/10/002
(OPTEC), BIL12/11T; PhD/Postdoc grants. • Flemish Government: – FWO:
G.0377.12 (Structured systems), G.088114N (Tensor based data similarity);
PhD/Postdoc grants. – IWT: SBO POM (100031); PhD/Postdoc grants.
• iMinds Medical Information Technologies SBO 2014. • Belgian Federal
Science Policy Office: IUAP P7/19 (DYSCO, Dynamical systems, control
and optimization, 2012-2017).
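
For orientation, the classical eigenvalue route to KPCA can be sketched in a few lines. This is a minimal illustration only, not the convex formulation proposed in the brief: the Gaussian kernel, the `gamma` value, the toy data, and the use of power iteration are all assumptions made for the example.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def centered_gram(points, gamma=1.0):
    """Gram matrix centered in feature space: Kc = K - 1K - K1 + 1K1."""
    n = len(points)
    K = [[rbf_kernel(p, q, gamma) for q in points] for p in points]
    row_mean = [sum(row) / n for row in K]  # K is symmetric, so these are also column means
    total_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]

def leading_component(Kc, iters=200):
    """Leading eigenpair of the symmetric PSD matrix Kc via power iteration."""
    n = len(Kc)
    v = [float(i + 1) for i in range(n)]  # deterministic, non-constant start
    for _ in range(iters):
        w = [sum(Kc[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * Kc[i][j] * v[j] for i in range(n) for j in range(n))
    return eigenvalue, v

# Two well-separated 1-D clusters (hypothetical toy data).
points = [[0.0], [0.1], [0.2], [3.0], [3.1], [3.2]]
Kc = centered_gram(points)
eigenvalue, v = leading_component(Kc)
# Projections of the training points onto the first kernel principal component.
scores = [math.sqrt(eigenvalue) * x for x in v]
```

On this toy data the sign of `scores` separates the two clusters, which is the kind of structure a semisupervised method can then combine with a few known labels.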
Semi-Supervised Kernel PCA
We present three generalisations of Kernel Principal Components Analysis
(KPCA) which incorporate knowledge of the class labels of a subset of the data
points. The first, MV-KPCA, penalises within-class variances, similarly to Fisher
discriminant analysis. The second, LSKPCA, is a hybrid of least squares
regression and kernel PCA. The last, LR-KPCA, is an iteratively reweighted
version of the previous method which achieves a sigmoid loss function on the labeled
points. We provide a theoretical risk bound as well as illustrative experiments
on real and toy data sets.
Task-Driven Dictionary Learning
Modeling data with linear combinations of a few elements from a learned
dictionary has been the focus of much recent research in machine learning,
neuroscience and signal processing. For signals such as natural images that
admit such sparse representations, it is now well established that these models
are well suited to restoration tasks. In this context, learning the dictionary
amounts to solving a large-scale matrix factorization problem, which can be
done efficiently with classical optimization tools. The same approach has also
been used for learning features from data for other purposes, e.g., image
classification, but tuning the dictionary in a supervised way for these tasks
has proven to be more difficult. In this paper, we present a general
formulation for supervised dictionary learning adapted to a wide variety of
tasks, and present an efficient algorithm for solving the corresponding
optimization problem. Experiments on handwritten digit classification, digital
art identification, nonlinear inverse image problems, and compressed sensing
demonstrate that our approach is effective in large-scale settings, and is well
suited to supervised and semi-supervised classification, as well as regression
tasks for data that admit sparse representations.
Comment: final draft post-refereein
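
The sparse representation this abstract relies on can be illustrated with the sparse-coding step alone. This sketch assumes a fixed, hypothetical dictionary and solves the l1-penalized least squares problem with ISTA (iterative soft thresholding). It is not the task-driven learning algorithm of the paper, and the names `sparse_code`, `atoms`, and `lam` are chosen for the example.

```python
import math

def soft_threshold(values, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in values]

def sparse_code(x, atoms, lam=0.1, iters=1000):
    """ISTA for min_a 0.5 * ||x - sum_k a[k] * atoms[k]||^2 + lam * ||a||_1."""
    # The squared Frobenius norm upper-bounds the Lipschitz constant of the gradient.
    lipschitz = sum(v * v for atom in atoms for v in atom)
    a = [0.0] * len(atoms)
    for _ in range(iters):
        recon = [sum(a[k] * atoms[k][i] for k in range(len(atoms)))
                 for i in range(len(x))]
        residual = [r - xi for r, xi in zip(recon, x)]
        grad = [sum(d * r for d, r in zip(atom, residual)) for atom in atoms]
        a = soft_threshold([ak - g / lipschitz for ak, g in zip(a, grad)],
                           lam / lipschitz)
    return a

# Overcomplete dictionary of three unit-norm atoms in the plane (hypothetical).
atoms = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
signal = [2.0, 0.0]
code = sparse_code(signal, atoms)
```

Because the first atom alone explains `signal`, the penalty drives the other two coefficients to zero. A supervised variant would additionally tune `atoms` so that such codes serve a downstream task.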
Positive Definite Kernels in Machine Learning
This survey is an introduction to positive definite kernels and the set of
methods they have inspired in the machine learning literature, namely kernel
methods. We first discuss some properties of positive definite kernels as well
as reproducing kernel Hilbert spaces, the natural extension of the set of
functions $\{k(x,\cdot),\, x \in \mathcal{X}\}$ associated with a kernel $k$ defined
on a space $\mathcal{X}$. We discuss at length the construction of kernel
functions that take advantage of well-known statistical models. We provide an
overview of numerous data-analysis methods which take advantage of reproducing
kernel Hilbert spaces and discuss the idea of combining several kernels to
improve the performance on certain tasks. We also provide a short cookbook of
different kernels which are particularly useful for certain data-types such as
images, graphs or speech segments.
Comment: draft. corrected a typo in figure
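
The defining property of a positive definite kernel, and the idea of combining kernels, can be checked numerically. The sketch below assumes a Gaussian and a linear kernel with an arbitrary mixing `weight`; it samples random coefficient vectors and evaluates the quadratic form of the Gram matrix. This is a sanity check under those assumptions, not a proof of positive definiteness.

```python
import math
import random

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian kernel, a standard positive definite kernel on R^d."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def linear_kernel(x, y):
    """Inner product, the simplest positive definite kernel on R^d."""
    return sum(a * b for a, b in zip(x, y))

def combined_kernel(x, y, weight=0.7):
    """Convex combination of two positive definite kernels (still positive definite)."""
    return weight * rbf_kernel(x, y) + (1.0 - weight) * linear_kernel(x, y)

def gram_matrix(kernel, points):
    """Matrix of pairwise kernel evaluations."""
    return [[kernel(p, q) for q in points] for p in points]

def quadratic_form(K, c):
    """c^T K c, non-negative for every c when the kernel is positive definite."""
    n = len(c)
    return sum(c[i] * K[i][j] * c[j] for i in range(n) for j in range(n))

random.seed(0)
points = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(8)]
K = gram_matrix(combined_kernel, points)
forms = [quadratic_form(K, [random.gauss(0, 1) for _ in points])
         for _ in range(100)]
```

Every entry of `forms` should be non-negative up to rounding error, since a non-negative weighted sum of positive definite kernels is again positive definite.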
Optimal Transport for Domain Adaptation
Domain adaptation from one data space (or domain) to another is one of the
most challenging tasks of modern data analytics. If the adaptation is done
correctly, models built on a specific data space become more robust when
confronted with data depicting the same semantic concepts (the classes), but
observed by another observation system with its own specificities. Among the
many strategies proposed to adapt a domain to another, finding a common
representation has shown excellent properties: by finding a common
representation for both domains, a single classifier can be effective in both
and use labelled samples from the source domain to predict the unlabelled
samples of the target domain. In this paper, we propose a regularized
unsupervised optimal transportation model to perform the alignment of the
representations in the source and target domains. We learn a transportation
plan matching both PDFs, which constrains labelled samples in the source domain
to remain close during transport. This way, we simultaneously exploit the limited
labelled information in the source domain and the unlabelled distributions observed in
both domains. Experiments on toy and challenging real visual adaptation
examples show the interest of the method, which consistently outperforms
state-of-the-art approaches.
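
A regularized transportation plan between source and target samples can be sketched as follows. The abstract does not specify the regularizer, so this example assumes entropic regularization solved with Sinkhorn iterations, uniform sample weights, and a squared Euclidean cost. It omits the label-based constraint described above, and `sinkhorn`, `barycentric_map`, and `reg` are names chosen for the illustration.

```python
import math

def sinkhorn(source, target, reg=0.5, iters=200):
    """Entropy-regularized transport plan between two uniformly weighted point sets."""
    n, m = len(source), len(target)
    a = [1.0 / n] * n  # source marginal
    b = [1.0 / m] * m  # target marginal
    cost = [[sum((s - t) ** 2 for s, t in zip(xs, xt)) for xt in target]
            for xs in source]
    gibbs = [[math.exp(-c / reg) for c in row] for row in cost]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(gibbs[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(gibbs[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * gibbs[i][j] * v[j] for j in range(m)] for i in range(n)]

def barycentric_map(plan, target):
    """Send each source point to the plan-weighted average of the target points."""
    dim = len(target[0])
    mapped = []
    for row in plan:
        mass = sum(row)
        mapped.append([sum(row[j] * target[j][d] for j in range(len(target))) / mass
                       for d in range(dim)])
    return mapped

# Target domain is the source domain shifted upward (hypothetical toy data).
source = [[0.0, 0.0], [1.0, 0.0]]
target = [[0.0, 2.0], [1.0, 2.0]]
plan = sinkhorn(source, target)
mapped = barycentric_map(plan, target)
```

After mapping, a classifier trained on the transported source samples with their labels can be applied to the target domain. Smaller values of `reg` give a sharper plan but can underflow in `math.exp` for large costs.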