Learning Sets with Separating Kernels
We consider the problem of learning a set from random samples. We show how
relevant geometric and topological properties of a set can be studied
analytically using concepts from the theory of reproducing kernel Hilbert
spaces. A new kind of reproducing kernel, which we call a separating kernel, plays
a crucial role in our study and is analyzed in detail. We prove a new analytic
characterization of the support of a distribution, which naturally leads to a
family of provably consistent regularized learning algorithms, and we discuss
the stability of these methods with respect to random sampling. Numerical
experiments show that the approach is competitive with, and often better than, other
state-of-the-art techniques. Comment: final version
Positive Definite Kernels in Machine Learning
This survey is an introduction to positive definite kernels and the set of
methods they have inspired in the machine learning literature, namely kernel
methods. We first discuss some properties of positive definite kernels as well
as reproducing kernel Hilbert spaces, the natural extension of the set of
functions $\{k(x,\cdot),\ x\in\mathcal{X}\}$ associated with a kernel $k$ defined
on a space $\mathcal{X}$. We discuss at length the construction of kernel
functions that take advantage of well-known statistical models. We provide an
overview of numerous data-analysis methods which take advantage of reproducing
kernel Hilbert spaces and discuss the idea of combining several kernels to
improve the performance on certain tasks. We also provide a short cookbook of
different kernels which are particularly useful for certain data-types such as
images, graphs or speech segments. Comment: draft. Corrected a typo in figure
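As a small illustration of the defining property mentioned in this abstract, the sketch below checks numerically that a kernel is positive definite in the usual sense: its Gram matrix is symmetric and the quadratic form sum_ij c_i c_j k(x_i, x_j) is non-negative for any coefficients c. It also shows the simplest way of combining kernels, namely adding them, which preserves positive definiteness. The function names (`gaussian_kernel`, `linear_kernel`, `sum_kernel`, `gram`, `quadratic_form`) and the sample points are assumptions made for this example and are not taken from the survey.

```python
import math
import random


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel on vectors given as equal-length sequences."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def linear_kernel(x, y):
    """Linear kernel: the ordinary dot product."""
    return sum(a * b for a, b in zip(x, y))


def sum_kernel(x, y):
    """A sum of positive definite kernels is again positive definite."""
    return gaussian_kernel(x, y) + linear_kernel(x, y)


def gram(kernel, points):
    """Gram matrix K[i][j] = kernel(points[i], points[j])."""
    return [[kernel(p, q) for q in points] for p in points]


def quadratic_form(K, c):
    """Compute sum_ij c_i * K[i][j] * c_j."""
    n = len(K)
    return sum(c[i] * K[i][j] * c[j] for i in range(n) for j in range(n))


# Hypothetical sample: ten random points in the plane.
rng = random.Random(0)
points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(10)]
K = gram(sum_kernel, points)

# For a positive definite kernel this value should be non-negative
# (up to floating-point rounding) for every choice of coefficients.
coeffs = [rng.uniform(-1, 1) for _ in points]
print(quadratic_form(K, coeffs))
```

A random check like this can only fail to find a counterexample; it does not prove that a kernel is positive definite.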
Kernel Multivariate Analysis Framework for Supervised Subspace Learning: A Tutorial on Linear and Kernel Multivariate Methods
Feature extraction and dimensionality reduction are important tasks in many
fields of science dealing with signal processing and analysis. The relevance of
these techniques is increasing as current sensory devices are developed with
ever higher resolution, and problems involving multimodal data sources become
more common. A plethora of feature extraction methods are available in the
literature collectively grouped under the field of Multivariate Analysis (MVA).
This paper provides a uniform treatment of several methods: Principal Component
Analysis (PCA), Partial Least Squares (PLS), Canonical Correlation Analysis
(CCA) and Orthonormalized PLS (OPLS), as well as their non-linear extensions
derived by means of the theory of reproducing kernel Hilbert spaces. We also
review their connections to other methods for classification and statistical
dependence estimation, and introduce some recent developments to deal with the
extreme cases of large-scale and low-sized problems. To illustrate the wide
applicability of these methods in both classification and regression problems,
we analyze their performance in a benchmark of publicly available data sets,
and pay special attention to specific real applications involving audio
processing for music genre prediction and hyperspectral satellite images for
Earth and climate monitoring.
HSIC Regularized LTSA
The Hilbert-Schmidt Independence Criterion (HSIC) measures statistical independence between two random variables. Rather than measuring independence directly, HSIC first maps the two random variables into two Reproducing Kernel Hilbert Spaces (RKHS) and then measures the dependence of the kernel-mapped variables using Hilbert-Schmidt (HS) operators between the two RKHS. Since it was first proposed around 2005, HSIC has found wide application in machine learning. In this paper, an HSIC-regularized Local Tangent Space Alignment algorithm (HSIC-LTSA) is proposed. LTSA is a well-known dimensionality reduction algorithm for local homeomorphism preservation. In HSIC-LTSA, the HSIC between the high-dimensional and dimension-reduced data is added to the LTSA objective function as a regularization term. The proposed HSIC-LTSA makes two contributions. First, it achieves both local homeomorphism preservation and global statistical correlation during dimensionality reduction. Second, it offers a new way to apply HSIC: as a regularization term added to other machine learning algorithms. The experimental results presented in this paper show that HSIC-LTSA can achieve better performance than the original LTSA.
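To make the HSIC quantity described in this abstract concrete, here is a minimal sketch of the standard biased empirical estimator, trace(K H L H) / (n - 1)^2, where K and L are Gram matrices of the two samples and H = I - (1/n) 1 1^T is the centering matrix. The Gaussian kernel, the bandwidth `sigma`, the scalar-valued samples, and all function names are assumptions chosen for illustration. The sketch shows only the HSIC term, not the HSIC-LTSA objective or its optimization, which the abstract does not specify.

```python
import math
import random


def gaussian_gram(xs, sigma=1.0):
    """Gram matrix of a Gaussian kernel on scalar samples."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) for b in xs]
            for a in xs]


def center(K):
    """Double centering: returns H K H with H = I - (1/n) 1 1^T."""
    n = len(K)
    row_means = [sum(row) / n for row in K]
    col_means = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_means) / n
    return [[K[i][j] - row_means[i] - col_means[j] + total_mean
             for j in range(n)] for i in range(n)]


def hsic(xs, ys, sigma=1.0):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2."""
    n = len(xs)
    Kc = center(gaussian_gram(xs, sigma))  # H K H
    L = gaussian_gram(ys, sigma)
    # trace(K H L H) equals trace((H K H) L) by cyclicity of the trace.
    trace = sum(Kc[i][j] * L[j][i] for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2


# Hypothetical data: y_dep depends on x nonlinearly, y_ind is drawn independently.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(100)]
y_dep = [v * v for v in x]
y_ind = [rng.gauss(0.0, 1.0) for _ in range(100)]

print("HSIC(x, x^2):        ", hsic(x, y_dep))
print("HSIC(x, independent):", hsic(x, y_ind))
```

Used as a regularizer in the way the abstract describes, a term like `hsic(high_dim, low_dim)` would be combined with the base objective; how the two are weighted is not stated here.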
On Invariance and Selectivity in Representation Learning
We discuss data representations that can be learned automatically from data,
are invariant to transformations, and at the same time selective, in the sense
that two points have the same representation only if one is a
transformation of the other. The mathematical results here sharpen some of the
key claims of i-theory -- a recent theory of feedforward processing in sensory
cortex.