    Learning Sets with Separating Kernels

    We consider the problem of learning a set from random samples. We show how relevant geometric and topological properties of a set can be studied analytically using concepts from the theory of reproducing kernel Hilbert spaces. A new kind of reproducing kernel, which we call a separating kernel, plays a crucial role in our study and is analyzed in detail. We prove a new analytic characterization of the support of a distribution that naturally leads to a family of provably consistent regularized learning algorithms, and we discuss the stability of these methods with respect to random sampling. Numerical experiments show that the approach is competitive with, and often better than, other state-of-the-art techniques.
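
    As a rough illustration of this kind of regularized set learner (a minimal sketch, not the paper's exact algorithm; the Gaussian kernel, regularization parameter, and acceptance threshold are all illustrative assumptions), the snippet below scores a test point by how well k(x,·) is reconstructed from the regularized span of the kernel sections at the samples, and accepts points whose score is small.

```python
# Hedged sketch of a kernel-based set/support estimator. With a normalized
# kernel (k(x, x) = 1), the score below is a proxy for the squared RKHS
# distance from k(x, .) to the regularized span of the sample sections.
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    """Gram matrix k(a, b) = exp(-||a - b||^2 / (2 sigma^2))."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma ** 2))

def fit_set_estimator(X, lam=1e-3, sigma=1.0):
    """Precompute the regularized inverse of the Gram matrix."""
    n = len(X)
    K = gaussian_kernel(X, X, sigma)
    K_inv = np.linalg.inv(K + n * lam * np.eye(n))
    return X, K_inv, sigma

def in_support(model, x, tau=0.5):
    """Accept x if k(x, .) lies close to the span of the sample sections."""
    X, K_inv, sigma = model
    kx = gaussian_kernel(x[None, :], X, sigma)[0]
    score = 1.0 - kx @ K_inv @ kx   # near 0 on the set, near 1 far from it
    return score < tau

# Usage: samples from a noisy ring; the ring is accepted, the hole is not.
rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 200)
X = np.c_[np.cos(theta), np.sin(theta)] + 0.05 * rng.normal(size=(200, 2))
model = fit_set_estimator(X, lam=1e-3, sigma=0.3)
print(in_support(model, np.array([1.0, 0.0])))  # on the ring -> True
print(in_support(model, np.array([0.0, 0.0])))  # center hole -> False
```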

    Positive Definite Kernels in Machine Learning

    This survey is an introduction to positive definite kernels and the set of methods they have inspired in the machine learning literature, namely kernel methods. We first discuss some properties of positive definite kernels as well as reproducing kernel Hilbert spaces, the natural extension of the set of functions $\{k(x,\cdot),\, x \in \mathcal{X}\}$ associated with a kernel $k$ defined on a space $\mathcal{X}$. We discuss at length the construction of kernel functions that take advantage of well-known statistical models. We provide an overview of numerous data-analysis methods that take advantage of reproducing kernel Hilbert spaces, and discuss the idea of combining several kernels to improve performance on certain tasks. We also provide a short cookbook of kernels that are particularly useful for certain data types such as images, graphs, or speech segments.
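
    As a small illustration of the objects the survey discusses (illustrative only; the kernel choices and parameters below are assumptions, not from the paper), the sketch builds two positive definite kernels, combines them convexly, which again yields a positive definite kernel, and evaluates a function k(x,·) from the associated RKHS.

```python
# Hedged sketch: positive definite kernels, kernel combination, and k(x, .).
import numpy as np

def rbf(x, y, sigma=1.0):
    """Gaussian (RBF) kernel, positive definite on R^d."""
    return np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))

def poly(x, y, degree=2, c=1.0):
    """Polynomial kernel, positive definite for c >= 0."""
    return (x @ y + c) ** degree

def combined(x, y, alpha=0.5):
    # A convex combination of positive definite kernels is positive definite;
    # so is their product (not shown).
    return alpha * rbf(x, y) + (1 - alpha) * poly(x, y)

# k(x0, .) as an element of the RKHS: a function we can evaluate anywhere.
x0 = np.array([1.0, 0.0])
k_x0 = lambda y: combined(x0, y)
print(k_x0(np.array([0.5, 0.5])))
```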

    Kernel Multivariate Analysis Framework for Supervised Subspace Learning: A Tutorial on Linear and Kernel Multivariate Methods

    Feature extraction and dimensionality reduction are important tasks in many fields of science dealing with signal processing and analysis. The relevance of these techniques is increasing as current sensory devices are developed with ever higher resolution and problems involving multimodal data sources become more common. A plethora of feature extraction methods are available in the literature, collectively grouped under the field of Multivariate Analysis (MVA). This paper provides a uniform treatment of several methods: Principal Component Analysis (PCA), Partial Least Squares (PLS), Canonical Correlation Analysis (CCA), and Orthonormalized PLS (OPLS), as well as their non-linear extensions derived by means of the theory of reproducing kernel Hilbert spaces. We also review their connections to other methods for classification and statistical dependence estimation, and introduce some recent developments to deal with the extreme cases of large-scale and small-sample problems. To illustrate the wide applicability of these methods in both classification and regression problems, we analyze their performance on a benchmark of publicly available data sets, and pay special attention to specific real applications involving audio processing for music genre prediction and hyperspectral satellite images for Earth and climate monitoring.
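
    Of the methods in this MVA family, kernel PCA is the simplest to sketch. The snippet below is an illustrative implementation, not code from the paper (the Gaussian kernel and bandwidth are assumptions): the kernel extensions of these methods are typically computed from an eigendecomposition of the centered Gram matrix.

```python
# Hedged sketch: kernel PCA via eigendecomposition of the centered Gram matrix.
import numpy as np

def kernel_pca(X, n_components=2, sigma=1.0):
    n = len(X)
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-sq / (2 * sigma ** 2))          # Gaussian Gram matrix
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    Kc = H @ K @ H                               # center in feature space
    w, V = np.linalg.eigh(Kc)                    # ascending eigenvalues
    idx = np.argsort(w)[::-1][:n_components]     # keep the leading ones
    w, V = w[idx], V[:, idx]
    # Projections of the training points onto the leading kernel axes.
    return V * np.sqrt(np.maximum(w, 0))

Z = kernel_pca(np.random.default_rng(1).normal(size=(100, 5)), n_components=2)
print(Z.shape)  # (100, 2)
```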

    HSIC Regularized LTSA

    The Hilbert-Schmidt Independence Criterion (HSIC) measures the statistical dependence between two random variables. Rather than comparing the two variables directly, HSIC first maps them into two reproducing kernel Hilbert spaces (RKHSs) and then measures their dependence through the Hilbert-Schmidt norm of the cross-covariance operator between the two RKHSs. Since it was first proposed around 2005, HSIC has found wide application in machine learning. In this paper, an HSIC-regularized Local Tangent Space Alignment algorithm (HSIC-LTSA) is proposed. LTSA is a well-known dimensionality reduction algorithm that preserves local homeomorphisms. In HSIC-LTSA, the HSIC between the high-dimensional data and the dimension-reduced data is added to the LTSA objective function as a regularization term. HSIC-LTSA makes two contributions. First, it enforces both local homeomorphism preservation and global statistical correlation during dimensionality reduction. Second, it demonstrates a new way to apply HSIC: as a regularization term added to other machine learning algorithms. The experimental results presented in this paper show that HSIC-LTSA achieves better performance than the original LTSA.
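
    For concreteness, the sketch below computes the standard biased empirical HSIC estimator, trace(KHLH)/(n-1)^2, with Gram matrices K and L and centering matrix H. How this term is weighted inside the LTSA objective in HSIC-LTSA is not reproduced here, and the Gaussian kernel and bandwidth are illustrative assumptions.

```python
# Hedged sketch: the biased empirical HSIC estimator as a dependence measure.
import numpy as np

def gram(X, sigma=1.0):
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma ** 2))

def hsic(X, Y, sigma=1.0):
    n = len(X)
    K, L = gram(X, sigma), gram(Y, sigma)
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
# Dependent pair (X, X^2) scores much higher than an independent pair.
print(hsic(X, X ** 2), hsic(X, rng.normal(size=(200, 1))))
```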

    On Invariance and Selectivity in Representation Learning

    We discuss data representations that can be learned automatically from data, are invariant to transformations, and are at the same time selective, in the sense that two points have the same representation only if one is a transformation of the other. The mathematical results here sharpen some of the key claims of i-theory, a recent theory of feedforward processing in sensory cortex.
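
    As a loose illustration of the invariant-and-selective signatures that i-theory studies (a toy construction under assumed choices, not the paper's), the sketch below pools moments of the dot products between an input and the orbit of each stored template under cyclic shifts. Pooling over the whole group makes the signature shift-invariant, while using several templates and several moments helps preserve selectivity.

```python
# Hedged sketch: a group-pooled signature that is invariant to cyclic shifts.
import numpy as np

def signature(x, templates, n_moments=3):
    feats = []
    for t in templates:
        # Dot products with the template's orbit under cyclic shifts.
        dots = [x @ np.roll(t, s) for s in range(len(t))]
        # Pool over the orbit: the first few moments of the dot products.
        feats += [np.mean(np.power(dots, m)) for m in range(1, n_moments + 1)]
    return np.array(feats)

rng = np.random.default_rng(0)
templates = rng.normal(size=(4, 8))
x = rng.normal(size=8)
s1 = signature(x, templates)
s2 = signature(np.roll(x, 3), templates)   # cyclically shifted input
print(np.allclose(s1, s2))                 # True: shift-invariant signature
```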