
    Studies on Kernel Learning and Independent Component Analysis

    A crucial step in kernel-based learning is the selection of a proper kernel function or kernel matrix. Multiple kernel learning (MKL), in which a set of kernels is assessed during learning, was recently proposed to solve the kernel selection problem. The goal is to estimate a suitable kernel matrix by adjusting a linear combination of the given kernels so that the empirical risk is minimized. MKL is usually a memory-demanding optimization problem, which becomes a barrier for large sample sizes. This study proposes an efficient kernel learning method that exploits the low-rank property of large kernel matrices, which is often observed in applications. The proposed method selects a few eigenvectors of the kernel bases and takes a sparse combination of them by minimizing the empirical risk. Empirical results show that the computational demands decrease significantly without compromising classification accuracy, compared with previous MKL methods. Computing an upper bound for the complexity of the hypothesis set generated by the learned kernel is challenging. Here, a novel bound is presented which shows that the Gaussian complexity of such a hypothesis set is controlled by the logarithm of the number of involved eigenvectors and their maximum distance, i.e. the geometry of the basis set. This geometric bound sheds more light on the selection of kernel bases, which could not be obtained from previous results. The rest of this study is a step toward applying statistical learning theory to the analysis of independent component analysis estimators such as FastICA. The thesis provides a sample convergence analysis for the FastICA estimator and shows that the estimates converge in distribution as the number of samples increases. Additionally, similar results are established for the bootstrap FastICA. A direct application of these results is the design of a hypothesis test to study the convergence of the estimates.
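    The abstract above only sketches the low-rank idea; the following is a minimal illustration of it, not the authors' code: take the top eigenvectors of each base kernel matrix as features and learn a sparse combination of them by minimizing a regularized empirical risk. The kernel dictionary, the rank, and the L1-penalized logistic loss are assumptions made here for the sake of a runnable example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics.pairwise import rbf_kernel, polynomial_kernel

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Base kernels: a small, illustrative dictionary of kernel matrices (assumption).
kernels = [rbf_kernel(X, gamma=g) for g in (0.01, 0.1, 1.0)]
kernels.append(polynomial_kernel(X, degree=2))

rank = 5  # keep only a few leading eigenvectors per kernel (low-rank assumption)
features = []
for K in kernels:
    vals, vecs = np.linalg.eigh(K)            # eigenvalues in ascending order
    top = np.argsort(vals)[-rank:]            # indices of the largest eigenvalues
    features.append(vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0)))
Z = np.hstack(features)                       # pooled low-rank kernel features

# Sparse combination of the eigenvector features via an L1-penalized loss;
# the nonzero coefficients indicate which eigenvectors (and hence kernels) matter.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=1.0).fit(Z, y)
print("selected eigenvectors:", np.count_nonzero(clf.coef_), "of", Z.shape[1])
```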

    Noise variance estimation in function approximation (Kohinan varianssin estimointi funktion approksimoinnissa)

    Unknown noise is one of the most important assumptions in signal processing, statistics and machine learning problems. Noise is usually modeled as an additive or multiplicative component of the original signal. Noise in a signal introduces uncertainties into the problem at hand that deterministic approaches often cannot handle. The noise variance is a lower bound on the mean squared error of a model, so estimating it helps assess a model's performance on a given dataset. The noise variance also carries information about how close the shape of the noisy signal is to the original signal, and it can be used as a lower bound when a noisy signal is smoothed by filtering. This work presents approaches for estimating the noise variance. The details and statistical properties of the methods are discussed for both univariate and multivariate problems, together with their computational complexity and practical limitations. The work compares the accuracy of the methods from the perspective of practical problems; the comparison is based on simulated data in which the noise distribution and level are varied.
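    As a concrete illustration of the kind of estimator the thesis discusses, the sketch below implements one standard nonparametric noise-variance estimate, the first-nearest-neighbour delta test, var(noise) ≈ (1/2N) Σ_i (y_NN(i) - y_i)². The toy data and the use of scikit-learn's NearestNeighbors are assumptions for illustration only.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(2000, 1))
noise_std = 0.3
y = np.sin(X[:, 0]) + rng.normal(0.0, noise_std, size=2000)  # noisy target

# Index of each point's nearest neighbour in input space (column 0 is the point itself).
nn = NearestNeighbors(n_neighbors=2).fit(X)
_, idx = nn.kneighbors(X)
neighbour = idx[:, 1]

# Delta-test estimate of the noise variance.
var_hat = 0.5 * np.mean((y[neighbour] - y) ** 2)
print(f"estimated noise variance {var_hat:.4f} vs true {noise_std**2:.4f}")
```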

    LS-SVM hyperparameter selection with a nonparametric noise estimator

    This paper presents a new method for the selection of the two hyperparameters of Least Squares Support Vector Machine (LS-SVM) approximators with Gaussian kernels. The two hyperparameters are the width σ of the Gaussian kernels and the regularization parameter λ. For different values of σ, a Nonparametric Noise Estimator (NNE) is introduced to estimate the variance of the noise on the outputs. The NNE allows the determination of the best λ for each given σ. A leave-one-out methodology is then applied to select the best σ. Therefore, this method transforms the double optimization problem into a single optimization one. The method is tested on two problems: a toy example and the Pumadyn regression benchmark.
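    The following is a minimal sketch of the selection scheme described above, not the paper's exact procedure: a nonparametric (delta-test) noise estimate fixes the regularization λ, and a closed-form leave-one-out error then picks σ over a small grid. The mapping λ = n·(noise variance), the candidate σ grid, and the kernel-ridge formulation of LS-SVM are assumptions made for illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(200, 1))
y = np.sinc(X[:, 0]) + rng.normal(0.0, 0.1, size=200)
n = len(y)

# Delta-test estimate of the output noise variance (computed once from the data).
_, idx = NearestNeighbors(n_neighbors=2).fit(X).kneighbors(X)
noise_var = 0.5 * np.mean((y[idx[:, 1]] - y) ** 2)

best = None
for sigma in (0.1, 0.3, 1.0, 3.0):
    K = rbf_kernel(X, gamma=1.0 / (2.0 * sigma**2))
    lam = n * noise_var                           # assumed mapping from noise level to lambda
    H = K @ np.linalg.inv(K + lam * np.eye(n))    # smoother matrix of kernel ridge regression
    loo = np.mean(((y - H @ y) / (1.0 - np.diag(H))) ** 2)  # closed-form leave-one-out MSE
    if best is None or loo < best[0]:
        best = (loo, sigma, lam)

print(f"LOO MSE {best[0]:.4f} at sigma={best[1]}, lambda={best[2]:.3f}")
```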

    Mutual information and gamma test for input selection

    In this paper, input selection is performed using two different approaches. The first approach is based on the Gamma test, which estimates the mean square error (MSE) that can be achieved without overfitting; the best set of inputs is the one that minimises the result of the Gamma test. The second method estimates the mutual information between a set of inputs and the output; the best set of inputs is the one that maximises the mutual information. Both methods are applied to the selection of inputs for function approximation and time series prediction problems.
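    A minimal sketch of the first approach follows: score each candidate input subset with a nearest-neighbour noise estimate (the simpler delta test stands in here for the full Gamma test) and keep the subset with the lowest estimated MSE. The toy data and the exhaustive search over subsets are assumptions for illustration.

```python
import numpy as np
from itertools import combinations
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(1000, 4))            # inputs 2 and 3 are irrelevant
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2 + rng.normal(0.0, 0.05, size=1000)

def delta_test(Xs, y):
    """Nearest-neighbour estimate of the residual variance given the inputs Xs."""
    _, idx = NearestNeighbors(n_neighbors=2).fit(Xs).kneighbors(Xs)
    return 0.5 * np.mean((y[idx[:, 1]] - y) ** 2)

# Evaluate every nonempty subset of candidate inputs (feasible only for a few inputs).
scores = {}
for k in range(1, X.shape[1] + 1):
    for subset in combinations(range(X.shape[1]), k):
        scores[subset] = delta_test(X[:, subset], y)

best = min(scores, key=scores.get)
print("selected inputs:", best, "estimated MSE:", round(scores[best], 4))
```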

    Methodology for long-term prediction of time series

    In this paper, a global methodology for the long-term prediction of time series is proposed. This methodology combines a direct prediction strategy with sophisticated input selection criteria: the k-nearest neighbors approximation method (k-NN), mutual information (MI) and nonparametric noise estimation (NNE). A global input selection strategy that combines forward selection, backward elimination (or pruning) and forward-backward selection is introduced. This methodology is used to optimize the three input selection criteria (k-NN, MI and NNE). The methodology is successfully applied to a real-life benchmark: the Poland Electricity Load dataset.
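    The sketch below illustrates only the direct prediction strategy mentioned above: one separate model per horizon, each mapping the same lagged inputs to y(t+h). The toy series, the fixed lag window, and the k-NN regressor are assumptions; in the paper the inputs of each model would additionally be selected with the k-NN, MI and NNE criteria.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
t = np.arange(1500)
series = np.sin(2 * np.pi * t / 50) + 0.1 * rng.normal(size=t.size)

lags, horizon = 12, 5
# Rows of lagged inputs [y(t-lags+1), ..., y(t)] and targets y(t+1), ..., y(t+horizon).
X = np.array([series[i:i + lags] for i in range(len(series) - lags - horizon)])
Y = np.array([series[i + lags:i + lags + horizon]
              for i in range(len(series) - lags - horizon)])

# Direct strategy: an independent model for every prediction horizon h.
models = [KNeighborsRegressor(n_neighbors=5).fit(X, Y[:, h]) for h in range(horizon)]

last_window = series[-lags:].reshape(1, -1)
forecast = [m.predict(last_window)[0] for m in models]
print("next", horizon, "steps:", np.round(forecast, 3))
```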
