
    Studies on Kernel Learning and Independent Component Analysis

    A crucial step in kernel-based learning is the selection of a proper kernel function or kernel matrix. Multiple kernel learning (MKL), in which a set of kernels is assessed during learning, was recently proposed to solve the kernel selection problem. The goal is to estimate a suitable kernel matrix by adjusting a linear combination of the given kernels so that the empirical risk is minimized. MKL is usually a memory-demanding optimization problem, which becomes a barrier for large sample sizes. This study proposes an efficient kernel learning method that exploits the low-rank property of large kernel matrices, which is often observed in applications. The proposed method selects a few eigenvectors of the kernel bases and takes a sparse combination of them by minimizing the empirical risk. Empirical results show that the computational demands decrease significantly without compromising classification accuracy, compared with previous MKL methods. Computing an upper bound for the complexity of the hypothesis set generated by the learned kernel is challenging. Here, a novel bound is presented which shows that the Gaussian complexity of such a hypothesis set is controlled by the logarithm of the number of involved eigenvectors and their maximum distance, i.e. the geometry of the basis set. This geometric bound sheds more light on the selection of kernel bases, which could not be obtained from previous results. The rest of this study is a step toward applying statistical learning theory to the analysis of independent component analysis estimators such as FastICA. The thesis provides a sample convergence analysis for the FastICA estimator and shows that the estimates converge in distribution as the number of samples increases. Additionally, similar results are established for the bootstrap FastICA. A direct application of these results is the design of a hypothesis test to study the convergence of the estimates.
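    The abstract above only sketches the low-rank idea; the following is a minimal illustration of it, not the authors' code: take the top eigenvectors of each base kernel matrix as features and learn a sparse combination of them by minimizing a regularized empirical risk. The kernel dictionary, the rank, and the L1-penalized logistic loss are assumptions made here for the sake of a runnable example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics.pairwise import rbf_kernel, polynomial_kernel

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Base kernels: a small, illustrative dictionary of kernel matrices (assumption).
kernels = [rbf_kernel(X, gamma=g) for g in (0.01, 0.1, 1.0)]
kernels.append(polynomial_kernel(X, degree=2))

rank = 5  # keep only a few leading eigenvectors per kernel (low-rank assumption)
features = []
for K in kernels:
    vals, vecs = np.linalg.eigh(K)            # eigenvalues in ascending order
    top = np.argsort(vals)[-rank:]            # indices of the largest eigenvalues
    features.append(vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0)))
Z = np.hstack(features)                       # pooled low-rank kernel features

# Sparse combination of the eigenvector features via an L1-penalized loss;
# the nonzero coefficients indicate which eigenvectors (and hence kernels) matter.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=1.0).fit(Z, y)
print("selected eigenvectors:", np.count_nonzero(clf.coef_), "of", Z.shape[1])
```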

    Noise variance estimation in function approximation (Kohinan varianssin estimointi funktion approksimoinnissa)

    Unknown noise is one of the most important assumptions in signal processing, statistics and machine learning problems. Noise is usually modeled as an additive or multiplicative component of the original signal. Noise in a signal introduces uncertainties into the problem at hand that deterministic approaches often cannot handle. The noise variance is a lower bound on the mean squared error of a model, so estimating it helps assess a model's performance on a given dataset. The noise variance also carries information about how close the shape of the noisy signal is to the original signal, and it can be used as a lower bound when a noisy signal is smoothed by filtering. This work presents approaches for estimating the noise variance. The details and statistical properties of the methods are discussed for both univariate and multivariate problems, together with their computational complexity and practical limitations. The work compares the accuracy of the methods from the perspective of practical problems; the comparison is based on simulated data in which the noise distribution and level are varied.
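    As a concrete illustration of the kind of estimator the thesis discusses, the sketch below implements one standard nonparametric noise-variance estimate, the first-nearest-neighbour delta test, var(noise) ≈ (1/2N) Σ_i (y_NN(i) - y_i)². The toy data and the use of scikit-learn's NearestNeighbors are assumptions for illustration only.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(2000, 1))
noise_std = 0.3
y = np.sin(X[:, 0]) + rng.normal(0.0, noise_std, size=2000)  # noisy target

# Index of each point's nearest neighbour in input space (column 0 is the point itself).
nn = NearestNeighbors(n_neighbors=2).fit(X)
_, idx = nn.kneighbors(X)
neighbour = idx[:, 1]

# Delta-test estimate of the noise variance.
var_hat = 0.5 * np.mean((y[neighbour] - y) ** 2)
print(f"estimated noise variance {var_hat:.4f} vs true {noise_std**2:.4f}")
```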

    LS-SVM hyperparameter selection with a nonparametric noise estimator

    This paper presents a new method for the selection of the two hyperparameters of Least Squares Support Vector Machine (LS-SVM) approximators with Gaussian kernels. The two hyperparameters are the width σ of the Gaussian kernels and the regularization parameter λ. For different values of σ, a Nonparametric Noise Estimator (NNE) is introduced to estimate the variance of the noise on the outputs. The NNE allows the determination of the best λ for each given σ. A leave-one-out methodology is then applied to select the best σ. Therefore, this method transforms the double optimization problem into a single optimization one. The method is tested on two problems: a toy example and the Pumadyn regression benchmark.
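    The following is a minimal sketch of the selection scheme described above, not the paper's exact procedure: a nonparametric (delta-test) noise estimate fixes the regularization λ, and a closed-form leave-one-out error then picks σ over a small grid. The mapping λ = n·(noise variance), the candidate σ grid, and the kernel-ridge formulation of LS-SVM are assumptions made for illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(200, 1))
y = np.sinc(X[:, 0]) + rng.normal(0.0, 0.1, size=200)
n = len(y)

# Delta-test estimate of the output noise variance (computed once from the data).
_, idx = NearestNeighbors(n_neighbors=2).fit(X).kneighbors(X)
noise_var = 0.5 * np.mean((y[idx[:, 1]] - y) ** 2)

best = None
for sigma in (0.1, 0.3, 1.0, 3.0):
    K = rbf_kernel(X, gamma=1.0 / (2.0 * sigma**2))
    lam = n * noise_var                           # assumed mapping from noise level to lambda
    H = K @ np.linalg.inv(K + lam * np.eye(n))    # smoother matrix of kernel ridge regression
    loo = np.mean(((y - H @ y) / (1.0 - np.diag(H))) ** 2)  # closed-form leave-one-out MSE
    if best is None or loo < best[0]:
        best = (loo, sigma, lam)

print(f"LOO MSE {best[0]:.4f} at sigma={best[1]}, lambda={best[2]:.3f}")
```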

    Mutual information and gamma test for input selection

    In this paper, input selection is performed using two different approaches. The first approach is based on the Gamma test, which estimates the mean square error (MSE) that can be achieved without overfitting; the best set of inputs is the one that minimises the result of the Gamma test. The second method estimates the mutual information between a set of inputs and the output; the best set of inputs is the one that maximises the mutual information. Both methods are applied to the selection of inputs for function approximation and time series prediction problems.
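    A minimal sketch of the first approach follows: score each candidate input subset with a nearest-neighbour noise estimate (the simpler delta test stands in here for the full Gamma test) and keep the subset with the lowest estimated MSE. The toy data and the exhaustive search over subsets are assumptions for illustration.

```python
import numpy as np
from itertools import combinations
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(1000, 4))            # inputs 2 and 3 are irrelevant
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2 + rng.normal(0.0, 0.05, size=1000)

def delta_test(Xs, y):
    """Nearest-neighbour estimate of the residual variance given the inputs Xs."""
    _, idx = NearestNeighbors(n_neighbors=2).fit(Xs).kneighbors(Xs)
    return 0.5 * np.mean((y[idx[:, 1]] - y) ** 2)

# Evaluate every nonempty subset of candidate inputs (feasible only for a few inputs).
scores = {}
for k in range(1, X.shape[1] + 1):
    for subset in combinations(range(X.shape[1]), k):
        scores[subset] = delta_test(X[:, subset], y)

best = min(scores, key=scores.get)
print("selected inputs:", best, "estimated MSE:", round(scores[best], 4))
```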

    Methodology for long-term prediction of time series

    In this paper, a global methodology for the long-term prediction of time series is proposed. This methodology combines a direct prediction strategy with sophisticated input selection criteria: the k-nearest neighbors approximation method (k-NN), mutual information (MI) and nonparametric noise estimation (NNE). A global input selection strategy that combines forward selection, backward elimination (or pruning) and forward-backward selection is introduced. This methodology is used to optimize the three input selection criteria (k-NN, MI and NNE). The methodology is successfully applied to a real-life benchmark: the Poland Electricity Load dataset.
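    The sketch below illustrates only the direct prediction strategy mentioned above: one separate model per horizon, each mapping the same lagged inputs to y(t+h). The toy series, the fixed lag window, and the k-NN regressor are assumptions; in the paper the inputs of each model would additionally be selected with the k-NN, MI and NNE criteria.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
t = np.arange(1500)
series = np.sin(2 * np.pi * t / 50) + 0.1 * rng.normal(size=t.size)

lags, horizon = 12, 5
# Rows of lagged inputs [y(t-lags+1), ..., y(t)] and targets y(t+1), ..., y(t+horizon).
X = np.array([series[i:i + lags] for i in range(len(series) - lags - horizon)])
Y = np.array([series[i + lags:i + lags + horizon]
              for i in range(len(series) - lags - horizon)])

# Direct strategy: an independent model for every prediction horizon h.
models = [KNeighborsRegressor(n_neighbors=5).fit(X, Y[:, h]) for h in range(horizon)]

last_window = series[-lags:].reshape(1, -1)
forecast = [m.predict(last_window)[0] for m in models]
print("next", horizon, "steps:", np.round(forecast, 3))
```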
