Overlearning in marginal distribution-based ICA: analysis and solutions
The present paper is written as a word of caution to users of independent component analysis (ICA), concerning overlearning phenomena that are often observed.
We consider two types of overlearning, typical of ICA based on higher-order statistics. These algorithms can be seen to maximise the negentropy of the source estimates. The first kind of overlearning results in the generation of spike-like signals if there are not enough samples in the data or a considerable amount of noise is present. It is argued that, if the data has a power spectrum characterised by a 1/f curve, we face a more severe problem, which cannot be solved inside the strict ICA model. This overlearning is better characterised by bumps instead of spikes. Both overlearning types are demonstrated on artificial signals as well as magnetoencephalograms (MEG). Several methods are suggested to circumvent both types, either by making the estimation of the ICA model more robust or by including further modelling of the data.
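The spike-type overlearning is easy to reproduce outside the paper's MEG setting. The sketch below is a minimal illustration, not the authors' experiments: it uses scikit-learn's FastICA as a stand-in for a generic negentropy-maximising ICA, with arbitrarily chosen sample and channel counts, and runs it on pure Gaussian noise so that any structure in the estimates is, by construction, overlearning.

```python
import numpy as np
from scipy.stats import kurtosis
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)

# Pure Gaussian noise: there are no true independent components,
# so any structure ICA finds is an artefact of overlearning.
n_samples, n_channels = 120, 100
X = rng.normal(size=(n_samples, n_channels))

ica = FastICA(n_components=20, random_state=0, max_iter=1000)
S_hat = ica.fit_transform(X)  # estimated "sources", shape (120, 20)

# With this few samples, negentropy maximisation concentrates each
# estimate's energy in a handful of time points: spike-like signals,
# typically visible as strongly positive excess kurtosis.
print(kurtosis(S_hat, axis=0))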
Rethinking LDA: moment matching for discrete ICA
We consider moment matching techniques for estimation in Latent Dirichlet
Allocation (LDA). By drawing explicit links between LDA and discrete versions
of independent component analysis (ICA), we first derive a new set of
cumulant-based tensors, with an improved sample complexity. Moreover, we reuse
standard ICA techniques such as joint diagonalization of tensors to improve
over existing methods based on the tensor power method. In an extensive set of
experiments on both synthetic and real datasets, we show that our new
combination of tensors and orthogonal joint diagonalization techniques
outperforms existing moment matching methods. (In Proceedings of the 29th Conference on Neural Information Processing Systems (NIPS), 2015.)
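The joint diagonalization step this abstract relies on can be illustrated in isolation. The sketch below is a generic NumPy illustration, not the paper's cumulant tensors or its Jacobi-style algorithm: in the exact, noiseless case, symmetric matrices sharing one orthogonal eigenbasis are simultaneously diagonalized by the eigenvectors of a random linear combination, which generically has distinct eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 5, 3

# Symmetric matrices M_k = U diag(d_k) U^T sharing an orthogonal basis U.
U, _ = np.linalg.qr(rng.normal(size=(n, n)))
Ms = [U @ np.diag(rng.normal(size=n)) @ U.T for _ in range(K)]

# A random linear combination generically has distinct eigenvalues,
# so its eigenvectors recover U up to column order and sign.
w = rng.normal(size=K)
V = np.linalg.eigh(sum(wi * Mi for wi, Mi in zip(w, Ms)))[1]

# Check: V diagonalizes every M_k simultaneously.
for Mk in Ms:
    D = V.T @ Mk @ V
    print(np.max(np.abs(D - np.diag(np.diag(D)))))  # ~1e-15
```

With noisy moment estimates the matrices only approximately share a basis, which is where robust orthogonal joint diagonalization methods of the kind the paper uses become necessary.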
Heavy-tailed Independent Component Analysis
Independent component analysis (ICA) is the problem of efficiently recovering a matrix $A \in \mathbb{R}^{n \times n}$ from i.i.d. observations of $X = AS$, where $S \in \mathbb{R}^n$ is a random vector with mutually independent coordinates. This problem has been intensively studied, but all existing efficient algorithms with provable guarantees require that the coordinates $S_i$ have finite fourth moments. We consider the heavy-tailed ICA problem, where we make no such assumption, not even about the second moment. This problem has also received considerable attention in the applied literature. In the present work, we first give a provably efficient algorithm that works under the assumption that, for some constant $\gamma > 0$, each $S_i$ has a finite $(1+\gamma)$-moment, thus substantially weakening the moment requirement for the ICA problem to be solvable. We then give an algorithm that works under the assumption that the matrix $A$ has orthogonal columns but requires no moment assumptions. Our techniques draw ideas from convex geometry and exploit standard properties of the multivariate spherical Gaussian distribution in a novel way.
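To see why the finite-fourth-moment requirement is restrictive, the snippet below (an illustration, not the paper's algorithm) samples a Student-t source with 1.5 degrees of freedom: its $(1+\gamma)$-moment is finite for $\gamma < 0.5$, but its variance is infinite, so the empirical second and fourth moments that cumulant-based ICA relies on never stabilise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Student-t with 1.5 degrees of freedom: E|S|^p is finite only for
# p < 1.5, so a (1 + gamma)-moment exists for gamma < 0.5, but the
# second and fourth moments do not.
for n in (10**3, 10**5, 10**7):
    s = rng.standard_t(df=1.5, size=n)
    print(n, np.mean(s**2), np.mean(s**4))  # diverges as n grows
```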
Fourier PCA and Robust Tensor Decomposition
Fourier PCA is Principal Component Analysis of a matrix obtained from higher-order derivatives of the logarithm of the Fourier transform of a distribution. We make this method algorithmic by developing a tensor decomposition method for a pair of tensors sharing the same vectors in their rank-$1$ decompositions. Our main application is the first provably polynomial-time algorithm for underdetermined ICA, i.e., learning an $n \times m$ matrix $A$ from observations $y = Ax$, where $x$ is drawn from an unknown product distribution with arbitrary non-Gaussian components. The number of component distributions $m$ can be arbitrarily higher than the dimension $n$, and the columns of $A$ only need to satisfy a natural and efficiently verifiable nondegeneracy condition. As a second application, we give an alternative algorithm for learning mixtures of spherical Gaussians with linearly independent means. These results also hold in the presence of Gaussian noise.
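The construction behind Fourier PCA can be sketched for the fully determined case. This is a toy NumPy version under assumed unit-variance uniform sources, not the paper's higher-derivative tensor method for the underdetermined setting: for whitened data $y = Qs$ with orthogonal $Q$, the Hessian of $\log \mathbb{E}[e^{i u^\top y}]$ equals $Q D(u) Q^\top$ with $D(u)$ diagonal, so the eigenvectors of an empirical Hessian at a random $u$ recover the mixing directions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 3, 200_000

# Mix non-Gaussian (uniform, unit-variance) sources, then whiten.
S = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(n, N))
A = rng.normal(size=(n, n))
X = A @ S
E, V = np.linalg.eigh(np.cov(X))
W = V @ np.diag(E**-0.5) @ V.T  # whitening matrix; W @ A is ~orthogonal
Y = W @ X

# Empirical Hessian of log E[exp(i u.y)] at a random u:
# H = -(E_w[y y^T] - E_w[y] E_w[y]^T) with complex weights
# w = exp(i u.y) / E[exp(i u.y)].
u = rng.normal(size=n)
w = np.exp(1j * (u @ Y))
w /= w.mean()
m = (Y * w).mean(axis=1)
H = -((Y * w) @ Y.T / N - np.outer(m, m))

# For symmetric sources H is (essentially) real and equals
# Q diag(.) Q^T with Q = W @ A, so its eigenvectors recover Q
# up to column order and sign.
Q = np.linalg.eigh(H.real)[1]
print(np.round(Q.T @ W @ A, 2))  # approximately a signed permutation
```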
- …