65 research outputs found
Fourier PCA and Robust Tensor Decomposition
Fourier PCA is Principal Component Analysis of a matrix obtained from higher
order derivatives of the logarithm of the Fourier transform of a
distribution. We make this method algorithmic by developing a tensor
decomposition method for a pair of tensors sharing the same vectors in rank-$1$
decompositions. Our main application is the first provably polynomial-time
algorithm for underdetermined ICA, i.e., learning an $n \times m$ matrix $A$
from observations $y = Ax$ where $x$ is drawn from an unknown product
distribution with arbitrary non-Gaussian components. The number of component
distributions $m$ can be arbitrarily higher than the dimension $n$ and the
columns of $A$ only need to satisfy a natural and efficiently verifiable
nondegeneracy condition. As a second application, we give an alternative
algorithm for learning mixtures of spherical Gaussians with linearly
independent means. These results also hold in the presence of Gaussian noise.
Comment: Extensively revised; details added; minor errors corrected;
exposition improved
Max vs Min: Tensor Decomposition and ICA with nearly Linear Sample Complexity
We present a simple, general technique for reducing the sample complexity of
matrix and tensor decomposition algorithms applied to distributions. We use the
technique to give a polynomial-time algorithm for standard ICA with sample
complexity nearly linear in the dimension, thereby improving substantially on
previous bounds. The analysis is based on properties of random polynomials,
namely the spacings of an ensemble of polynomials. Our technique also applies
to other applications of tensor decompositions, including spherical Gaussian
mixture models.
Heavy-tailed Independent Component Analysis
Independent component analysis (ICA) is the problem of efficiently recovering
a matrix $A \in \mathbb{R}^{n \times n}$ from i.i.d. observations of $X = AS$
where $S \in \mathbb{R}^n$ is a random vector with mutually independent
coordinates. This problem has been intensively studied, but all existing
efficient algorithms with provable guarantees require that the coordinates
$S_i$ have finite fourth moments. We consider the heavy-tailed ICA problem
where we do not make this assumption, even about the second moment. This problem
also has received considerable attention in the applied literature. In the
present work, we first give a provably efficient algorithm that works under the
assumption that for constant $\gamma > 0$, each $S_i$ has finite
$(1+\gamma)$-moment, thus substantially weakening the moment requirement
condition for the ICA problem to be solvable. We then give an algorithm that
works under the assumption that matrix $A$ has orthogonal columns but requires
no moment assumptions. Our techniques draw ideas from convex geometry and
exploit standard properties of the multivariate spherical Gaussian distribution
in a novel way.
Comment: 30 pages
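The heavy-tailed setting in this abstract can be illustrated with a small sampling sketch. Everything in it is an assumption made for illustration, not the paper's procedure: the notation X = AS, the function name `sample_heavy_tailed_ica`, the Gaussian mixing matrix, and the choice of Student's t sources with 1.5 degrees of freedom (which have finite moments only of order below 1.5, so the variance is infinite).

```python
import numpy as np

def sample_heavy_tailed_ica(n_dim, n_samples, df=1.5, seed=0):
    """Draw observations X = A S with independent heavy-tailed sources.

    Hypothetical helper: A is a random square mixing matrix and each
    coordinate of S is an independent Student's t variable with `df`
    degrees of freedom (infinite variance when df <= 2).
    """
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n_dim, n_dim))           # assumed mixing matrix
    S = rng.standard_t(df, size=(n_dim, n_samples))   # independent sources
    X = A @ S                                         # observed mixtures
    return A, S, X

A, S, X = sample_heavy_tailed_ica(n_dim=3, n_samples=1000)
print(X.shape)
```

Data of this kind defeats methods that rely on fourth-moment estimates, because the empirical fourth moments do not converge.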
Probabilistic Neural Network based Approach for Handwritten Character Recognition
In this paper, a recognition system for totally unconstrained handwritten characters of the South Indian language Kannada is proposed. The proposed feature extraction technique is based on the Fourier Transform and the well-known Principal Component Analysis (PCA). The system trains on the appropriate frequency-band images, followed by a PCA feature extraction scheme. For the subsequent classification, a Probabilistic Neural Network (PNN) is used. The proposed system is tested on a large database containing Kannada characters and also on the standard COIL-20 object database, and the results were found to be better than those of standard techniques.
Overcomplete Independent Component Analysis via SDP
We present a novel algorithm for overcomplete independent components analysis
(ICA), where the number of latent sources k exceeds the dimension p of observed
variables. Previous algorithms either suffer from high computational complexity
or make strong assumptions about the form of the mixing matrix. Our algorithm
does not make any sparsity assumption yet enjoys favorable computational and
theoretical properties. Our algorithm consists of two main steps: (a)
estimation of the Hessians of the cumulant generating function (as opposed to
the fourth and higher order cumulants used by most algorithms) and (b) a novel
semi-definite programming (SDP) relaxation for recovering a mixing component.
We show that this relaxation can be efficiently solved with a projected
accelerated gradient descent method, which makes the whole algorithm
computationally practical. Moreover, we conjecture that the proposed program
recovers a mixing component at the rate k < p^2/4 and prove that a mixing
component can be recovered with high probability when k < (2 - epsilon) p log p
when the original components are sampled uniformly at random on the
hypersphere. Experiments are provided on synthetic data and the CIFAR-10 dataset of
real images.
Comment: Appears in: Proceedings of the 22nd International Conference on
Artificial Intelligence and Statistics (AISTATS 2019). 21 pages
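Step (a) of this abstract, estimating Hessians of the cumulant generating function, can be sketched with a plain plug-in estimator. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `cgf_hessian`, the Laplace sources, and the sizes p = 4, k = 6 are hypothetical. It uses the identity that the Hessian of log E[exp(t·x)] is the covariance of x under exponentially tilted weights.

```python
import numpy as np

def cgf_hessian(X, t):
    """Plug-in Hessian of the cumulant generating function log E[exp(t.x)].

    X : (N, p) array of samples, t : (p,) evaluation point.
    Returns the (p, p) covariance of the samples under weights
    proportional to exp(t.x).
    """
    a = X @ t
    w = np.exp(a - a.max())              # shift by the max for numerical stability
    w /= w.sum()                         # normalised tilted weights
    mean_t = w @ X                       # tilted mean (gradient of the CGF)
    second_t = (X * w[:, None]).T @ X    # tilted second moment
    return second_t - np.outer(mean_t, mean_t)

# Overcomplete toy data: k = 6 sources observed in p = 4 dimensions (assumed sizes).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 6))
S = rng.laplace(size=(6, 5000))
X = (A @ S).T
H = cgf_hessian(X, 0.1 * rng.standard_normal(4))
print(H.shape)
```

At t = 0 the weights are uniform, so the estimate reduces to the ordinary (biased) sample covariance, which gives a quick sanity check.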