Nonparametric Estimation of Multi-View Latent Variable Models
Spectral methods have greatly advanced the estimation of latent variable
models, generating a sequence of novel and efficient algorithms with strong
theoretical guarantees. However, current spectral algorithms are largely
restricted to mixtures of discrete or Gaussian distributions. In this paper, we
propose a kernel method for learning multi-view latent variable models,
allowing each mixture component to be nonparametric. The key idea of the method
is to embed the joint distribution of a multi-view latent variable into a
reproducing kernel Hilbert space and then recover the latent parameters
using a robust tensor power method. We establish that the sample complexity for
the proposed method is quadratic in the number of latent components and is a
low-order polynomial in the other relevant parameters. Thus, our nonparametric
tensor approach to learning latent variable models enjoys good sample and
computational efficiency. Moreover, the nonparametric tensor power method
compares favorably to the EM algorithm and other existing spectral algorithms in
our experiments.
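The tensor power method mentioned in this abstract can be illustrated with a minimal sketch. It assumes the simplest setting only: a small, symmetric, orthogonally decomposable third-order tensor in finite dimensions. It does not reproduce the kernel embedding, the whitening step, or the robustness analysis of the paper, and all function names (`build_tensor`, `power_iteration`, `decompose`, and so on) are hypothetical helpers chosen for illustration.

```python
import math
import random


def build_tensor(weights, vectors):
    """Build T = sum_r w_r * a_r (x) a_r (x) a_r as nested lists."""
    n = len(vectors[0])
    return [[[sum(w * a[i] * a[j] * a[k] for w, a in zip(weights, vectors))
              for k in range(n)] for j in range(n)] for i in range(n)]


def apply_tensor(T, v):
    """Return the vector T(I, v, v): sum over j, k of T[i][j][k] * v[j] * v[k]."""
    n = len(v)
    return [sum(T[i][j][k] * v[j] * v[k] for j in range(n) for k in range(n))
            for i in range(n)]


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def power_iteration(T, n_iter=50, n_restarts=10, seed=0):
    """Run the update v <- T(I, v, v) / ||T(I, v, v)|| from several random starts.

    Returns the (eigenvalue, eigenvector) pair with the largest eigenvalue found.
    """
    rng = random.Random(seed)
    n = len(T)
    best_lam, best_v = -math.inf, None
    for _ in range(n_restarts):
        v = normalize([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(n_iter):
            v = normalize(apply_tensor(T, v))
        # Eigenvalue estimate: T(v, v, v).
        lam = sum(a * b for a, b in zip(apply_tensor(T, v), v))
        if lam > best_lam:
            best_lam, best_v = lam, v
    return best_lam, best_v


def deflate(T, lam, v):
    """Subtract the recovered rank-one term lam * v (x) v (x) v from T."""
    n = len(v)
    return [[[T[i][j][k] - lam * v[i] * v[j] * v[k]
              for k in range(n)] for j in range(n)] for i in range(n)]


def decompose(T, rank):
    """Recover `rank` components by repeated power iteration and deflation."""
    components = []
    for _ in range(rank):
        lam, v = power_iteration(T)
        components.append((lam, v))
        T = deflate(T, lam, v)
    return components


# Toy example: three orthonormal components with weights 3, 2 and 1.
s = 1.0 / math.sqrt(2.0)
true_vectors = [[s, s, 0.0], [s, -s, 0.0], [0.0, 0.0, 1.0]]
true_weights = [3.0, 2.0, 1.0]
T = build_tensor(true_weights, true_vectors)
components = decompose(T, rank=3)
```

On this toy tensor, each recovered pair in `components` should approximate one of the weights together with its unit vector. The random restarts are there because a single start can converge to any of the components, depending on where it begins.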
Spectral Sequence Motif Discovery
Sequence discovery tools play a central role in several fields of
computational biology. In the framework of Transcription Factor binding
studies, motif finding algorithms of increasingly high performance are required
to process the big datasets produced by new high-throughput sequencing
technologies. Most existing algorithms are computationally demanding and often
cannot support the large size of new experimental data. We present a new motif
discovery algorithm that is built on a recent machine learning technique,
referred to as Method of Moments. Based on spectral decompositions, this method
is robust under model misspecification and is not prone to locally optimal
solutions. We obtain an algorithm that is extremely fast and designed for the
analysis of big sequencing data. In a few minutes, we can process datasets of
hundreds of thousands of sequences and extract motif profiles that match those
computed by various state-of-the-art algorithms.
Comment: 20 pages, 3 figures, 1 table