8,322 research outputs found
Sketching for Large-Scale Learning of Mixture Models
Learning parameters from voluminous data can be prohibitive in terms of
memory and computational requirements. We propose a "compressive learning"
framework where we estimate model parameters from a sketch of the training
data. This sketch is a collection of generalized moments of the underlying
probability distribution of the data. It can be computed in a single pass on
the training set, and is easily computable on streams or distributed datasets.
The proposed framework shares similarities with compressive sensing, which aims
at drastically reducing the dimension of high-dimensional signals while
preserving the ability to reconstruct them. To perform the estimation task, we
derive an iterative algorithm analogous to sparse reconstruction algorithms in
the context of linear inverse problems. We exemplify our framework with the
compressive estimation of a Gaussian Mixture Model (GMM), providing heuristics
on the choice of the sketching procedure and theoretical guarantees of
reconstruction. We experimentally show on synthetic data that the proposed
algorithm yields results comparable to the classical Expectation-Maximization
(EM) technique while requiring significantly less memory and fewer computations
when the number of database elements is large. We further demonstrate the
potential of the approach on real large-scale data (over 10 8 training samples)
for the task of model-based speaker verification. Finally, we draw some
connections between the proposed framework and approximate Hilbert space
embedding of probability distributions using random features. We show that the
proposed sketching operator can be seen as an innovative method to design
translation-invariant kernels adapted to the analysis of GMMs. We also use this
theoretical framework to derive information preservation guarantees, in the
spirit of infinite-dimensional compressive sensing
Random projections for Bayesian regression
This article deals with random projections applied as a data reduction
technique for Bayesian regression analysis. We show sufficient conditions under
which the entire -dimensional distribution is approximately preserved under
random projections by reducing the number of data points from to in the case . Under mild
assumptions, we prove that evaluating a Gaussian likelihood function based on
the projected data instead of the original data yields a
-approximation in terms of the Wasserstein
distance. Our main result shows that the posterior distribution of Bayesian
linear regression is approximated up to a small error depending on only an
-fraction of its defining parameters. This holds when using
arbitrary Gaussian priors or the degenerate case of uniform distributions over
for . Our empirical evaluations involve different
simulated settings of Bayesian linear regression. Our experiments underline
that the proposed method is able to recover the regression model up to small
error while considerably reducing the total running time
Fast and Guaranteed Tensor Decomposition via Sketching
Tensor CANDECOMP/PARAFAC (CP) decomposition has wide applications in
statistical learning of latent variable models and in data mining. In this
paper, we propose fast and randomized tensor CP decomposition algorithms based
on sketching. We build on the idea of count sketches, but introduce many novel
ideas which are unique to tensors. We develop novel methods for randomized
computation of tensor contractions via FFTs, without explicitly forming the
tensors. Such tensor contractions are encountered in decomposition methods such
as tensor power iterations and alternating least squares. We also design novel
colliding hashes for symmetric tensors to further save time in computing the
sketches. We then combine these sketching ideas with existing whitening and
tensor power iterative techniques to obtain the fastest algorithm on both
sparse and dense tensors. The quality of approximation under our method does
not depend on properties such as sparsity, uniformity of elements, etc. We
apply the method for topic modeling and obtain competitive results.Comment: 29 pages. Appeared in Proceedings of Advances in Neural Information
Processing Systems (NIPS), held at Montreal, Canada in 201
- âŠ