A geometric approach to archetypal analysis and non-negative matrix factorization
Archetypal analysis and non-negative matrix factorization (NMF) are staples
in a statistician's toolbox for dimension reduction and exploratory data
analysis. We describe a geometric approach to both NMF and archetypal analysis
by interpreting both problems as finding extreme points of the data cloud. We
also develop and analyze an efficient approach to finding extreme points in
high dimensions. For modern massive datasets that are too large to fit on a
single machine and must be stored in a distributed setting, our approach makes
only a small number of passes over the data. In fact, it is possible to obtain
the NMF or perform archetypal analysis with just two passes over the data.
Comment: 36 pages, 13 figures
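The abstract's idea of treating archetypes as extreme points of the data cloud can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a simple stand-in technique (projecting the data onto random Gaussian directions and keeping the points that attain the maximum or minimum along each one), and the function name, parameters, and toy data are hypothetical. A point that uniquely maximizes a linear functional over a finite set is a vertex of its convex hull, so every index returned is an extreme point, though some extreme points may be missed if too few directions are used.

```python
import numpy as np

def extreme_point_candidates(X, n_directions=50, seed=0):
    """Indices of rows of X that attain the max or min along random directions.

    Hypothetical helper for illustration only; not the paper's method.
    """
    rng = np.random.default_rng(seed)
    # One Gaussian direction per column.
    G = rng.standard_normal((X.shape[1], n_directions))
    # Projections of every data point onto every direction; this needs
    # only one pass over the rows of X.
    P = X @ G
    # The extremes along each direction are vertices of the convex hull.
    return np.union1d(P.argmax(axis=0), P.argmin(axis=0))

# Toy data: four corners of the unit square plus strictly interior points.
corners = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
interior = np.random.default_rng(1).uniform(0.1, 0.9, size=(200, 2))
X = np.vstack([corners, interior])

idx = extreme_point_candidates(X)
print(sorted(int(i) for i in idx))
```

Because the projections are a single matrix product, the rows of X could be processed in chunks on separate machines and only the per-direction maxima and minima combined afterwards, which is in the spirit of the few-pass setting the abstract describes.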
Learning Mixtures of Linear Classifiers
We consider a discriminative learning (regression) problem, whereby the
regression function is a convex combination of k linear classifiers. Existing
approaches are based on the EM algorithm, or similar techniques, without
provable guarantees. We develop a simple method based on spectral techniques
and a "mirroring" trick that discovers the subspace spanned by the
classifiers' parameter vectors. Under a probabilistic assumption on the feature
vector distribution, we prove that this approach has nearly optimal statistical
efficiency.