3,830 research outputs found
Greedy Algorithms for Cone Constrained Optimization with Convergence Guarantees
Greedy optimization methods such as Matching Pursuit (MP) and Frank-Wolfe
(FW) algorithms regained popularity in recent years due to their simplicity,
effectiveness and theoretical guarantees. MP and FW address optimization over
the linear span and the convex hull of a set of atoms, respectively. In this
paper, we consider the intermediate case of optimization over the convex cone,
parametrized as the conic hull of a generic atom set, leading to the first
principled definitions of non-negative MP algorithms for which we give explicit
convergence rates and demonstrate excellent empirical performance. In
particular, we derive sublinear () convergence on general
smooth and convex objectives, and linear convergence () on
strongly convex objectives, in both cases for general sets of atoms.
Furthermore, we establish a clear correspondence of our algorithms to known
algorithms from the MP and FW literature. Our novel algorithms and analyses
target general atom sets and general objective functions, and hence are
directly applicable to a large variety of learning settings.Comment: NIPS 201
Block Coordinate Descent for Sparse NMF
Nonnegative matrix factorization (NMF) has become a ubiquitous tool for data
analysis. An important variant is the sparse NMF problem which arises when we
explicitly require the learnt features to be sparse. A natural measure of
sparsity is the L norm, however its optimization is NP-hard. Mixed norms,
such as L/L measure, have been shown to model sparsity robustly, based
on intuitive attributes that such measures need to satisfy. This is in contrast
to computationally cheaper alternatives such as the plain L norm. However,
present algorithms designed for optimizing the mixed norm L/L are slow
and other formulations for sparse NMF have been proposed such as those based on
L and L norms. Our proposed algorithm allows us to solve the mixed norm
sparsity constraints while not sacrificing computation time. We present
experimental evidence on real-world datasets that shows our new algorithm
performs an order of magnitude faster compared to the current state-of-the-art
solvers optimizing the mixed norm and is suitable for large-scale datasets
- …