Alternating Maximization: Unifying Framework for 8 Sparse PCA Formulations and Efficient Parallel Codes
Given a multivariate data set, sparse principal component analysis (SPCA)
aims to extract several linear combinations of the variables that together
explain as much of the variance in the data as possible, while controlling the
number of nonzero loadings in these combinations. In this paper we consider 8
different optimization formulations for computing a single sparse loading
vector; these are obtained by combining the following factors: we employ two
norms for measuring variance (L2, L1) and two sparsity-inducing norms (L0, L1),
which are used in two different ways (constraint, penalty). Three of our
formulations, notably the one with L0 constraint and L1 variance, have not been
considered in the literature. We give a unifying reformulation which we propose
to solve via a natural alternating maximization (AM) method. We show that the
AM method is nontrivially equivalent to GPower (Journée et al., JMLR
11:517–553, 2010) for all our formulations. Besides this, we provide 24
efficient parallel SPCA implementations: 3 codes (multi-core, GPU and cluster)
for each of the 8 problems. Parallelism in the methods is aimed at i) speeding
up computations (our GPU code can be 100 times faster than an efficient serial
code written in C++), ii) obtaining solutions explaining more variance and iii)
dealing with big data problems (our cluster code is able to solve a 357 GB
problem in about a minute).
Comment: 29 pages, 9 tables, 7 figures (the paper is accompanied by a release of the open-source code '24am').
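To make the AM scheme concrete, here is a minimal sketch of the alternating updates for one of the eight formulations (L2 variance with an L0 constraint): maximize ||X z||_2 subject to ||z||_0 <= s and ||z||_2 = 1, alternating a closed-form x-step with a hard-thresholded z-step. The function name, random initialization, and iteration cap are illustrative assumptions; the paper's actual implementations ship in the '24am' release.

```python
import numpy as np

def spca_am_l0(X, s, iters=200, seed=0):
    """Illustrative sketch (not the '24am' code): alternating maximization
    for one sparse loading vector, maximizing ||X z||_2 subject to
    ||z||_0 <= s and ||z||_2 = 1."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(X.shape[1])
    z /= np.linalg.norm(z)
    for _ in range(iters):
        # x-step: the maximizer of x^T X z over the unit ball is X z, normalized
        x = X @ z
        x /= np.linalg.norm(x)
        # z-step: keep the s largest-magnitude entries of X^T x, renormalize
        g = X.T @ x
        g[np.argsort(np.abs(g))[:-s]] = 0.0   # hard-threshold to s nonzeros
        z = g / np.linalg.norm(g)
    return z  # sparse loading vector with at most s nonzero entries
```

Both steps reduce to matrix-vector products plus a selection, which is the kind of kernel the multi-core, GPU, and cluster codes described above can parallelize.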
Iteration Complexity of Randomized Block-Coordinate Descent Methods for Minimizing a Composite Function
In this paper we develop a randomized block-coordinate descent method for
minimizing the sum of a smooth and a simple nonsmooth block-separable convex
function and prove that it obtains an $\epsilon$-accurate solution with
probability at least $1-\rho$ in at most $O\left(\frac{n}{\epsilon}\log\frac{1}{\rho}\right)$ iterations, where $n$ is the number of blocks. For strongly
convex functions the method converges linearly. This extends recent results of
Nesterov [Efficiency of coordinate descent methods on huge-scale optimization
problems, CORE Discussion Paper #2010/2], which cover the smooth case, to
composite minimization, while at the same time improving the complexity by the
factor of 4 and removing $\epsilon$ from the logarithmic term. More
importantly, in contrast with the aforementioned work in which the author
achieves the results by applying the method to a regularized version of the
objective function with an unknown scaling factor, we show that this is not
necessary, thus achieving true iteration complexity bounds. In the smooth case
we also allow for arbitrary probability vectors and non-Euclidean norms.
Finally, we demonstrate numerically that the algorithm is able to solve
huge-scale $\ell_1$-regularized least squares and support vector machine
problems with a billion variables.
Comment: 33 pages, 7 figures, 10 tables.
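As an illustration of the coordinate updates such a method performs (here with $n$ blocks of size one), below is a minimal serial sketch for the $\ell_1$-regularized least-squares objective: each step picks a coordinate uniformly at random and applies a soft-thresholding (prox) step with step size 1/L_i, where L_i = ||A_i||^2 is the coordinate-wise Lipschitz constant. The function name and defaults are assumptions for illustration; this is not the paper's parallel billion-variable implementation.

```python
import numpy as np

def rcd_lasso(A, b, lam, epochs=50, seed=0):
    """Illustrative serial sketch: randomized coordinate descent for
    min_x 0.5*||A x - b||^2 + lam*||x||_1 (assumes no zero column in A)."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    L = (A ** 2).sum(axis=0)      # coordinate-wise Lipschitz constants ||A_i||^2
    x = np.zeros(n)
    r = -b.astype(float)          # residual A x - b, maintained incrementally
    for _ in range(epochs * n):
        i = rng.integers(n)       # pick a coordinate uniformly at random
        g = A[:, i] @ r           # partial gradient of the smooth part
        t = x[i] - g / L[i]       # unconstrained coordinate step
        new = np.sign(t) * max(abs(t) - lam / L[i], 0.0)  # soft-threshold (prox)
        r += A[:, i] * (new - x[i])   # keep the residual consistent
        x[i] = new
    return x
```

Because each step touches a single column of A and the residual, the per-iteration cost is O(m), which is what makes the huge-scale experiments in the abstract feasible.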
- …