64,216 research outputs found
Flexible Parallel Algorithms for Big Data Optimization
We propose a decomposition framework for the parallel optimization of the sum
of a differentiable function and a (block) separable nonsmooth, convex one. The
latter term is typically used to enforce structure in the solution as, for
example, in Lasso problems. Our framework is very flexible and includes both
fully parallel Jacobi schemes and Gauss-Seidel (Southwell-type) ones, as well
as virtually all possibilities in between (e.g., gradient- or Newton-type
methods) with only a subset of variables updated at each iteration. Our
theoretical convergence results improve on existing ones, and numerical results
show that the new method compares favorably to existing algorithms.Comment: submitted to IEEE ICASSP 201
A Bootstrap Metropolis-Hastings algorithm for Bayesian Analysis of Big Data
Markov chain Monte Carlo (MCMC) methods have proven to be a very powerful tool for analyzing data of complex structures. However, their compute-intensive nature, which typically require a large number of iterations and a complete scan of the full dataset for each iteration, precludes their use for big data analysis. In this thesis, we propose the so-called bootstrap Metropolis-Hastings (BMH) algorithm, which provides a general framework for how to tame powerful MCMC methods to be used for big data analysis; that is to replace the full data log-likelihood by a Monte Carlo average of the log-likelihoods that are calculated in parallel from multiple bootstrap samples. The BMH algorithm possesses an embarrassingly parallel structure and avoids repeated scans of the full dataset in iterations, and is thus feasible for big data problems. Compared to the popular divide-and-conquer method, BMH can be generally more efficient as it can asymptotically integrate the whole data information into a single simulation run. The BMH algorithm is very flexible. Like the Metropolis-Hastings algorithm, it can serve as a basic building block for developing advanced MCMC algorithms that are feasible for big data problems. BMH can also be used for model selection and optimization by combining with reversible jump MCMC and simulated annealing, respectively
A Bootstrap Metropolis-Hastings algorithm for Bayesian Analysis of Big Data
Markov chain Monte Carlo (MCMC) methods have proven to be a very powerful tool for analyzing data of complex structures. However, their compute-intensive nature, which typically require a large number of iterations and a complete scan of the full dataset for each iteration, precludes their use for big data analysis. In this thesis, we propose the so-called bootstrap Metropolis-Hastings (BMH) algorithm, which provides a general framework for how to tame powerful MCMC methods to be used for big data analysis; that is to replace the full data log-likelihood by a Monte Carlo average of the log-likelihoods that are calculated in parallel from multiple bootstrap samples. The BMH algorithm possesses an embarrassingly parallel structure and avoids repeated scans of the full dataset in iterations, and is thus feasible for big data problems. Compared to the popular divide-and-conquer method, BMH can be generally more efficient as it can asymptotically integrate the whole data information into a single simulation run. The BMH algorithm is very flexible. Like the Metropolis-Hastings algorithm, it can serve as a basic building block for developing advanced MCMC algorithms that are feasible for big data problems. BMH can also be used for model selection and optimization by combining with reversible jump MCMC and simulated annealing, respectively
Hybrid Random/Deterministic Parallel Algorithms for Nonconvex Big Data Optimization
We propose a decomposition framework for the parallel optimization of the sum
of a differentiable {(possibly nonconvex)} function and a nonsmooth (possibly
nonseparable), convex one. The latter term is usually employed to enforce
structure in the solution, typically sparsity. The main contribution of this
work is a novel \emph{parallel, hybrid random/deterministic} decomposition
scheme wherein, at each iteration, a subset of (block) variables is updated at
the same time by minimizing local convex approximations of the original
nonconvex function. To tackle with huge-scale problems, the (block) variables
to be updated are chosen according to a \emph{mixed random and deterministic}
procedure, which captures the advantages of both pure deterministic and random
update-based schemes. Almost sure convergence of the proposed scheme is
established. Numerical results show that on huge-scale problems the proposed
hybrid random/deterministic algorithm outperforms both random and deterministic
schemes.Comment: The order of the authors is alphabetica
Parallel Selective Algorithms for Big Data Optimization
We propose a decomposition framework for the parallel optimization of the sum
of a differentiable (possibly nonconvex) function and a (block) separable
nonsmooth, convex one. The latter term is usually employed to enforce structure
in the solution, typically sparsity. Our framework is very flexible and
includes both fully parallel Jacobi schemes and Gauss- Seidel (i.e.,
sequential) ones, as well as virtually all possibilities "in between" with only
a subset of variables updated at each iteration. Our theoretical convergence
results improve on existing ones, and numerical results on LASSO, logistic
regression, and some nonconvex quadratic problems show that the new method
consistently outperforms existing algorithms.Comment: This work is an extended version of the conference paper that has
been presented at IEEE ICASSP'14. The first and the second author contributed
equally to the paper. This revised version contains new numerical results on
non convex quadratic problem
- …