Distributed Big-Data Optimization via Block Communications
We study distributed multi-agent large-scale optimization problems, wherein
the cost function is composed of a smooth possibly nonconvex sum-utility plus a
DC (Difference-of-Convex) regularizer. We consider the scenario where the
dimension of the optimization variables is so large that optimizing and/or
transmitting the entire set of variables could cause unaffordable computation
and communication overhead. To address this issue, we propose the first
distributed algorithm whereby agents optimize and communicate only a portion of
their local variables. The scheme hinges on successive convex approximation
(SCA) to handle the nonconvexity of the objective function, coupled with a
novel block-signal tracking scheme, aiming at locally estimating the average of
the agents' gradients. Asymptotic convergence to stationary solutions of the
nonconvex problem is established. Numerical results on a sparse regression
problem show the effectiveness of the proposed algorithm and the impact of the
block size on its practical convergence speed and communication cost.
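To make the block-communication idea concrete, here is a minimal sketch, not the paper's exact algorithm: an ℓ1 term stands in for the DC regularizer, and the names (`block_sca_tracking`, `grads`, `W`) are illustrative. Each agent updates and exchanges only one block of its local variable per round, with a gradient-tracking step restricted to that block:

```python
# Simplified sketch of block-wise SCA with per-block gradient tracking.
# grads: list of per-agent gradient oracles g_i(x) -> R^d (smooth losses);
# W: doubly stochastic mixing matrix of the network (n_agents x n_agents).
import numpy as np

def block_sca_tracking(grads, W, d, n_blocks, n_iters=500, step=0.1, lam=0.01):
    n = len(grads)
    X = np.zeros((n, d))                                   # local variable copies
    Y = np.stack([g(X[i]) for i, g in enumerate(grads)])   # gradient trackers
    blocks = np.array_split(np.arange(d), n_blocks)
    for t in range(n_iters):
        b = blocks[t % n_blocks]                           # block chosen this round
        G_old = np.stack([g(X[i])[b] for i, g in enumerate(grads)])
        # SCA step on block b: gradient step + soft-thresholding
        # (a simple convex surrogate; the paper allows more general ones)
        V = X[:, b] - step * Y[:, b]
        X[:, b] = np.sign(V) * np.maximum(np.abs(V) - step * lam, 0.0)
        # communicate only block b: consensus on the variables ...
        X[:, b] = W @ X[:, b]
        # ... and tracking update for the same block
        G_new = np.stack([g(X[i])[b] for i, g in enumerate(grads)])
        Y[:, b] = W @ Y[:, b] + (G_new - G_old)
    return X
```

Per iteration, each agent transmits only the selected block of its variable and tracker, which is the source of the communication savings the abstract describes.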
Optimization with Sparsity-Inducing Penalties
Sparse estimation methods are aimed at using or obtaining parsimonious
representations of data or models. They were first dedicated to linear variable
selection but numerous extensions have now emerged such as structured sparsity
or kernel selection. It turns out that many of the related estimation problems
can be cast as convex optimization problems by regularizing the empirical risk
with appropriate non-smooth norms. The goal of this paper is to present from a
general perspective optimization tools and techniques dedicated to such
sparsity-inducing penalties. We cover proximal methods, block-coordinate
descent, reweighted-ℓ2-penalized techniques, working-set and homotopy
methods, as well as non-convex formulations and extensions, and provide an
extensive set of experiments to compare various algorithms from a computational
point of view.
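As a concrete example of the proximal methods the survey covers, the following sketch, with illustrative names, implements the proximal operators of the ℓ1 norm (soft-thresholding) and of a group-ℓ2 penalty used in structured sparsity:

```python
# Proximal operators for two sparsity-inducing penalties.
import numpy as np

def prox_l1(v, tau):
    """prox of tau*||.||_1: componentwise soft-thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def prox_group_l2(v, tau, groups):
    """prox of tau * sum_g ||v_g||_2: blockwise shrinkage.
    groups: list of index arrays partitioning the coordinates."""
    out = v.copy()
    for g in groups:
        norm = np.linalg.norm(v[g])
        out[g] = 0.0 if norm <= tau else (1.0 - tau / norm) * v[g]
    return out

# One proximal-gradient (ISTA) step for min_w 0.5*||Xw - y||^2 + lam*||w||_1,
# with step size 1/L where L bounds the Lipschitz constant of the gradient:
#     w = prox_l1(w - (X.T @ (X @ w - y)) / L, lam / L)
```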
Simplifying Momentum-based Positive-definite Submanifold Optimization with Applications to Deep Learning
Riemannian submanifold optimization with momentum is computationally
challenging because, to ensure that the iterates remain on the submanifold, we
often need to solve difficult differential equations. Here, we simplify such
difficulties for a class of structured symmetric positive-definite matrices
with the affine-invariant metric. We do so by proposing a generalized version
of the Riemannian normal coordinates that dynamically orthonormalizes the
metric and locally converts the problem into an unconstrained problem in the
Euclidean space. We use our approach to simplify existing approaches for
structured covariances and develop matrix-inverse-free 2nd-order optimizers for deep learning with low precision by using only matrix multiplications. Code: https://github.com/yorkerlin/StructuredNGD-DL
Comment: An updated version of the ICML 2023 paper. Updated the main text and added more numerical results for DNNs, including a new baseline method and improvements to an existing baseline method.
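The "only matrix multiplications" claim can be illustrated with a toy sketch that conveys the flavor rather than the paper's actual optimizer (see the linked repository for that): maintain an SPD matrix S = A Aᵀ through its factor A, and move it along a local coordinate direction M with a truncated matrix exponential, so no inverses or decompositions appear:

```python
# Toy sketch: inverse-free update of an SPD matrix via its factor.
import numpy as np

def truncated_expm(M, order=2):
    """I + M + M^2/2 + ...: matrix-multiplication-only approximation of exp(M)."""
    out = np.eye(M.shape[0], dtype=M.dtype)
    term = np.eye(M.shape[0], dtype=M.dtype)
    for k in range(1, order + 1):
        term = term @ M / k
        out = out + term
    return out

def update_factor(A, M, lr=0.1):
    """Move S = A A^T along a symmetric direction M in local coordinates,
    staying (approximately) on the SPD manifold without matrix inverses."""
    return A @ truncated_expm(-0.5 * lr * M)
```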
Optimization of a Geman-McClure like criterion for sparse signal deconvolution
This paper deals with the problem of recovering a sparse unknown signal from a set of observations. The latter are obtained by convolution of the original signal and corruption with additive noise. We tackle the problem by minimizing a least-squares fit criterion penalized by a Geman-McClure like potential. The resulting criterion is a rational function, which makes it possible to formulate its minimization as a generalized problem of moments, for which a hierarchy of semidefinite programming relaxations can be proposed. These convex relaxations yield a monotone sequence of values which converges to the global optimum. To overcome the computational limitations due to the large number of involved variables, a stochastic block-coordinate descent method is proposed. The algorithm has been implemented and shows promising results.
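The moment/SDP hierarchy does not fit in a short snippet, but the stochastic block-coordinate descent component can be sketched as follows for a criterion of the form least-squares fit plus a Geman-McClure-like penalty φ(u) = u²/(δ + u²). This is a local-descent illustration with illustrative names, not the paper's certified global method:

```python
# Stochastic block-coordinate gradient descent for penalized deconvolution:
#     min_x 0.5*||h * x - y||^2 + lam * sum_i phi(x_i),
# where * is convolution and phi(u) = u^2 / (delta + u^2).
import numpy as np

def phi_grad(u, delta):
    """Derivative of the Geman-McClure-like potential u^2/(delta + u^2)."""
    return 2.0 * delta * u / (delta + u**2) ** 2

def stoch_bcd_deconv(y, h, d, lam=0.1, delta=1.0, step=0.01,
                     n_iters=2000, block_size=16, seed=0):
    """Assumes len(y) == d + len(h) - 1 ('full' convolution observations)."""
    rng = np.random.default_rng(seed)
    x = np.zeros(d)
    for _ in range(n_iters):
        b = rng.choice(d, size=block_size, replace=False)   # random block
        r = np.convolve(x, h, mode="full") - y              # residual Hx - y
        g_fit = np.correlate(r, h, mode="valid")            # adjoint: H^T r
        x[b] -= step * (g_fit[b] + lam * phi_grad(x[b], delta))
    return x
```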
Joint covariate selection and joint subspace selection for multiple classification problems
We address the problem of recovering a common set of covariates that are relevant simultaneously to several classification problems. By penalizing the sum of ℓ2-norms of the blocks of coefficients associated with each covariate across different classification problems, similar sparsity patterns in all models are encouraged. To take computational advantage of the sparsity of solutions at high regularization levels, we propose a blockwise path-following scheme that approximately traces the regularization path. As the regularization coefficient decreases, the algorithm maintains and updates concurrently a growing set of covariates that are simultaneously active for all problems. We also show how to use random projections to extend this approach to the problem of joint subspace selection, where multiple predictors are found in a common low-dimensional subspace. We present theoretical results showing that this random projection approach converges to the solution yielded by trace-norm regularization. Finally, we present a variety of experimental results exploring joint covariate selection and joint subspace selection, comparing the path-following approach to competing algorithms in terms of prediction accuracy and running time.
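A minimal sketch of the shared-sparsity mechanism, with a squared loss standing in for the classification losses and illustrative names: penalizing the sum of ℓ2 norms of the rows of the coefficient matrix keeps or drops each covariate jointly across all tasks via row-wise shrinkage:

```python
# One proximal-gradient step for multi-task learning with a row-wise
# l1/l2 penalty: min_W sum_k 0.5*||Xs[k] @ W[:,k] - ys[k]||^2
#                      + lam * sum_j ||W[j,:]||_2
import numpy as np

def prox_row_l2(W, tau):
    """Row-wise shrinkage: zeroes a covariate's row across all tasks at once."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return scale * W

def multitask_step(W, Xs, ys, lam, step):
    """Xs[k], ys[k]: data for task k; W has shape (covariates, tasks)."""
    G = np.stack([Xs[k].T @ (Xs[k] @ W[:, k] - ys[k])
                  for k in range(W.shape[1])], axis=1)
    return prox_row_l2(W - step * G, step * lam)
```

Rows of W that survive the shrinkage correspond to the growing active set of covariates that the path-following scheme maintains as the regularization coefficient decreases.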