Bregman Alternating Direction Method of Multipliers
The mirror descent algorithm (MDA) generalizes gradient descent by using a
Bregman divergence to replace squared Euclidean distance. In this paper, we
similarly generalize the alternating direction method of multipliers (ADMM) to
Bregman ADMM (BADMM), which allows the choice of different Bregman divergences
to exploit the structure of problems. BADMM provides a unified framework for
ADMM and its variants, including generalized ADMM, inexact ADMM and Bethe ADMM.
We establish the global convergence and the $O(1/T)$ iteration complexity for
BADMM. In some cases, BADMM can be faster than ADMM by a factor of
$O(n/\log(n))$. In solving the linear program of the mass transportation problem,
BADMM leads to massive parallelism and can easily run on a GPU. BADMM is several
times faster than the highly optimized commercial software Gurobi.
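To make the role of the Bregman divergence concrete, the following minimal Python sketch evaluates the generic definition $B_\phi(x, y) = \phi(x) - \phi(y) - \langle \nabla\phi(y), x - y \rangle$ for two choices of $\phi$. It is an illustration only, not the BADMM algorithm itself; the function names and the test vectors are assumptions introduced here.

```python
import numpy as np

def bregman_divergence(phi, grad_phi, x, y):
    """Generic Bregman divergence B_phi(x, y) for a differentiable convex phi."""
    return phi(x) - phi(y) - np.dot(grad_phi(y), x - y)

# Choice 1: half squared norm, which recovers half the squared Euclidean distance.
def half_sq_norm(x):
    return 0.5 * np.dot(x, x)

def half_sq_norm_grad(x):
    return x

# Choice 2: negative entropy, which recovers the KL divergence on the simplex.
def neg_entropy(x):
    return np.sum(x * np.log(x))

def neg_entropy_grad(x):
    return np.log(x) + 1.0

# Hypothetical probability vectors (strictly positive, summing to one).
x = np.array([0.2, 0.3, 0.5])
y = np.array([0.4, 0.4, 0.2])

d_euclid = bregman_divergence(half_sq_norm, half_sq_norm_grad, x, y)
d_kl = bregman_divergence(neg_entropy, neg_entropy_grad, x, y)
```

Swapping `phi` is the only change needed to move between the Euclidean and the KL geometry, which is the kind of structural choice the abstract refers to.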
Scalable Stochastic Alternating Direction Method of Multipliers
Stochastic alternating direction method of multipliers (ADMM), which visits
only one sample or a mini-batch of samples each time, has recently been proved
to achieve better performance than batch ADMM. However, most stochastic methods
can only achieve a convergence rate of $O(1/\sqrt{T})$ on general convex
problems, where $T$ is the number of iterations. Hence, these methods are not
scalable with respect to convergence rate (computation cost). There exists only
one stochastic method, called SA-ADMM, which can achieve a convergence rate of
$O(1/T)$ on general convex problems. However, extra memory is needed for
SA-ADMM to store the historic gradients on all samples, and thus it is not
scalable with respect to storage cost. In this paper, we propose a novel
method, called scalable stochastic ADMM (SCAS-ADMM), for large-scale
optimization and learning problems. Without the need to store the historic
gradients, SCAS-ADMM can achieve the same convergence rate as the best
stochastic method SA-ADMM and batch ADMM on general convex problems.
Experiments on graph-guided fused lasso show that SCAS-ADMM can achieve
state-of-the-art performance in real applications.
Adaptive Stochastic Alternating Direction Method of Multipliers
The Alternating Direction Method of Multipliers (ADMM) has been studied for
years. The traditional ADMM algorithm needs to compute, at each iteration, an
(empirical) expected loss function on all training examples, resulting in a
computational complexity proportional to the number of training examples. To
reduce the time complexity, stochastic ADMM algorithms were proposed to replace
the expected function with a random loss function associated with one uniformly
drawn example plus a Bregman divergence. The Bregman divergence, however, is
derived from a simple second order proximal function, the half squared norm,
which could be a suboptimal choice.
In this paper, we present a new family of stochastic ADMM algorithms with
optimal second order proximal functions, which produce a new family of adaptive
subgradient methods. We theoretically prove that their regret bounds are as
good as the bounds which could be achieved by the best proximal function that
can be chosen in hindsight. Encouraging empirical results on a variety of
real-world datasets confirm the effectiveness and efficiency of the proposed
algorithms.
Comment: 13 pages
Fast Stochastic Alternating Direction Method of Multipliers
In this paper, we propose a new stochastic alternating direction method of
multipliers (ADMM) algorithm, which incrementally approximates the full
gradient in the linearized ADMM formulation. Besides having a per-iteration
complexity as low as that of existing stochastic ADMM algorithms, the proposed
algorithm improves the convergence rate on convex problems from $O(1/\sqrt{T})$
to $O(1/T)$, where $T$ is the number of iterations. This matches the
convergence rate of the batch ADMM algorithm, but without the need to visit all
the samples in each iteration. Experiments on the graph-guided fused lasso
demonstrate that the new algorithm is significantly faster than
state-of-the-art stochastic and batch ADMM algorithms
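The idea of incrementally approximating the full gradient can be sketched as follows, assuming a SAG-style scheme in which the most recent gradient of every sample is stored and a running average is refreshed one sample at a time. The least-squares loss, the function name, and the random data are assumptions made for illustration; the surrounding ADMM updates are omitted.

```python
import numpy as np

def incremental_average_gradient(A, b, x, grad_table, avg_grad, i):
    """Refresh the stored gradient of sample i and update the running average.

    Assumed per-sample loss: 0.5 * (a_i^T x - b_i)^2. The cost is O(d) per call,
    independent of the number of samples.
    """
    n = A.shape[0]
    new_grad = (A[i] @ x - b[i]) * A[i]
    avg_grad = avg_grad + (new_grad - grad_table[i]) / n
    grad_table[i] = new_grad  # overwrite the historic gradient in place
    return avg_grad

# Hypothetical data: 5 samples, 3 features.
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 3))
b = rng.normal(size=5)
x = rng.normal(size=3)

grad_table = np.zeros((5, 3))
avg_grad = np.zeros(3)
for i in range(5):
    avg_grad = incremental_average_gradient(A, b, x, grad_table, avg_grad, i)

# With x held fixed, one pass over all samples reproduces the full gradient.
full_grad = A.T @ (A @ x - b) / 5
```

The gradient table is what makes the storage cost grow with the number of samples, which is the trade-off against the per-iteration cost.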
An inertial alternating direction method of multipliers
In the context of convex optimization problems in Hilbert spaces, we induce
inertial effects into the classical ADMM numerical scheme and obtain in this
way so-called inertial ADMM algorithms, the convergence properties of which we
investigate in detail. To this aim we make use of the inertial version of the
Douglas-Rachford splitting method for monotone inclusion problems recently
introduced in [12], in the context of concomitantly solving a convex
minimization problem and its Fenchel dual. The convergence of both sequences of
the generated iterates and of the objective function values is addressed. We
also show how the obtained results can be extended to the treatment of convex
minimization problems having as objective a finite sum of convex functions
An Accelerated Linearized Alternating Direction Method of Multipliers
We present a novel framework, namely AADMM, for acceleration of linearized
alternating direction method of multipliers (ADMM). The basic idea of AADMM is
to incorporate a multi-step acceleration scheme into linearized ADMM. We
demonstrate that for solving a class of convex composite optimization with
linear constraints, the rate of convergence of AADMM is better than that of
linearized ADMM, in terms of their dependence on the Lipschitz constant of the
smooth component. Moreover, AADMM is capable of dealing with the situation in which
the feasible region is unbounded, as long as the corresponding saddle point
problem has a solution. A backtracking algorithm is also proposed for practical
performance
Distributed Event Localization via Alternating Direction Method of Multipliers
This paper addresses the problem of distributed event localization using
noisy range measurements with respect to sensors with known positions. Event
localization is fundamental in many wireless sensor network applications such
as homeland security, law enforcement, and environmental studies. However, most
existing distributed algorithms require the target event to be within the
convex hull of the deployed sensors. Based on the alternating direction method
of multipliers (ADMM), we propose two scalable distributed algorithms named
GS-ADMM and J-ADMM which do not require the target event to be within the
convex hull of the deployed sensors. More specifically, the two algorithms can
be implemented in a scenario in which the entire sensor network is divided into
several clusters with cluster heads collecting measurements within each cluster
and exchanging intermediate computation information to achieve localization
consistency (consensus) across all clusters. This scenario is important in many
applications such as homeland security and law enforcement. Simulation results
confirm the effectiveness of the proposed algorithms.
Comment: accepted to IEEE Transactions on Mobile Computing
Alternating Direction Method of Multipliers for Linear Inverse Problems
In this paper we propose an iterative method using an alternating direction
method of multipliers (ADMM) strategy to solve linear inverse problems in
Hilbert spaces with a general convex penalty term. When the data is given
exactly, we give a convergence analysis of our ADMM algorithm without assuming
the existence of a Lagrange multiplier. In case the data contains noise, we show
that our method is a regularization method as long as it is terminated by a
suitable stopping rule. Various numerical simulations are performed to test the
efficiency of the method
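As a finite-dimensional illustration of ADMM applied to a linear inverse problem with a convex penalty, the sketch below solves the lasso problem, minimize 0.5*||Ax - b||^2 + lam*||z||_1 subject to x = z, with the standard scaled-form ADMM updates. This is a textbook variant chosen for illustration, not the Hilbert-space method of the abstract; the function names, the penalty, and the default parameters are assumptions.

```python
import numpy as np

def soft_threshold(v, kappa):
    """Proximal operator of kappa * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

def admm_lasso(A, b, lam, rho=1.0, iters=500):
    """Scaled-form ADMM for 0.5*||Ax - b||^2 + lam*||z||_1 subject to x = z."""
    n = A.shape[1]
    x = np.zeros(n)
    z = np.zeros(n)
    u = np.zeros(n)  # scaled dual variable
    lhs = A.T @ A + rho * np.eye(n)  # fixed system matrix for the x-update
    Atb = A.T @ b
    for _ in range(iters):
        x = np.linalg.solve(lhs, Atb + rho * (z - u))  # quadratic subproblem
        z = soft_threshold(x + u, lam / rho)           # penalty subproblem
        u = u + x - z                                  # dual update
    return z

# Hypothetical usage: with A equal to the identity, the lasso solution is the
# soft-thresholding of b, which gives a simple sanity check.
z_hat = admm_lasso(np.eye(3), np.array([3.0, -0.5, 1.5]), lam=1.0)
```

In a noisy setting the iteration count plays the role of a regularization parameter, so `iters` would be chosen by a stopping rule rather than fixed in advance.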
Self Equivalence of the Alternating Direction Method of Multipliers
The alternating direction method of multipliers (ADM or ADMM) breaks a
complex optimization problem into much simpler subproblems. The ADM algorithms
are typically short and easy to implement yet exhibit (nearly) state-of-the-art
performance for large-scale optimization problems.
To apply ADM, we first formulate a given problem into the "ADM-ready" form,
so the final algorithm depends on the formulation. A problem like
$\text{minimize}_{\mathbf{x}}\; u(\mathbf{x}) + v(\mathbf{C}\mathbf{x})$ has six
different "ADM-ready" formulations. They can be in the primal or dual forms,
and they differ by how dummy variables are introduced. To each "ADM-ready"
formulation, ADM can be applied in two different orders depending on how the
primal variables are updated. Finally, we get twelve different ADM algorithms!
How do they compare to each other? Which algorithm should one choose?
In this paper, we show that many of the different ways of applying ADM are
equivalent. Specifically, we show that ADM applied to a primal formulation is
equivalent to ADM applied to its Lagrange dual; ADM is equivalent to a
primal-dual algorithm applied to the saddle-point formulation of the same
problem. These results are surprising since the primal and dual variables in
ADM are seemingly treated very differently, and some previous works exhibit
preferences for one over the other on specific problems. In addition, when one
of the two objective functions is quadratic, possibly subject to an affine
constraint, we show that swapping the update order of the two primal variables
in ADM gives the same algorithm. These results identify the few truly different
ADM algorithms for a problem, which generally have different forms of
subproblems from which it is easy to pick one with the most computationally
friendly subproblems.
Comment: 29 pages
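For orientation, one primal "ADM-ready" formulation of the problem above can be written out as follows. The dummy variable $\mathbf{y}$, the multiplier $\mathbf{z}$, and the penalty parameter $\rho > 0$ are notation assumed here for illustration; this is the standard textbook form, shown as a sketch rather than as the paper's exact presentation.

```latex
\begin{aligned}
&\text{minimize}_{\mathbf{x},\mathbf{y}} \quad u(\mathbf{x}) + v(\mathbf{y})
  \quad \text{subject to} \quad \mathbf{C}\mathbf{x} - \mathbf{y} = \mathbf{0}, \\[4pt]
&L_\rho(\mathbf{x},\mathbf{y},\mathbf{z})
  = u(\mathbf{x}) + v(\mathbf{y})
  + \mathbf{z}^{\top}(\mathbf{C}\mathbf{x} - \mathbf{y})
  + \tfrac{\rho}{2}\,\|\mathbf{C}\mathbf{x} - \mathbf{y}\|_2^2, \\[4pt]
&\mathbf{x}^{k+1} = \arg\min_{\mathbf{x}} L_\rho(\mathbf{x}, \mathbf{y}^{k}, \mathbf{z}^{k}), \\
&\mathbf{y}^{k+1} = \arg\min_{\mathbf{y}} L_\rho(\mathbf{x}^{k+1}, \mathbf{y}, \mathbf{z}^{k}), \\
&\mathbf{z}^{k+1} = \mathbf{z}^{k} + \rho\,(\mathbf{C}\mathbf{x}^{k+1} - \mathbf{y}^{k+1}).
\end{aligned}
```

Updating $\mathbf{y}$ before $\mathbf{x}$ gives the second update order mentioned in the abstract for the same formulation.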
DQM: Decentralized Quadratically Approximated Alternating Direction Method of Multipliers
This paper considers decentralized consensus optimization problems where
nodes of a network have access to different summands of a global objective
function. Nodes cooperate to minimize the global objective by exchanging
information with neighbors only. A decentralized version of the alternating
direction method of multipliers (DADMM) is a common method for solving this
category of problems. DADMM exhibits a linear convergence rate to the optimal
objective but its implementation requires solving a convex optimization problem
at each iteration. This can be computationally costly and may result in large
overall convergence times. The decentralized quadratically approximated ADMM
algorithm (DQM), which minimizes a quadratic approximation of the objective
function that DADMM minimizes at each iteration, is proposed here. The
consequent reduction in computational time is shown to have minimal effect on
convergence properties. Convergence still proceeds at a linear rate with a
guaranteed constant that is asymptotically equivalent to the DADMM linear
convergence rate constant. Numerical results demonstrate advantages of DQM
relative to DADMM and other alternatives in a logistic regression problem.
Comment: 13 pages