109 research outputs found
Alternating Randomized Block Coordinate Descent
Block-coordinate descent algorithms and alternating minimization methods are
fundamental optimization algorithms and an important primitive in large-scale
optimization and machine learning. While various block-coordinate-descent-type
methods have been studied extensively, only alternating minimization -- which
applies to the setting of only two blocks -- is known to have convergence time
that scales independently of the least smooth block. A natural question is
then: is the setting of two blocks special?
We show that the answer is "no" as long as the least smooth block can be
optimized exactly -- an assumption that is also needed in the setting of
alternating minimization. We do so by introducing a novel algorithm AR-BCD,
whose convergence time scales independently of the least smooth (possibly
non-smooth) block. The basic algorithm generalizes both alternating
minimization and randomized block coordinate (gradient) descent, and we also
provide its accelerated version -- AAR-BCD. As a special case of AAR-BCD, we
obtain the first nontrivial accelerated alternating minimization algorithm.
Comment: Version 1 appeared in Proc. ICML'18. v1 -> v2: added remarks about how
accelerated alternating minimization follows directly from the results that
appeared in ICML'18; no new technical results were needed for thi…
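The combination described in the abstract, randomized gradient steps on the smooth blocks alternated with exact minimization of the least smooth block, can be illustrated on a toy problem. The following is a minimal sketch under stated assumptions, not the AR-BCD algorithm as specified in the paper: the objective (a small quadratic plus an absolute-value term on the last coordinate), the uniform sampling rule, the single-coordinate blocks, and all function names are hypothetical choices made for illustration.

```python
import random


def soft_threshold(v, lam):
    """Closed-form minimizer of 0.5*(t - v)**2 + lam*|t| over t."""
    if v > lam:
        return v - lam
    if v < -lam:
        return v + lam
    return 0.0


def objective(Q, c, lam, x):
    """0.5*x'Qx - c'x + lam*|x[-1]|; the last coordinate is the non-smooth block."""
    n = len(x)
    quad = 0.5 * sum(x[i] * Q[i][j] * x[j] for i in range(n) for j in range(n))
    return quad - sum(c[i] * x[i] for i in range(n)) + lam * abs(x[-1])


def alternating_randomized_bcd(Q, c, lam, iters=500, seed=0):
    """Toy alternation: random gradient step on a smooth coordinate,
    then exact minimization over the last (non-smooth) coordinate.

    Assumptions for illustration only: Q is symmetric positive definite,
    each block is a single coordinate, and sampling is uniform.
    """
    rng = random.Random(seed)
    n = len(c)
    last = n - 1
    x = [0.0] * n
    for _ in range(iters):
        # Gradient step on one randomly chosen smooth coordinate,
        # with step size 1/L_i where L_i = Q[i][i].
        i = rng.randrange(last)
        grad_i = sum(Q[i][j] * x[j] for j in range(n)) - c[i]
        x[i] -= grad_i / Q[i][i]
        # Exact minimization over the last coordinate (soft-thresholding).
        r = sum(Q[last][j] * x[j] for j in range(last)) - c[last]
        x[last] = soft_threshold(-r, lam) / Q[last][last]
    return x


Q = [[2.0, 0.5, 0.3],
     [0.5, 1.0, 0.2],
     [0.3, 0.2, 50.0]]  # large Q[2][2]: the last coordinate is the least smooth
c = [1.0, 1.0, 1.0]
x = alternating_randomized_bcd(Q, c, lam=0.1)
print(x, objective(Q, c, 0.1, x))
```

In this sketch the step sizes depend only on the smooth coordinates' constants `Q[i][i]`; the large curvature `Q[2][2]` of the last coordinate never enters a step size because that coordinate is always minimized exactly.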
Alternating randomized block coordinate descent
Block-coordinate descent algorithms and alternating minimization methods are fundamental optimization algorithms and an important primitive in large-scale optimization and machine learning. While various block-coordinate-descent-type methods have been studied extensively, only alternating minimization -- which applies to the setting of only two blocks -- is known to have convergence time that scales independently of the least smooth block. A natural question is then: is the setting of two blocks special?
We show that the answer is "no" as long as the least smooth block can be optimized exactly -- an assumption that is also needed in the setting of alternating minimization. We do so by introducing a novel algorithm AR-BCD, whose convergence time scales independently of the least smooth (possibly non-smooth) block. The basic algorithm generalizes both alternating minimization and randomized block coordinate (gradient) descent, and we also provide its accelerated version -- AAR-BCD. As a special case of AAR-BCD, we obtain the first nontrivial accelerated alternating minimization algorithm.
Published version
On the Minimization of Convex Functionals of Probability Distributions Under Band Constraints
The problem of minimizing convex functionals of probability distributions is
solved under the assumption that the density of every distribution is bounded
from above and below. A system of sufficient and necessary first-order
optimality conditions as well as a bound on the optimality gap of feasible
candidate solutions are derived. Based on these results, two numerical
algorithms are proposed that iteratively solve the system of optimality
conditions on a grid of discrete points. Both algorithms use a block coordinate
descent strategy and terminate once the optimality gap falls below the desired
tolerance. While the first algorithm is conceptually simpler and more
efficient, it is not guaranteed to converge for objective functions that are
not strictly convex. This shortcoming is overcome in the second algorithm,
which uses an additional outer proximal iteration and which is proven to
converge under mild assumptions. Two examples are given to demonstrate the
theoretical usefulness of the optimality conditions as well as the high
efficiency and accuracy of the proposed numerical algorithms.
Comment: 13 pages, 5 figures, 2 tables, published in the IEEE Transactions on
Signal Processing. In previous versions, the example in Section VI.B
contained some mistakes and inaccuracies, which have been fixed in this
version