Walking in the Shadow: A New Perspective on Descent Directions for Constrained Minimization
Descent directions such as movement towards Frank-Wolfe vertices, away steps,
in-face away steps and pairwise directions have been an important design
consideration in conditional gradient descent (CGD) variants. In this work, we
attempt to demystify the impact of movement in these directions towards
attaining constrained minimizers. The best local direction of descent is the
directional derivative of the projection of the gradient, which we refer to as
the shadow of the gradient. We show that the continuous-time dynamics of
moving in the shadow are equivalent to those of projected gradient descent
(PGD), yet non-trivial to discretize. By projecting gradients in PGD, one not
only ensures
feasibility but is also able to "wrap" around the convex region. We show that
Frank-Wolfe (FW) vertices in fact recover the maximal wrap one can obtain by
projecting gradients, thus providing a new perspective on these steps. We also
show that the shadow steps give the best direction of descent emanating from
the convex hull of all possible away-steps. Viewing PGD movements in terms of
shadow steps gives linear convergence, dependent on the number of faces of the
polytope. We combine these insights into a novel Shadow-CG method that uses FW
steps (i.e., wrap around the polytope) and shadow steps (i.e., optimal local
descent direction), while enjoying linear convergence. Our analysis develops
properties of the curve formed by projecting a line on a polytope, which may be
of independent interest, while providing a unifying view of various descent
directions in the CGD literature.
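To make the shadow concrete: at a feasible point it is the directional derivative of the Euclidean projection operator, taken along the negative gradient, and it equals the projection of -grad f(x) onto the tangent cone of the polytope at x. Below is a minimal numerical sketch over the probability simplex, with an illustrative finite-difference step eps; the function names are ours, not the paper's.

```python
import numpy as np

def project_simplex(y):
    """Euclidean projection onto the probability simplex
    (standard sort-and-threshold algorithm)."""
    u = np.sort(y)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(y) + 1) > css - 1.0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(y - theta, 0.0)

def shadow_direction(x, grad, eps=1e-7):
    """Finite-difference approximation of the shadow of the gradient:
    the directional derivative of the projection at the feasible point x,
    taken along -grad, i.e. lim_{eps->0+} (P(x - eps*grad) - x) / eps."""
    return (project_simplex(x - eps * grad) - x) / eps
```

Discretizing is the delicate part: the limiting direction changes abruptly as the iterate crosses between faces, which is why the analysis turns to the structure of the curve obtained by projecting a whole line onto the polytope.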
Recursive Frank-Wolfe algorithms
In the last decade there has been a resurgence of interest in Frank-Wolfe
(FW) style methods for optimizing a smooth convex function over a polytope.
Examples of recently developed techniques include {\em Decomposition-invariant
Conditional Gradient} (DiCG), {\em Blended Conditional Gradient} (BCG), and {\em
Frank-Wolfe with in-face directions} (IF-FW) methods. We introduce two
extensions of these techniques. First, we augment DiCG with the {\em working
set} strategy, and show how to optimize over the working set using {\em shadow
simplex steps}. Second, we generalize in-face Frank-Wolfe directions to
polytopes in which faces cannot be efficiently computed, and also describe a
generic recursive procedure that can be used in conjunction with several
FW-style techniques. Experimental results indicate that these extensions are
capable of speeding up the original algorithms by orders of magnitude for
certain applications.
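For context, a plain FW iteration only ever calls a linear minimization oracle (LMO), and an in-face direction restricts that oracle to the minimal face containing the current iterate. Here is a sketch on the probability simplex, where the minimal face is just the positive support; the function names are ours, and the working-set and recursive machinery of the paper is not reproduced.

```python
import numpy as np

def fw_vertex(grad):
    """LMO for the probability simplex: a linear function is minimized
    at the coordinate vertex with the smallest gradient entry."""
    v = np.zeros_like(grad)
    v[np.argmin(grad)] = 1.0
    return v

def in_face_vertex(x, grad, tol=1e-12):
    """In-face direction endpoint: run the same LMO, but only over the
    minimal face of the simplex containing x (its positive support)."""
    idx = np.flatnonzero(x > tol)
    v = np.zeros_like(grad)
    v[idx[np.argmin(grad[idx])]] = 1.0
    return v
```

On general polytopes the minimal face has no such closed form, which is exactly the situation the generic recursive procedure above is designed for.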
Avoiding bad steps in Frank Wolfe variants
The analysis of Frank Wolfe (FW) variants is often complicated by the
presence of different kinds of "good" and "bad" steps. In this article we aim
to simplify the convergence analysis of some of these variants by getting rid
of such a distinction between steps, and to improve existing rates by ensuring
a sizable decrease of the objective at each iteration. In order to do this, we
define the Short Step Chain (SSC) procedure, which skips gradient computations
in consecutive short steps until proper stopping conditions are satisfied. This
technique allows us to give a unified analysis and convergence rates in the
general smooth non-convex setting, as well as a linear convergence rate under a
Kurdyka-Lojasiewicz (KL) property. While this setting has been widely studied
for proximal gradient type methods, to our knowledge, it has not been analyzed
before for the Frank Wolfe variants under study. An angle condition, ensuring
that the directions selected by the methods have the steepest slope possible up
to a constant, is used to carry out our analysis. We prove that this condition
is satisfied on polytopes by the away-step Frank-Wolfe (AFW), the pairwise
Frank-Wolfe (PFW), and the Frank-Wolfe method with in-face directions (FDFW).
Comment: See arXiv:2008.09781 for an extended version of the paper.
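To fix ideas about the steps being chained, the sketch below shows the direction selection of a single away-step FW iteration on the probability simplex: both candidate directions are formed and the steeper one is kept, with the away step capped to preserve feasibility. This is a generic AFW sketch under our own naming, not the SSC procedure itself.

```python
import numpy as np

def afw_direction(x, grad, tol=1e-12):
    """One away-step Frank-Wolfe direction choice on the simplex.
    Returns the chosen direction and the largest feasible step size."""
    s = np.zeros_like(x)
    s[np.argmin(grad)] = 1.0                  # Frank-Wolfe vertex
    support = np.flatnonzero(x > tol)
    i = support[np.argmax(grad[support])]     # worst active vertex
    a = np.zeros_like(x)
    a[i] = 1.0
    d_fw, d_away = s - x, x - a
    if grad @ d_fw <= grad @ d_away:          # FW step is at least as steep
        return d_fw, 1.0
    # Away step: x + gamma*(x - a) stays nonnegative while
    # gamma <= x_i / (1 - x_i), since vertex a carries weight x_i.
    gamma_max = x[i] / (1.0 - x[i]) if x[i] < 1.0 else np.inf
    return d_away, gamma_max
```

A complete method pairs this with a line search over [0, gamma_max]; the angle condition above then guarantees that the selected direction's slope is within a constant factor of the steepest feasible one.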
Conditional Gradient Methods
The purpose of this survey is to serve both as a gentle introduction and a
coherent overview of state-of-the-art Frank--Wolfe algorithms, also called
conditional gradient algorithms, for function minimization. These algorithms
are especially useful in convex optimization when linear optimization is
cheaper than projections.
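A standard instance of this trade-off (not specific to the survey) is the nuclear-norm ball: the linear minimization oracle needs only the leading singular pair of the gradient, while Euclidean projection onto the same ball requires a full SVD. A sketch, with a function name of our choosing:

```python
import numpy as np
from scipy.sparse.linalg import svds

def nuclear_ball_lmo(G, radius=1.0):
    """LMO for {V : ||V||_* <= radius}: argmin_V <G, V> is the rank-one
    matrix -radius * u1 v1^T built from the top singular pair of G, so a
    single truncated SVD suffices (assumes min(G.shape) > 1 for svds)."""
    u, _, vt = svds(G, k=1)
    return -radius * np.outer(u[:, 0], vt[0])
```

Each FW iterate is then a convex combination of such rank-one atoms, preserving low-rank structure that projection-based updates would generally destroy.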
The selection of the material has been guided by the principle of
highlighting crucial ideas as well as presenting new approaches that we believe
might become important in the future, with ample citations even of old works
imperative in the development of newer methods. Yet our selection is sometimes
biased and need not reflect the consensus of the research community, and we
have certainly missed recent important contributions. After all, the research
area of Frank--Wolfe methods is very active, making it a moving target. We
apologize sincerely
in advance for any such distortions, and we fully acknowledge: we stand on the
shoulders of giants.
Comment: 238 pages with many figures. The FrankWolfe.jl Julia package
(https://github.com/ZIB-IOL/FrankWolfe.jl) provides state-of-the-art
implementations of many Frank--Wolfe methods.