Faster Rates for the Frank-Wolfe Method over Strongly-Convex Sets
The Frank-Wolfe method (a.k.a. conditional gradient algorithm) for smooth
optimization has regained much interest in recent years in the context of large
scale optimization and machine learning. A key advantage of the method is that
it avoids projections - the computational bottleneck in many applications -
replacing them with a linear optimization step. Despite this advantage, the known
convergence rates of the FW method fall behind those of standard first-order methods for
most settings of interest. It is an active line of research to derive faster
linear optimization-based algorithms for various settings of convex
optimization.
In this paper we consider the special case of optimization over strongly
convex sets, for which we prove that the vanilla FW method converges at a rate
of O(1/t^2). This gives a quadratic improvement in convergence rate
compared to the general case, in which convergence is of the order
O(1/t), which is known to be tight. We show that various balls induced by
ℓ_p norms, Schatten norms and group norms are strongly convex on one hand,
while on the other hand, linear optimization over these sets is straightforward
and admits a closed-form solution. We further show how several previous
fast-rate results for the FW method follow easily from our analysis.
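
To make the projection-free step concrete, below is a minimal sketch of the vanilla Frank-Wolfe iteration over a Euclidean ball (one of the strongly convex sets the paper covers). The least-squares objective, radius, and dimensions are illustrative assumptions, not details from the abstract.

```python
import numpy as np

def lmo_l2_ball(grad, r):
    # Linear optimization step: argmin over {s : ||s||_2 <= r} of <grad, s>.
    # For a Euclidean ball this has the closed form -r * grad / ||grad||_2.
    return -r * grad / np.linalg.norm(grad)

def frank_wolfe(grad_f, lmo, x0, n_iters=200):
    x = x0
    for t in range(n_iters):
        g = grad_f(x)
        s = lmo(g)                       # linear step replaces a projection
        gamma = 2.0 / (t + 2)            # standard FW step-size schedule
        x = (1 - gamma) * x + gamma * s  # convex combination stays feasible
    return x

# Illustrative least-squares instance f(x) = 0.5 * ||A x - b||^2.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((30, 10)), rng.standard_normal(30)
x_hat = frank_wolfe(lambda x: A.T @ (A @ x - b),
                    lambda g: lmo_l2_ball(g, r=1.0),
                    np.zeros(10))
```

The only set-dependent piece is the linear minimization oracle; swapping in an oracle for a different strongly convex ball leaves the loop unchanged.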
Linear Convergence of a Frank-Wolfe Type Algorithm over Trace-Norm Balls
We propose a rank-k variant of the classical Frank-Wolfe algorithm to solve
convex optimization over a trace-norm ball. Our algorithm replaces the top
singular-vector computation (1-SVD) in Frank-Wolfe with a top-k
singular-vector computation (k-SVD), which can be done by repeatedly applying
1-SVD k times. Alternatively, our algorithm can be viewed as a rank-k
restricted version of projected gradient descent. We show that our algorithm
has a linear convergence rate when the objective function is smooth and
strongly convex, and the optimal solution has rank at most k. This improves
the convergence rate and the total time complexity of the Frank-Wolfe method
and its variants. Comment: In NIPS 2017
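
As a hedged sketch, one way a rank-k Frank-Wolfe-type step over the trace-norm ball {X : ||X||_* <= r} can look is below; a top-k SVD replaces the 1-SVD of classical FW. The way the trace-norm budget is split across the k directions is one plausible choice for illustration, not necessarily the paper's exact update rule.

```python
import numpy as np
from scipy.sparse.linalg import svds

def rank_k_fw_step(X, grad, r, k, eta):
    # Top-k singular triplets of the negated gradient (the k-SVD step);
    # classical Frank-Wolfe would use only the top singular pair (1-SVD).
    U, S, Vt = svds(-grad, k=k)
    # Split the trace-norm budget r across the k directions in proportion
    # to the singular values (an illustrative choice, not the paper's rule).
    weights = r * S / S.sum()
    V = U @ np.diag(weights) @ Vt    # rank-k atom with ||V||_* = r
    return (1 - eta) * X + eta * V   # convex combination stays in the ball
```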
Riemannian Optimization via Frank-Wolfe Methods
We study projection-free methods for constrained Riemannian optimization. In
particular, we propose the Riemannian Frank-Wolfe (RFW) method. We analyze
non-asymptotic convergence rates of RFW to an optimum for (geodesically) convex
problems, and to a critical point for nonconvex objectives. We also present a
practical setting under which RFW can attain a linear convergence rate. As a
concrete example, we specialize RFW to the manifold of positive definite
matrices and apply it to two tasks: (i) computing the matrix geometric mean
(Riemannian centroid); and (ii) computing the Bures-Wasserstein barycenter.
Both tasks involve geodesically convex interval constraints, for which we show
that the Riemannian "linear oracle" required by RFW admits a closed-form
solution; this result may be of independent interest. We further specialize RFW
to the special orthogonal group and show that here too, the Riemannian "linear
oracle" can be solved in closed form. Here, we describe an application to the
synchronization of data matrices (Procrustes problem). We complement our
theoretical results with an empirical comparison of RFW against
state-of-the-art Riemannian optimization methods and observe that RFW performs
competitively on the task of computing Riemannian centroids. Comment: Under review. Largely revised version, including an extended
experimental section and an application to the special orthogonal group and
the Procrustes problem
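
For orientation, a hedged skeleton of the RFW loop is below. The three helpers are hypothetical placeholders: the abstract states the Riemannian "linear oracle" admits closed forms on the positive definite manifold and the special orthogonal group, but those expressions are not reproduced here.

```python
# Hedged skeleton of Riemannian Frank-Wolfe (RFW); riemannian_grad,
# linear_oracle, and geodesic are hypothetical placeholders for the
# manifold-specific operations.
def rfw(x0, riemannian_grad, linear_oracle, geodesic, n_iters=100):
    x = x0
    for t in range(n_iters):
        g = riemannian_grad(x)     # gradient in the tangent space at x
        z = linear_oracle(x, g)    # "linear" subproblem over the feasible set
        gamma = 2.0 / (t + 2)      # FW-style diminishing step size
        x = geodesic(x, z, gamma)  # move along the geodesic from x toward z
    return x
```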
Frank-Wolfe Algorithms for Saddle Point Problems
We extend the Frank-Wolfe (FW) optimization algorithm to solve constrained
smooth convex-concave saddle point (SP) problems. Remarkably, the method only
requires access to linear minimization oracles. Leveraging recent advances in
FW optimization, we provide the first proof of convergence of a FW-type saddle
point solver over polytopes, thereby partially answering a 30-year-old
conjecture. We also survey other convergence results and highlight gaps in the
theoretical underpinnings of FW-style algorithms. Motivating applications
without known efficient alternatives are explored through structured prediction
with combinatorial penalties as well as games over matching polytopes involving
an exponential number of constraints. Comment: Appears in: Proceedings of the 20th International Conference on
Artificial Intelligence and Statistics (AISTATS 2017). 39 pages
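
To illustrate the oracle-only access pattern, here is a hedged sketch of a simultaneous FW-style update for min_x max_y L(x, y); the step sizes, averaging, and precise update order in the paper's analyzed variants may differ.

```python
def saddle_fw(grad_x, grad_y, lmo_x, lmo_y, x, y, n_iters=500):
    # Hedged sketch: both players use only linear minimization oracles.
    for t in range(n_iters):
        gamma = 2.0 / (t + 2)
        s = lmo_x(grad_x(x, y))   # descent atom for the min player
        r = lmo_y(-grad_y(x, y))  # ascent atom for the max player
        x = (1 - gamma) * x + gamma * s
        y = (1 - gamma) * y + gamma * r
    return x, y
```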
Stochastic Frank-Wolfe Methods for Nonconvex Optimization
We study Frank-Wolfe methods for nonconvex stochastic and finite-sum
optimization problems. Frank-Wolfe methods (in the convex case) have gained
tremendous recent interest in machine learning and optimization communities due
to their projection-free property and their ability to exploit structured
constraints. However, our understanding of these algorithms in the nonconvex
setting is fairly limited. In this paper, we propose nonconvex stochastic
Frank-Wolfe methods and analyze their convergence properties. For objective
functions that decompose into a finite-sum, we leverage ideas from variance
reduction techniques for convex optimization to obtain new variance reduced
nonconvex Frank-Wolfe methods that have provably faster convergence than the
classical Frank-Wolfe method. Finally, we show that the faster convergence
rates of our variance reduced methods also translate into improved convergence
rates for the stochastic setting.
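
A hedged sketch of the basic stochastic FW step is below: the exact gradient is replaced by a minibatch estimate while the linear minimization oracle is unchanged. The batch size and step-size schedule are illustrative; the paper's methods, including the variance-reduced variants, use carefully chosen schedules.

```python
import numpy as np

def stochastic_fw(stoch_grad, lmo, x0, n_samples, n_iters=1000, batch=64, seed=0):
    # stoch_grad(x, idx) is assumed to return the gradient averaged over
    # the minibatch of sample indices idx.
    rng = np.random.default_rng(seed)
    x = x0
    for t in range(n_iters):
        idx = rng.integers(0, n_samples, size=batch)  # sample a minibatch
        g = stoch_grad(x, idx)          # minibatch gradient estimate
        s = lmo(g)                      # projection-free linear step
        gamma = 1.0 / np.sqrt(t + 1.0)  # illustrative diminishing step size
        x = (1 - gamma) * x + gamma * s
    return x
```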