Near-Optimal Methods for Minimizing Star-Convex Functions and Beyond
In this paper, we provide near-optimal accelerated first-order methods for
minimizing a broad class of smooth nonconvex functions that are strictly
unimodal on all lines through a minimizer. This function class, which we call
the class of smooth quasar-convex functions, is parameterized by a constant
$\gamma \in (0,1]$, where $\gamma = 1$ encompasses the classes of smooth convex
and star-convex functions, and smaller values of $\gamma$ indicate that the
function can be "more nonconvex." We develop a variant of accelerated gradient
descent that computes an $\epsilon$-approximate minimizer of a smooth
$\gamma$-quasar-convex function with at most
$O\big(\gamma^{-1} \epsilon^{-1/2} \log(\gamma^{-1} \epsilon^{-1})\big)$ total
function and gradient evaluations. We also derive a lower bound of
$\Omega\big(\gamma^{-1} \epsilon^{-1/2}\big)$ on the number of gradient
evaluations required by any deterministic first-order method in the worst
case, showing that, up to a logarithmic factor, no deterministic first-order
algorithm can improve upon ours.

Comment: 37 pages
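For context, the quasar-convexity condition in this line of work is usually stated as below; this is the standard formulation in the abstract's notation, not a quotation from the paper:

```latex
% \gamma-quasar-convexity of a differentiable f with minimizer x^*,
% for a fixed \gamma \in (0,1] and all x in the domain:
f(x^*) \;\ge\; f(x) + \frac{1}{\gamma}\, \nabla f(x)^{\top} (x^* - x)
% \gamma = 1 is exactly star-convexity, which every smooth convex
% function also satisfies; smaller \gamma permits "more nonconvex" f.
```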
Quasiconvex Programming
We define quasiconvex programming, a form of generalized linear programming
in which one seeks the point minimizing the pointwise maximum of a collection
of quasiconvex functions. We survey algorithms for solving quasiconvex programs
either numerically or via generalizations of the dual simplex method from
linear programming, and describe varied applications of this geometric
optimization technique in meshing, scientific computation, information
visualization, automated algorithm analysis, and robust statistics.

Comment: 33 pages, 14 figures
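As a minimal one-dimensional illustration (not one of the surveyed algorithms): a pointwise maximum of quasiconvex functions is again quasiconvex, and a quasiconvex function of one real variable is unimodal, so ternary search already minimizes it on an interval. The functions and tolerance below are illustrative choices:

```python
# Sketch: minimize the pointwise maximum of quasiconvex functions in 1-D.
# The max of quasiconvex functions is quasiconvex, hence unimodal on an
# interval, so ternary search converges to the minimizer.

def ternary_search_min(f, lo, hi, tol=1e-9):
    """Minimize a unimodal function f on [lo, hi]."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2  # minimizer lies in [lo, m2]
        else:
            lo = m1  # minimizer lies in [m1, hi]
    return 0.5 * (lo + hi)

# Three quasiconvex pieces; their pointwise max is quasiconvex.
fs = [lambda x: abs(x - 1.0),
      lambda x: (x + 0.5) ** 2,
      lambda x: abs(x) ** 0.5]

def g(x):
    return max(f(x) for f in fs)

x_star = ternary_search_min(g, -5.0, 5.0)
print(x_star, g(x_star))
```

In higher dimensions this structure no longer reduces to a single unimodal search, which is what the surveyed numerical and dual-simplex-style methods address.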
A Distributed Frank-Wolfe Algorithm for Communication-Efficient Sparse Learning
Learning sparse combinations is a frequent theme in machine learning. In this
paper, we study its associated optimization problem in the distributed setting
where the elements to be combined are not centrally located but spread over a
network. We address the key challenges of balancing communication costs and
optimization errors. To this end, we propose a distributed Frank-Wolfe (dFW)
algorithm. We obtain theoretical guarantees on the optimization error
$\epsilon$ and communication cost that do not depend on the total number of
combining elements. We further show that the communication cost of dFW is
optimal by deriving a lower bound on the communication cost required to
construct an $\epsilon$-approximate solution. We validate our theoretical
analysis with empirical studies on synthetic and real-world data, which
demonstrate that dFW outperforms both baselines and competing methods. We also
study the performance of dFW when the conditions of our analysis are relaxed,
and show that dFW is fairly robust.

Comment: Extended version of the SIAM Data Mining 2015 paper
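A single-machine sketch of the ingredient that makes Frank-Wolfe communication-efficient: over an $\ell_1$ ball, the linear minimization oracle returns one signed coordinate vertex, so a distributed variant only needs to broadcast one (index, value) pair per iteration. The problem instance and names below are assumptions for illustration, not the paper's setup:

```python
import numpy as np

# Frank-Wolfe over an l1 ball of radius tau for the least-squares
# objective f(x) = 0.5 * ||A x - b||^2. The linear minimization oracle
# over the l1 ball is a single signed vertex tau * sign * e_i, so each
# iteration's "message" is one (index, value) pair -- the property a
# distributed variant like dFW exploits.

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 200))
b = rng.standard_normal(50)
tau = 5.0  # l1-ball radius

x = np.zeros(200)
for t in range(200):
    grad = A.T @ (A @ x - b)          # gradient of the least-squares loss
    i = int(np.argmax(np.abs(grad)))  # LMO: coordinate with largest |grad_i|
    s = np.zeros_like(x)
    s[i] = -tau * np.sign(grad[i])    # vertex of the l1 ball
    gamma = 2.0 / (t + 2.0)           # standard FW step size
    x = (1 - gamma) * x + gamma * s   # after t steps, x has <= t nonzeros

print("objective:", 0.5 * np.linalg.norm(A @ x - b) ** 2)
print("nonzeros:", int(np.count_nonzero(x)))
```

Each iterate is a convex combination of at most $t$ vertices, which is also why the sparsity of the solution, like the communication, does not scale with the ambient dimension.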
Approximating gradients with continuous piecewise polynomial functions
Motivated by conforming finite element methods for elliptic problems of
second order, we analyze the approximation of the gradient of a target function
by continuous piecewise polynomial functions over a simplicial mesh. The main
result is that the global best approximation error is equivalent to an
appropriate sum in terms of the local best approximation errors on elements.
Thus, requiring continuity does not downgrade local approximability and
discontinuous piecewise polynomials essentially do not offer additional
approximation power, even for a fixed mesh. This result implies error bounds in
terms of piecewise regularity over the whole admissible smoothness range.
Moreover, it allows for simple local error functionals in adaptive tree
approximation of gradients.

Comment: 21 pages, 1 figure
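A one-dimensional sketch of the objects involved, assuming an illustrative target and mesh (the paper's setting is simplicial meshes in general dimension): the global best $L^2$ approximation of a gradient $u'$ by continuous piecewise linear functions is its $L^2$ projection onto the hat-function space, computed here via the mass matrix:

```python
import numpy as np

# Best L2 approximation of a gradient u' by continuous piecewise linear
# functions on a 1-D mesh: assemble the tridiagonal mass matrix of the
# hat-function basis and solve the normal equations (L2 projection).
# The target u and the mesh are illustrative choices.

nodes = np.linspace(0.0, 1.0, 17)            # uniform mesh on [0, 1]
h = np.diff(nodes)
n = len(nodes)

u_prime = lambda x: np.cos(4.0 * np.pi * x)  # u(x) = sin(4 pi x) / (4 pi)

# Mass matrix M_ij = int phi_i phi_j (element contribution he/6 [[2,1],[1,2]]).
M = np.zeros((n, n))
for e in range(n - 1):
    M[e:e + 2, e:e + 2] += h[e] / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])

# Load vector b_i = int u' phi_i, two-point Gauss quadrature per element.
bvec = np.zeros(n)
gauss = np.array([-1.0, 1.0]) / np.sqrt(3.0)
for e in range(n - 1):
    xm, he = 0.5 * (nodes[e] + nodes[e + 1]), h[e]
    for g in gauss:
        x = xm + 0.5 * he * g
        w = 0.5 * he                   # quadrature weight
        lam = (x - nodes[e]) / he      # local coordinate in [0, 1]
        bvec[e] += w * u_prime(x) * (1.0 - lam)
        bvec[e + 1] += w * u_prime(x) * lam

c = np.linalg.solve(M, bvec)           # nodal values of the projection

# Discrete L2 (RMS) error of the projection on [0, 1], sampled finely.
xs = np.linspace(0.0, 1.0, 2001)
approx = np.interp(xs, nodes, c)
print("L2 error:", np.sqrt(np.mean((u_prime(xs) - approx) ** 2)))
```

The paper's result says that, up to constants, this global error is no worse than the sum of element-by-element best approximation errors, so enforcing continuity costs essentially nothing.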