Beyond Convexity: Stochastic Quasi-Convex Optimization
Stochastic convex optimization is a basic and well studied primitive in
machine learning. It is well known that convex and Lipschitz functions can be
minimized efficiently using Stochastic Gradient Descent (SGD). The Normalized
Gradient Descent (NGD) algorithm is an adaptation of Gradient Descent, which
updates according to the direction of the gradients, rather than the gradients
themselves. In this paper we analyze a stochastic version of NGD and prove its
convergence to a global minimum for a wider class of functions: we require the
functions to be quasi-convex and locally-Lipschitz. Quasi-convexity broadens
the concept of unimodality to multiple dimensions and allows for certain types of
saddle points, which are a known hurdle for first-order optimization methods
such as gradient descent. Locally-Lipschitz functions are only required to be
Lipschitz in a small region around the optimum. This assumption circumvents
gradient explosion, which is another known hurdle for gradient descent
variants. Interestingly, unlike the vanilla SGD algorithm, the stochastic
normalized gradient descent algorithm provably requires a minimal minibatch
size.
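The normalized update described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's algorithm or analysis: the objective f(x) = 1 - exp(-||x||^2) (quasi-convex but not convex), the Gaussian noise model, and the names `noisy_grad` and `sngd` together with the step size, batch size, and step count are all assumptions chosen for the example.

```python
import numpy as np

def noisy_grad(x, batch, rng, sigma=0.01):
    """Minibatch gradient estimate for the hypothetical objective
    f(x) = 1 - exp(-||x||^2), whose minimum is at the origin."""
    exact = 2.0 * x * np.exp(-np.dot(x, x))
    # Average `batch` independent noise samples, as a minibatch would.
    noise = rng.normal(0.0, sigma, size=(batch, x.size)).mean(axis=0)
    return exact + noise

def sngd(grad, x0, lr=0.05, steps=500, batch=32, seed=0):
    """Stochastic normalized gradient descent: move a fixed distance `lr`
    along the direction of the minibatch gradient, ignoring its magnitude."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        g = grad(x, batch, rng)
        norm = np.linalg.norm(g)
        if norm == 0.0:  # no direction available; stop
            break
        x = x - lr * g / norm
    return x

x_final = sngd(noisy_grad, x0=[2.0, -1.5])
print("final iterate:", x_final)
```

Far from the origin the gradient of this objective is tiny, so a plain gradient step would barely move; normalizing keeps the step length constant. Averaging over a minibatch matters here because the direction of a single very noisy sample can be unreliable when the true gradient is small.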
Deterministic global optimization using space-filling curves and multiple estimates of Lipschitz and Holder constants
In this paper, the global optimization problem $\min_{y \in S} F(y)$ with $S$
being a hyperinterval in $\mathbb{R}^N$ and $F(y)$ satisfying the Lipschitz condition
with an unknown Lipschitz constant is considered. It is supposed that the
function can be multiextremal, non-differentiable, and given as a
"black box". To attack the problem, a new global optimization algorithm based
on the following two ideas is proposed and studied both theoretically and
numerically. First, the new algorithm uses numerical approximations to
space-filling curves to reduce the original Lipschitz multi-dimensional problem
to a univariate one satisfying the Hölder condition. Second, at each iteration
the algorithm applies a new geometric technique that works with a number of
possible Hölder constants chosen from a set of values varying from zero to
infinity, showing that ideas introduced in the popular DIRECT method can be
used in Hölder global optimization. Convergence conditions of the
resulting deterministic global optimization method are established. Numerical
experiments carried out on several hundred test functions show quite
promising performance of the new algorithm in comparison with its direct
competitors. Comment: 26 pages, 10 figures, 4 tables
Solving non-monotone equilibrium problems via a DIRECT-type approach
A global optimization approach for solving non-monotone equilibrium problems
(EPs) is proposed. The class of (regularized) gap functions is used to
reformulate any EP as a constrained global optimization program and some bounds
on the Lipschitz constant of such functions are provided. The proposed global
optimization approach is a combination of an improved version of the
DIRECT algorithm, which exploits local bounds of the Lipschitz
constant of the objective function, with local minimizations. Unlike most
existing solution methods for EPs, no monotonicity-type condition is assumed in
this paper. Preliminary numerical results on several classes of EPs show the
effectiveness of the approach. Comment: Technical Report of Department of Computer Science, University of
Pisa, Italy
Extended cutting angle method of global optimization
Methods of Lipschitz optimization allow one to find and confirm the global minimum of multivariate Lipschitz functions using a finite number of function evaluations. This paper extends the Cutting Angle method, in which the optimization problem is solved by building a sequence of piecewise linear underestimates of the objective function. We use a more flexible set of support functions, which yields a better underestimate of a Lipschitz objective function. An efficient algorithm for enumerating all local minima of the underestimate is presented, along with the results of numerical experiments. The one-dimensional Pijavski-Shubert method arises as a special case of the proposed approach.
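For orientation, the one-dimensional Pijavski-Shubert method mentioned as a special case can be sketched as below. This is only the classical univariate scheme with a known Lipschitz constant, not the extended Cutting Angle method of the paper; the function name `pijavski_shubert`, the test objective sin(x) + sin(10x/3) on [2.7, 7.5], and the constant 4.5 (an upper bound on its derivative, 1 + 10/3) are assumptions made for illustration.

```python
import math

def pijavski_shubert(f, a, b, lipschitz, tol=1e-3, max_evals=2000):
    """Minimize f on [a, b] using the sawtooth underestimate
    max_i (f(x_i) - L * |x - x_i|) built from the sampled points."""
    xs = [a, b]
    fs = [f(a), f(b)]
    while len(xs) < max_evals:
        best = min(fs)
        low_bound, low_x, low_i = math.inf, None, None
        # Each pair of neighbouring samples defines one "tooth"; its tip is
        # where the two downward-sloping cones intersect.
        for i in range(len(xs) - 1):
            width = xs[i + 1] - xs[i]
            tip_x = 0.5 * (xs[i] + xs[i + 1]) + (fs[i] - fs[i + 1]) / (2.0 * lipschitz)
            tip_val = 0.5 * (fs[i] + fs[i + 1]) - 0.5 * lipschitz * width
            if tip_val < low_bound:
                low_bound, low_x, low_i = tip_val, tip_x, i
        # The lowest tip is a lower bound on the global minimum, so the gap
        # to the best sampled value certifies the accuracy reached.
        if best - low_bound <= tol:
            break
        xs.insert(low_i + 1, low_x)
        fs.insert(low_i + 1, f(low_x))
    k = fs.index(min(fs))
    return xs[k], fs[k], len(xs)

def objective(x):
    return math.sin(x) + math.sin(10.0 * x / 3.0)

x_best, f_best, n_evals = pijavski_shubert(objective, 2.7, 7.5, lipschitz=4.5)
print(x_best, f_best, n_evals)
```

The stopping test is what "find and confirm" refers to: when the loop exits through the gap check, the best sampled value is within `tol` of the true minimum, provided the supplied constant really bounds the slope of the objective.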
Index Information Algorithm with Local Tuning for Solving Multidimensional Global Optimization Problems with Multiextremal Constraints
Multidimensional optimization problems where the objective function and the
constraints are multiextremal non-differentiable Lipschitz functions (with
unknown Lipschitz constants) and the feasible region is a finite collection of
robust nonconvex subregions are considered. Both the objective function and the
constraints may be partially defined. To solve such problems an algorithm is
proposed that uses Peano space-filling curves and the index scheme to reduce
the original problem to a one-dimensional Hölder one. Local tuning on the
behaviour of the objective function and constraints is used during the work of
the global optimization procedure in order to accelerate the search. The method
neither uses penalty coefficients nor additional variables. Convergence
conditions are established. Numerical experiments confirm the good performance
of the technique. Comment: 29 pages, 5 figures
Optimal Algorithms for Non-Smooth Distributed Optimization in Networks
In this work, we consider the distributed optimization of non-smooth convex
functions using a network of computing units. We investigate this problem under
two regularity assumptions: (1) the Lipschitz continuity of the global
objective function, and (2) the Lipschitz continuity of local individual
functions. Under the local regularity assumption, we provide the first optimal
first-order decentralized algorithm called multi-step primal-dual (MSPD) and
its corresponding optimal convergence rate. A notable aspect of this result is
that, for non-smooth functions, while the dominant term of the error is in
$O(1/\sqrt{t})$, the structure of the communication network only impacts a
second-order term in $O(1/t)$, where $t$ is time. In other words, the error due
to limits in communication resources decreases at a fast rate even in the case
of non-strongly-convex objective functions. Under the global regularity
assumption, we provide a simple yet efficient algorithm called distributed
randomized smoothing (DRS) based on a local smoothing of the objective
function, and show that DRS is within a $d^{1/4}$ multiplicative factor of the
optimal convergence rate, where $d$ is the underlying dimension. Comment: 17 pages