Time-parallel iterative solvers for parabolic evolution equations
We present original time-parallel algorithms for the solution of the implicit
Euler discretization of general linear parabolic evolution equations with
time-dependent self-adjoint spatial operators. Motivated by the inf-sup theory
of parabolic problems, we show that the standard nonsymmetric time-global
system can be equivalently reformulated as an original symmetric saddle-point
system that remains inf-sup stable with respect to the same natural parabolic
norms. We then propose and analyse an efficient and readily implementable
parallel-in-time preconditioner to be used with an inexact Uzawa method. The
proposed preconditioner is non-intrusive and easy to implement in practice, and
also features the key theoretical advantages of robust spectral bounds, leading
to convergence rates that are independent of the number of time-steps, final
time, or spatial mesh sizes, and also a theoretical parallel complexity that
grows only logarithmically with respect to the number of time-steps. Numerical
experiments with large-scale parallel computations show the effectiveness of
the method, along with its good weak and strong scaling properties.
Newton-Type Methods for Non-Convex Optimization Under Inexact Hessian Information
We consider variants of trust-region and cubic regularization methods for
non-convex optimization, in which the Hessian matrix is approximated. Under
mild conditions on the inexact Hessian, and using approximate solutions of the
corresponding sub-problems, we provide iteration complexity bounds to achieve $\epsilon$-approximate second-order optimality, which have been shown to be tight.
Our Hessian approximation conditions constitute a major relaxation over the
existing ones in the literature. Consequently, we are able to show that such
mild conditions allow for the construction of the approximate Hessian through
various random sampling methods. In this light, we consider the canonical
problem of finite-sum minimization, provide appropriate uniform and non-uniform
sub-sampling strategies to construct such Hessian approximations, and obtain
optimal iteration complexity for the corresponding sub-sampled trust-region and
cubic regularization methods. Comment: 32 pages
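As an illustration of the uniform sub-sampling idea for finite-sum problems, the following is a minimal sketch, not the authors' method. It assumes a logistic loss f_i(x) = log(1 + exp(-b_i a_i^T x)) with labels in {-1, +1}; the function names (`curvature_weights`, `full_hessian`, `subsampled_hessian`) are hypothetical.

```python
import numpy as np

def curvature_weights(A, b, x):
    """Per-sample weights s_i (1 - s_i) in the Hessian of the logistic loss."""
    margins = b * (A @ x)
    s = 1.0 / (1.0 + np.exp(-margins))  # sigmoid of each margin
    return s * (1.0 - s)                # b_i**2 == 1 for labels in {-1, +1}

def full_hessian(A, b, x):
    """Exact Hessian of (1/n) * sum_i f_i(x)."""
    w = curvature_weights(A, b, x)
    return (A.T * w) @ A / A.shape[0]

def subsampled_hessian(A, b, x, sample_size, rng):
    """Hessian estimate from a uniform sample drawn without replacement."""
    idx = rng.choice(A.shape[0], size=sample_size, replace=False)
    w = curvature_weights(A[idx], b[idx], x)
    return (A[idx].T * w) @ A[idx] / sample_size
```

Such an estimate would replace the exact Hessian inside a trust-region or cubic regularization sub-problem; the sample size needed for the complexity guarantees is specified in the paper, not here.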
Practical Inexact Proximal Quasi-Newton Method with Global Complexity Analysis
Recently, several methods have been proposed for sparse optimization that make
careful use of second-order information [10, 28, 16, 3] to improve local
convergence rates. These methods construct a composite quadratic approximation
using Hessian information, optimize this approximation using a first-order
method, such as coordinate descent, and employ a line search to ensure
sufficient descent. Here we propose a general framework, which includes
slightly modified versions of existing algorithms and also a new algorithm,
which uses limited-memory BFGS Hessian approximations, and provide a novel
global convergence rate analysis, which covers methods that solve subproblems
via coordinate descent.
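The sub-problem described above (a composite quadratic model minimized by coordinate descent) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: the regularizer is taken to be lam * ||z||_1, `H` is any symmetric positive definite Hessian approximation (a dense matrix here rather than a limited-memory BFGS representation), and the names `soft_threshold` and `cd_composite_quadratic` are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Scalar soft-thresholding: the proximal map of t * |.|."""
    return np.sign(v) * max(abs(v) - t, 0.0)

def cd_composite_quadratic(x, g, H, lam, sweeps=50):
    """Cyclic coordinate descent on
    g^T (z - x) + 0.5 (z - x)^T H (z - x) + lam * ||z||_1.
    """
    z = np.array(x, dtype=float)
    r = np.zeros_like(z)  # maintains r = H @ (z - x)
    for _ in range(sweeps):
        for j in range(len(z)):
            c = g[j] + r[j]  # partial derivative of the smooth part at z
            z_new = soft_threshold(z[j] - c / H[j, j], lam / H[j, j])
            r += H[:, j] * (z_new - z[j])  # keep r consistent with z
            z[j] = z_new
    return z
```

In a full method, the step z - x would then be passed to a line search to ensure sufficient descent.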
Randomized Low-Memory Singular Value Projection
Affine rank minimization algorithms typically rely on calculating the
gradient of a data error followed by a singular value decomposition at every
iteration. Because these two steps are expensive, heuristic approximations are
often used to reduce computational burden. To this end, we propose a recovery
scheme that merges the two steps with randomized approximations, and as a
result, operates on space proportional to the degrees of freedom in the
problem. We theoretically establish the estimation guarantees of the algorithm
as a function of approximation tolerance. While the theoretical approximation
requirements are overly pessimistic, we demonstrate that in practice the
algorithm performs well on the quantum tomography recovery problem. Comment: 13 pages. This version has a revised theorem and new numerical
experiment
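To make the randomized projection idea concrete, here is a minimal sketch of a rank-r approximation built from a random range finder, which could stand in for the exact singular value decomposition in a projection step. It is a generic randomized SVD, not the paper's low-memory scheme; the function name `randomized_rank_r` and the `oversample` parameter are assumptions.

```python
import numpy as np

def randomized_rank_r(M, r, oversample=5, rng=None):
    """Approximate the top-r singular triplets of M with a random range finder."""
    rng = np.random.default_rng() if rng is None else rng
    k = min(r + oversample, min(M.shape))
    omega = rng.standard_normal((M.shape[1], k))  # random test matrix
    Q, _ = np.linalg.qr(M @ omega)                # orthonormal basis for the sampled range
    U_small, s, Vt = np.linalg.svd(Q.T @ M, full_matrices=False)
    U = Q @ U_small
    return U[:, :r], s[:r], Vt[:r]
```

The rank-r projection is then `(U * s) @ Vt`. For a matrix of exact rank r the result matches the matrix up to rounding; otherwise the accuracy depends on the singular value decay and the oversampling.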
Adaptive Regularization Algorithms with Inexact Evaluations for Nonconvex Optimization
A regularization algorithm using inexact function values and inexact
derivatives is proposed and its evaluation complexity analyzed. This algorithm
is applicable to unconstrained problems and to problems with inexpensive
constraints (that is, constraints whose evaluation and enforcement have
negligible cost) under the assumption that the derivative of highest degree is
$\beta$-Hölder continuous. It features a very flexible adaptive mechanism
for determining the inexactness which is allowed, at each iteration, when
computing objective function values and derivatives. The complexity analysis
covers arbitrary optimality order and arbitrary degree of available approximate
derivatives. It extends results of Cartis, Gould and Toint (2018) on the
evaluation complexity to the inexact case: if a $q$th order minimizer is sought
using approximations to the first $p$ derivatives, it is proved that a suitable
approximate minimizer within $\epsilon$ is computed by the proposed algorithm
in at most $O(\epsilon^{-\frac{p+\beta}{p-q+\beta}})$ iterations and at most
$O(|\log(\epsilon)|\,\epsilon^{-\frac{p+\beta}{p-q+\beta}})$ approximate
evaluations. An algorithmic variant, although more rigid in practice, can be
proved to find such an approximate minimizer in
$O(|\log(\epsilon)| + \epsilon^{-\frac{p+\beta}{p-q+\beta}})$ evaluations. While
the proposed framework remains so far conceptual for high degrees and orders,
it is shown to yield simple and computationally realistic inexact methods when
specialized to the unconstrained and bound-constrained first- and second-order
cases. The deterministic complexity results are finally extended to the
stochastic context, yielding adaptive sample-size rules for subsampling methods
typical of machine learning.Comment: 32 page
GIANT: Globally Improved Approximate Newton Method for Distributed Optimization
For distributed computing environments, we consider the empirical risk
minimization problem and propose a distributed and communication-efficient
Newton-type optimization method. At every iteration, each worker locally finds
an Approximate NewTon (ANT) direction, which is sent to the main driver. The
main driver, then, averages all the ANT directions received from workers to
form a Globally Improved ANT (GIANT) direction. GIANT is highly
communication efficient and naturally exploits the trade-offs between local
computations and global communications in that more local computations result
in fewer overall rounds of communications. Theoretically, we show that GIANT
enjoys an improved convergence rate as compared with first-order methods and
existing distributed Newton-type methods. Further, and in sharp contrast with
many existing distributed Newton-type methods, as well as popular first-order
methods, a highly advantageous practical feature of GIANT is that it only
involves one tuning parameter. We conduct large-scale experiments on a computer
cluster and, empirically, demonstrate the superior performance of GIANT. Comment: Fixed some typos. Improved writing
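The averaging scheme described in this abstract can be sketched on a single machine as follows. This is a minimal simulation under stated assumptions, not the authors' implementation: the objective is ridge regression, the data are split into equally sized shards, the global gradient is formed in a first communication round, and a unit step size is used. The name `giant_step` is hypothetical.

```python
import numpy as np

def giant_step(w, shards, lam):
    """One simulated iteration: average local Newton-type directions.

    shards is a list of (X_j, y_j) pairs of equal size, one per worker.
    """
    d = len(w)
    # Round 1: each worker sends its local gradient; the driver averages them.
    grads = [X.T @ (X @ w - y) / len(y) + lam * w for X, y in shards]
    g = np.mean(grads, axis=0)
    # Round 2: each worker solves with its own local Hessian (the ANT direction).
    dirs = [np.linalg.solve(X.T @ X / len(y) + lam * np.eye(d), g)
            for X, y in shards]
    # The driver averages the local directions and takes a unit step.
    return w - np.mean(dirs, axis=0)
```

Each worker only ever communicates vectors of length d, which is where the communication efficiency comes from; a line search on the driver could replace the unit step.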
Catalyst Acceleration for First-order Convex Optimization: from Theory to Practice
We introduce a generic scheme for accelerating gradient-based optimization
methods in the sense of Nesterov. The approach, called Catalyst, builds upon
the inexact accelerated proximal point algorithm for minimizing a convex
objective function, and consists of approximately solving a sequence of
well-chosen auxiliary problems, leading to faster convergence. One of the keys
to achieving acceleration in theory and in practice is to solve these
sub-problems with appropriate accuracy by using the right stopping criterion
and the right warm-start strategy. We give practical guidelines to use Catalyst
and present a comprehensive analysis of its global complexity. We show that
Catalyst applies to a large class of algorithms, including gradient descent,
block coordinate descent, incremental algorithms such as SAG, SAGA, SDCA, SVRG,
MISO/Finito, and their proximal variants. For all of these methods, we
establish faster rates using the Catalyst acceleration, for strongly convex and
non-strongly convex objectives. We conclude with extensive experiments showing
that acceleration is useful in practice, especially for ill-conditioned
problems. Comment: link to publisher website:
http://jmlr.org/papers/volume18/17-748/17-748.pd
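The outer loop described in this abstract (approximately solving a sequence of regularized auxiliary problems, then extrapolating) can be sketched as follows. This is a minimal illustration for a strongly convex objective, not the reference implementation: the inner solver is plain gradient descent with a fixed iteration budget rather than a proper stopping criterion, the warm start is the previous iterate, and the extrapolation weight is held constant. The name `catalyst` and all parameter names are hypothetical.

```python
import numpy as np

def catalyst(grad_f, x0, L, mu, kappa, outer_iters=100, inner_iters=50):
    """Inexact accelerated proximal point loop around gradient descent.

    grad_f: gradient of an L-smooth, mu-strongly convex function.
    kappa:  weight of the proximal term added to each auxiliary problem.
    """
    q = mu / (mu + kappa)
    beta = (1.0 - np.sqrt(q)) / (1.0 + np.sqrt(q))  # constant extrapolation weight
    step = 1.0 / (L + kappa)                        # step size for the auxiliary problem
    x = np.array(x0, dtype=float)
    y = x.copy()
    for _ in range(outer_iters):
        z = x.copy()  # warm start at the previous iterate
        for _ in range(inner_iters):
            # Gradient step on f(z) + (kappa / 2) * ||z - y||^2.
            z = z - step * (grad_f(z) + kappa * (z - y))
        x_prev, x = x, z
        y = x + beta * (x - x_prev)  # Nesterov-style extrapolation
    return x
```

For example, with `grad_f = lambda x: d * x - b` for a positive vector `d`, the iterates approach `b / d`. The choice of `kappa` trades inner-problem conditioning against the number of outer iterations.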