A Three-Term Conjugate Gradient Method with Sufficient Descent Property for Unconstrained Optimization
Conjugate gradient methods are widely used for solving large-scale unconstrained optimization problems because they do not require the storage of matrices. In this paper, we propose a general form of three-term conjugate gradient methods that always generates a sufficient descent direction. We give a sufficient condition for the global convergence of the proposed general method. Moreover, we present a specific three-term conjugate gradient method based on the multi-step quasi-Newton method. Finally, some numerical results for the proposed method are given.
A quasi-Newton proximal splitting method
A new result in convex analysis on the calculation of proximity operators in
certain scaled norms is derived. We describe efficient implementations of the
proximity calculation for a useful class of functions; the implementations
exploit the piece-wise linear nature of the dual problem. The second part of
the paper applies the previous result to acceleration of convex minimization
problems, and leads to an elegant quasi-Newton method. The optimization method
compares favorably against state-of-the-art alternatives. The algorithm has
extensive applications, including signal processing, sparse recovery, and machine
learning and classification.
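The notion of a proximity operator in a scaled norm can be illustrated with a minimal sketch. It covers only the simplest case, a diagonal scaling applied to the l1 norm, where the operator reduces to coordinate-wise soft-thresholding. This is not the paper's result or implementation; the function names `soft_threshold` and `prox_l1_diag` are hypothetical.

```python
def soft_threshold(v, t):
    """Scalar soft-thresholding: the minimizer of t*|x| + 0.5*(x - v)**2."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def prox_l1_diag(v, lam, d):
    """Proximity operator of lam*||x||_1 in the norm induced by diag(d).

    Assumes every d_i > 0. The objective
        lam*||x||_1 + 0.5 * sum_i d_i * (x_i - v_i)**2
    separates per coordinate, so each coordinate is soft-thresholded
    with its own threshold lam / d_i.
    """
    return [soft_threshold(vi, lam / di) for vi, di in zip(v, d)]


if __name__ == "__main__":
    # A larger scaling d_i shrinks the threshold for that coordinate.
    print(prox_l1_diag([3.0, -0.5, 1.0], 1.0, [1.0, 1.0, 4.0]))
```

With a non-diagonal scaling the problem no longer separates per coordinate, which is the harder situation the abstract addresses.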
On limited-memory quasi-Newton methods for minimizing a quadratic function
The main focus in this paper is exact linesearch methods for minimizing a
quadratic function whose Hessian is positive definite. We give two classes of
limited-memory quasi-Newton Hessian approximations that generate search
directions parallel to those of the method of preconditioned conjugate
gradients, and hence give finite termination on quadratic optimization
problems. The Hessian approximations are described by a novel compact
representation which provides a dynamical framework. We also discuss possible
extensions of these classes and show their behavior on randomly generated
quadratic optimization problems. The methods behave numerically similarly to
L-BFGS. Inclusion of information from the first iteration in the limited-memory
Hessian approximation and L-BFGS significantly reduces the effects of round-off
errors on the considered problems. In addition, we give our compact
representation of the Hessian approximations in the full Broyden class for the
general unconstrained optimization problem. This representation consists of
explicit matrices and gradients only as vector components.
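The finite-termination property mentioned above can be seen in a minimal sketch of the plain (unpreconditioned) conjugate gradient method with exact linesearch on a small positive definite quadratic. It is not the limited-memory quasi-Newton scheme of the paper; the helper names and the 2x2 test problem are assumptions chosen for illustration.

```python
def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def conjugate_gradient(A, b, tol=1e-12):
    """Minimize 0.5*x^T A x - b^T x for symmetric positive definite A.

    Starts from x = 0 and runs at most n iterations, which suffices
    in exact arithmetic. Returns the iterate and the iteration count.
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)          # residual b - A x, i.e. the negative gradient
    p = list(r)          # first search direction is steepest descent
    rr = dot(r, r)
    iterations = 0
    while iterations < n and rr > tol:
        Ap = matvec(A, p)
        alpha = rr / dot(p, Ap)  # exact linesearch step along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
        iterations += 1
    return x, iterations


if __name__ == "__main__":
    x, its = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
    print(x, its)  # x is close to [1/11, 7/11]
```

On this 2x2 example the loop stops after two iterations, matching the bound of n steps for an n-dimensional quadratic.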
Computation of Ground States of the Gross-Pitaevskii Functional via Riemannian Optimization
In this paper we combine concepts from Riemannian Optimization and the theory
of Sobolev gradients to derive a new conjugate gradient method for direct
minimization of the Gross-Pitaevskii energy functional with rotation. The
conservation of the number of particles constrains the minimizers to lie on a
manifold corresponding to the unit norm. The idea developed here is to
transform the original constrained optimization problem to an unconstrained
problem on this (spherical) Riemannian manifold, so that fast minimization
algorithms can be applied as alternatives to more standard constrained
formulations. First, we obtain Sobolev gradients using an equivalent definition
of an inner product which takes into account rotation. Then, the
Riemannian gradient (RG) steepest descent method is derived based on projected
gradients and retraction of an intermediate solution back to the constraint
manifold. Finally, we use the concept of the Riemannian vector transport to
propose a Riemannian conjugate gradient (RCG) method for this problem. It is
derived at the continuous level based on the "optimize-then-discretize"
paradigm instead of the usual "discretize-then-optimize" approach, as this
ensures robustness of the method when adaptive mesh refinement is performed in
computations. We evaluate various design choices inherent in the formulation of
the method and conclude with recommendations concerning selection of the best
options. Numerical tests demonstrate that the proposed RCG method outperforms
the simple gradient descent (RG) method in terms of rate of convergence. While
on simple problems a Newton-type method implemented in the Ipopt library
exhibits faster convergence than the RCG approach, the two methods perform
similarly on more complex problems requiring the use of mesh adaptation. At the
same time, the RCG approach has far fewer tunable parameters.
Comment: 28 pages, 13 figures
Second order adjoints for solving PDE-constrained optimization problems
Inverse problems are of utmost importance in many fields of science and engineering. In the
variational approach, inverse problems are formulated as PDE-constrained optimization problems,
where the optimal estimate of the uncertain parameters is the minimizer of a certain cost
functional subject to the constraints posed by the model equations. The numerical solution
of such optimization problems requires the computation of derivatives of the model output
with respect to model parameters. The first order derivatives of a cost functional (defined
on the model output) with respect to a large number of model parameters can be calculated
efficiently through first order adjoint sensitivity analysis. Second order adjoint models
give second derivative information in the form of matrix-vector products between the Hessian
of the cost functional and user defined vectors. Traditionally, the construction of second
order derivatives for large scale models has been considered too costly. Consequently, data
assimilation applications employ optimization algorithms that use only first order derivative
information, like nonlinear conjugate gradients and quasi-Newton methods.
In this paper we discuss the mathematical foundations of second order adjoint sensitivity
analysis and show that it provides an efficient approach to obtain Hessian-vector products. We
study the benefits of using second order information in the numerical optimization process
for data assimilation applications. The numerical studies are performed in a twin experiment
setting with a two-dimensional shallow water model. Different scenarios are considered with
different discretization approaches, observation sets, and noise levels. Optimization algorithms
that employ second order derivatives are tested against widely used methods that require
only first order derivatives. Conclusions are drawn regarding the potential benefits and the
limitations of using high-order information in large scale data assimilation problems.
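What a Hessian-vector product supplies to an optimizer can be shown with a minimal sketch that never forms the Hessian. It uses a central finite difference of gradients as a stand-in, not the second order adjoint model the abstract describes; the test function `grad_f` and the step `eps` are assumptions.

```python
def hessian_vector_product(grad, x, v, eps=1e-5):
    """Approximate H(x) @ v by a central difference of gradients.

    Uses (grad(x + eps*v) - grad(x - eps*v)) / (2*eps), so the cost is
    two gradient evaluations and no Hessian matrix is ever stored.
    """
    x_plus = [xi + eps * vi for xi, vi in zip(x, v)]
    x_minus = [xi - eps * vi for xi, vi in zip(x, v)]
    g_plus = grad(x_plus)
    g_minus = grad(x_minus)
    return [(gp - gm) / (2.0 * eps) for gp, gm in zip(g_plus, g_minus)]


def grad_f(x):
    """Gradient of the illustrative function f(x, y) = x**2 * y + y**3."""
    return [2.0 * x[0] * x[1], x[0] ** 2 + 3.0 * x[1] ** 2]


if __name__ == "__main__":
    # The exact Hessian at (1, 2) is [[4, 2], [2, 12]], so H @ (1, 0) = (4, 2).
    print(hessian_vector_product(grad_f, [1.0, 2.0], [1.0, 0.0]))
```

Methods such as Newton-CG need only products of this kind, which is why an efficient way to compute them matters for large models.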