
    A Three-Term Conjugate Gradient Method with Sufficient Descent Property for Unconstrained Optimization

    Conjugate gradient methods are widely used for solving large-scale unconstrained optimization problems because they do not require the storage of matrices. In this paper, we propose a general form of three-term conjugate gradient methods that always generate a sufficient descent direction. We give a sufficient condition for the global convergence of the proposed general method. Moreover, we present a specific three-term conjugate gradient method based on the multi-step quasi-Newton method. Finally, some numerical results for the proposed method are reported.
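
    The abstract does not state the specific update, so the sketch below implements a known three-term Hestenes-Stiefel-type direction (in the spirit of Zhang, Zhou and Li) whose construction gives g_k^T d_k = -||g_k||^2 by design, with a steepest-descent restart as a safeguard. It illustrates the general idea only and is not the method proposed in the paper; the Armijo backtracking line search, tolerances, and the Rosenbrock test function are illustrative choices.

```python
# Sketch of a generic three-term CG iteration (not the paper's method):
#   d_{k+1} = -g_{k+1} + beta_k d_k - theta_k y_k,   y_k = g_{k+1} - g_k,
# with beta_k = g_{k+1}^T y_k / d_k^T y_k and theta_k = g_{k+1}^T d_k / d_k^T y_k,
# so that g_{k+1}^T d_{k+1} = -||g_{k+1}||^2 (sufficient descent).
import numpy as np

def three_term_cg(f, grad, x0, tol=1e-8, max_iter=500):
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g.copy()
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        # Armijo backtracking line search along the descent direction d
        t, slope = 1.0, g.dot(d)
        while f(x + t * d) > f(x) + 1e-4 * t * slope and t > 1e-12:
            t *= 0.5
        x_new = x + t * d
        g_new = grad(x_new)
        y = g_new - g
        denom = d.dot(y)
        if abs(denom) > 1e-12:
            beta = g_new.dot(y) / denom
            theta = g_new.dot(d) / denom
            d_new = -g_new + beta * d - theta * y
        else:
            d_new = -g_new            # restart with steepest descent
        x, g, d = x_new, g_new, d_new
    return x

# Example: Rosenbrock function
f = lambda x: (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2
grad = lambda x: np.array([-2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
                           200 * (x[1] - x[0]**2)])
print(three_term_cg(f, grad, [-1.2, 1.0]))
```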

    A quasi-Newton proximal splitting method

    A new result in convex analysis on the calculation of proximity operators in certain scaled norms is derived. We describe efficient implementations of the proximity calculation for a useful class of functions; the implementations exploit the piecewise-linear nature of the dual problem. The second part of the paper applies this result to the acceleration of convex minimization problems and leads to an elegant quasi-Newton method. The optimization method compares favorably against state-of-the-art alternatives. The algorithm has extensive applications, including signal processing, sparse recovery, and machine learning and classification.
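
    For orientation only: the snippet below shows the simplest instance of a scaled proximity operator, a proximal gradient step under a diagonal metric D, where the prox of the l1 norm in the scaled norm reduces to a componentwise soft-threshold with thresholds lam / D_i. The paper's contribution concerns richer quasi-Newton scalings, for which the prox is computed via a piecewise-linear dual problem; the diagonal case here is a deliberate simplification, and the function names, the toy quadratic data term, and the step rule are all assumptions made for the example.

```python
# Minimal sketch: variable-metric proximal gradient steps for
#   min_x  f(x) + lam * ||x||_1
# using a diagonal scaling D (a crude stand-in for a quasi-Newton metric).
# The prox of lam*||.||_1 in the norm ||u||_D^2 = u^T diag(D) u is a
# componentwise soft-threshold with threshold lam / D_i.
import numpy as np

def prox_l1_diag(z, lam, D):
    """argmin_x lam*||x||_1 + 0.5*(x - z)^T diag(D) (x - z)."""
    return np.sign(z) * np.maximum(np.abs(z) - lam / D, 0.0)

def scaled_prox_grad_step(x, grad_f, lam, D):
    """x^+ = prox_{lam||.||_1}^D ( x - D^{-1} grad_f(x) )."""
    z = x - grad_f(x) / D
    return prox_l1_diag(z, lam, D)

# Toy example: quadratic data term f(x) = 0.5*||A x - b||^2
rng = np.random.default_rng(0)
A, b = rng.standard_normal((20, 10)), rng.standard_normal(20)
grad_f = lambda x: A.T @ (A @ x - b)
D = np.full(10, np.linalg.norm(A, 2) ** 2)   # Lipschitz-style diagonal metric
x = np.zeros(10)
for _ in range(200):
    x = scaled_prox_grad_step(x, grad_f, lam=0.1, D=D)
print(x)
```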

    On limited-memory quasi-Newton methods for minimizing a quadratic function

    The main focus of this paper is exact linesearch methods for minimizing a quadratic function whose Hessian is positive definite. We give two classes of limited-memory quasi-Newton Hessian approximations that generate search directions parallel to those of the method of preconditioned conjugate gradients, and hence give finite termination on quadratic optimization problems. The Hessian approximations are described by a novel compact representation which provides a dynamical framework. We also discuss possible extensions of these classes and show their behavior on randomly generated quadratic optimization problems. The methods behave numerically similarly to L-BFGS. Including information from the first iteration in the limited-memory Hessian approximation and in L-BFGS significantly reduces the effects of round-off errors on the considered problems. In addition, we give our compact representation of the Hessian approximations in the full Broyden class for the general unconstrained optimization problem. This representation consists of explicit matrices and gradients only as vector components.
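
    For context, "compact representation" refers to writing a quasi-Newton matrix as a low-rank update of the initial matrix built from the stored step and gradient-difference pairs. The block below recalls the standard Byrd-Nocedal-Schnabel form for BFGS, shown for reference only; it is not the novel representation derived in the paper, which covers the full Broyden class.

```latex
% Standard compact representation of the BFGS approximation B_k
% (Byrd, Nocedal, Schnabel 1994), shown for reference.
\[
  B_k = B_0 -
  \begin{bmatrix} B_0 S_k & Y_k \end{bmatrix}
  \begin{bmatrix} S_k^T B_0 S_k & L_k \\ L_k^T & -D_k \end{bmatrix}^{-1}
  \begin{bmatrix} S_k^T B_0 \\ Y_k^T \end{bmatrix},
\]
where $S_k = [\,s_0 \;\cdots\; s_{k-1}\,]$, $Y_k = [\,y_0 \;\cdots\; y_{k-1}\,]$,
$(L_k)_{ij} = s_{i-1}^T y_{j-1}$ for $i > j$ (zero otherwise), and
$D_k = \mathrm{diag}(s_0^T y_0, \ldots, s_{k-1}^T y_{k-1})$.
```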

    Computation of Ground States of the Gross-Pitaevskii Functional via Riemannian Optimization

    In this paper we combine concepts from Riemannian optimization and the theory of Sobolev gradients to derive a new conjugate gradient method for direct minimization of the Gross-Pitaevskii energy functional with rotation. The conservation of the number of particles constrains the minimizers to lie on a manifold corresponding to the unit L^2 norm. The idea developed here is to transform the original constrained optimization problem into an unconstrained problem on this (spherical) Riemannian manifold, so that fast minimization algorithms can be applied as alternatives to more standard constrained formulations. First, we obtain Sobolev gradients using an equivalent definition of an H^1 inner product which takes rotation into account. Then, the Riemannian gradient (RG) steepest descent method is derived based on projected gradients and retraction of an intermediate solution back to the constraint manifold. Finally, we use the concept of Riemannian vector transport to propose a Riemannian conjugate gradient (RCG) method for this problem. It is derived at the continuous level based on the "optimize-then-discretize" paradigm instead of the usual "discretize-then-optimize" approach, as this ensures robustness of the method when adaptive mesh refinement is performed in computations. We evaluate various design choices inherent in the formulation of the method and conclude with recommendations concerning selection of the best options. Numerical tests demonstrate that the proposed RCG method outperforms the simple gradient descent (RG) method in terms of rate of convergence. While on simple problems a Newton-type method implemented in the Ipopt library exhibits faster convergence than the RCG approach, the two methods perform similarly on more complex problems requiring the use of mesh adaptation. At the same time, the RCG approach has far fewer tunable parameters.
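
    The projected-gradient-plus-retraction step described above has a small finite-dimensional analogue; the sketch below performs Riemannian steepest descent on the unit sphere in R^n with the Euclidean metric and the normalization retraction. It deliberately omits what makes the paper's setting hard (the H^1 Sobolev metric with rotation, the vector transport used by RCG, mesh adaptation); the fixed step size and the toy quadratic energy are placeholders.

```python
# Minimal sketch of Riemannian gradient descent on the unit sphere
# {x : ||x|| = 1} in R^n with the Euclidean metric (not the H^1 Sobolev
# metric of the paper):
#   tangent projection  P_x(v) = v - (x^T v) x
#   retraction          R_x(v) = (x + v) / ||x + v||
import numpy as np

def riemannian_gd_sphere(energy_grad, x0, step=0.1, tol=1e-8, max_iter=1000):
    x = np.asarray(x0, dtype=float)
    x /= np.linalg.norm(x)
    for _ in range(max_iter):
        g = energy_grad(x)
        rg = g - x.dot(g) * x          # Riemannian gradient: project onto T_x S
        if np.linalg.norm(rg) < tol:
            break
        v = x - step * rg              # move along the tangent direction
        x = v / np.linalg.norm(v)      # retract back onto the sphere
    return x

# Toy energy E(x) = 0.5 x^T A x: its minimizer over the sphere is the
# eigenvector of A associated with the smallest eigenvalue.
A = np.diag([3.0, 2.0, 0.5])
x = riemannian_gd_sphere(lambda x: A @ x, np.ones(3))
print(x)   # approximately +/- the third standard basis vector
```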

    Second order adjoints for solving PDE-constrained optimization problems

    Inverse problems are of utmost importance in many fields of science and engineering. In the variational approach, inverse problems are formulated as PDE-constrained optimization problems, where the optimal estimate of the uncertain parameters is the minimizer of a certain cost functional subject to the constraints posed by the model equations. The numerical solution of such optimization problems requires the computation of derivatives of the model output with respect to model parameters. The first order derivatives of a cost functional (defined on the model output) with respect to a large number of model parameters can be calculated efficiently through first order adjoint sensitivity analysis. Second order adjoint models give second derivative information in the form of matrix-vector products between the Hessian of the cost functional and user-defined vectors. Traditionally, the construction of second order derivatives for large scale models has been considered too costly. Consequently, data assimilation applications employ optimization algorithms that use only first order derivative information, like nonlinear conjugate gradients and quasi-Newton methods. In this paper we discuss the mathematical foundations of second order adjoint sensitivity analysis and show that it provides an efficient approach to obtain Hessian-vector products. We study the benefits of using second order information in the numerical optimization process for data assimilation applications. The numerical studies are performed in a twin experiment setting with a two-dimensional shallow water model. Different scenarios are considered with different discretization approaches, observation sets, and noise levels. Optimization algorithms that employ second order derivatives are tested against widely used methods that require only first order derivatives. Conclusions are drawn regarding the potential benefits and the limitations of using high-order information in large scale data assimilation problems.
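
    The key primitive discussed in this abstract is the Hessian-vector product. A second order adjoint model delivers it exactly at roughly the cost of a few gradient evaluations; a common lower-fidelity surrogate, shown below purely for orientation, approximates it by finite-differencing the first order gradient. The toy cost function, the step size eps, and the function name are assumptions for the example; this is not the paper's shallow-water setup.

```python
# Hessian-vector product H(x) v approximated by central differences of the
# gradient; a second order adjoint model would return the same product
# exactly, without the truncation/round-off trade-off in eps.
import numpy as np

def hessian_vector_product(grad, x, v, eps=1e-6):
    """Approximate H(x) v by (grad(x + eps*v) - grad(x - eps*v)) / (2*eps)."""
    return (grad(x + eps * v) - grad(x - eps * v)) / (2.0 * eps)

# Toy cost J(x) = 0.25 * sum(x^4), so grad(x) = x^3 and H(x) = diag(3 x^2).
grad = lambda x: x ** 3
x = np.array([1.0, 2.0, 3.0])
v = np.array([1.0, 0.0, -1.0])
print(hessian_vector_product(grad, x, v))   # close to [3., 0., -27.]
```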