    Combining search directions using gradient flows

    The original publication is available at www.springerlink.com. The efficient combination of directions is a significant problem in line search methods that either use negative curvature, or wish to include additional information such as the gradient or different approximations to the Newton direction. In this paper we describe a new procedure to combine several of these directions within an interior-point primal-dual algorithm. Basically, we combine in an efficient manner a modified Newton direction with the gradient of a merit function and a direction of negative curvature, if it exists. We also show that the procedure is well-defined, and it has reasonable theoretical properties regarding the rate of convergence of the method. We also present numerical results from an implementation of the proposed algorithm on a set of small test problems from the CUTE collection. Research supported by Spanish MEC grants BEC2000-0167 and PB98-0728.
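
    As a rough sketch of the kind of combined step described above, the snippet below forms a single search direction from a (modified) Newton direction, the gradient of a merit function, and, when one is available, a direction of negative curvature. The function name, the fixed weights, and the sign safeguard are illustrative assumptions; this is not the paper's gradient-flow procedure.

        import numpy as np

        def combined_direction(newton_dir, merit_grad, neg_curv_dir=None,
                               weights=(1.0, 0.1, 1.0)):
            """Illustrative combination of search directions (not the paper's
            gradient-flow procedure): a weighted sum of a (modified) Newton
            direction, the negative gradient of a merit function and, if
            supplied, a direction of negative curvature oriented downhill."""
            w_newton, w_grad, w_curv = weights
            d = w_newton * newton_dir - w_grad * merit_grad
            if neg_curv_dir is not None:
                # Flip the negative-curvature direction if needed so that it
                # is a (first-order) descent direction for the merit function.
                if np.dot(neg_curv_dir, merit_grad) > 0:
                    neg_curv_dir = -neg_curv_dir
                d = d + w_curv * neg_curv_dir
            return d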

    A symmetric rank-one Quasi-Newton line-search method using negative curvature directions

    We propose a quasi-Newton line-search method that uses negative curvature directions for solving unconstrained optimization problems. In this method, the symmetric rank-one (SR1) rule is used to update the Hessian approximation. The SR1 update rule is known to have good numerical performance; however, it does not guarantee positive definiteness of the updated matrix. We first discuss the details of the proposed algorithm and then concentrate on its numerical efficiency. Our extensive computational study shows the potential of the proposed method from different angles, such as its second-order convergence behavior, its superior performance compared to two other existing packages, and its computation profile illustrating the possible bottlenecks in the execution time. We then conclude the paper with the convergence analysis of the proposed method.
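
    For reference, the SR1 rule mentioned above is the standard rank-one correction sketched below. The skip rule guarding against a tiny denominator is a common safeguard and an assumption here, not necessarily the safeguard used in the paper.

        import numpy as np

        def sr1_update(B, s, y, r=1e-8):
            """Symmetric rank-one (SR1) update of a Hessian approximation B,
            with s = x_{k+1} - x_k and y = grad_{k+1} - grad_k.  The update is
            skipped when the denominator is too small (a standard safeguard).
            Note that the updated matrix need not be positive definite, which
            is why the method also exploits negative-curvature directions."""
            v = y - B @ s
            denom = v @ s
            if abs(denom) < r * np.linalg.norm(s) * np.linalg.norm(v):
                return B  # skip the update to preserve numerical stability
            return B + np.outer(v, v) / denom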

    Identifying and attacking the saddle point problem in high-dimensional non-convex optimization

    A central challenge to many fields of science and engineering involves minimizing non-convex error functions over continuous, high-dimensional spaces. Gradient descent or quasi-Newton methods are almost ubiquitously used to perform such minimizations, and it is often thought that a main source of difficulty for these local methods to find the global minimum is the proliferation of local minima with much higher error than the global minimum. Here we argue, based on results from statistical physics, random matrix theory, neural network theory, and empirical evidence, that a deeper and more profound difficulty originates from the proliferation of saddle points, not local minima, especially in high-dimensional problems of practical interest. Such saddle points are surrounded by high-error plateaus that can dramatically slow down learning, and give the illusory impression of the existence of a local minimum. Motivated by these arguments, we propose a new approach to second-order optimization, the saddle-free Newton method, that can rapidly escape high-dimensional saddle points, unlike gradient descent and quasi-Newton methods. We apply this algorithm to deep or recurrent neural network training, and provide numerical evidence for its superior optimization performance. Comment: The theoretical review and analysis in this article draw heavily from arXiv:1405.4604 [cs.LG].
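
    The core of the saddle-free Newton idea can be sketched as rescaling the gradient by the absolute values of the Hessian eigenvalues, so that negative-curvature directions are descended rather than ascended. The dense eigendecomposition and the damping constant below are simplifying assumptions; the paper applies the idea in a low-rank subspace to make it tractable for large networks.

        import numpy as np

        def saddle_free_newton_step(grad, hessian, damping=1e-3):
            """Sketch of a saddle-free Newton step: precondition the gradient
            with |H| = V |diag(eigvals)| V^T instead of H itself, so the step
            moves downhill along negative-curvature directions too.  The dense
            eigendecomposition is only for illustration on small problems."""
            eigvals, eigvecs = np.linalg.eigh(hessian)
            abs_h = eigvecs @ np.diag(np.abs(eigvals) + damping) @ eigvecs.T
            return -np.linalg.solve(abs_h, grad)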

    The geometry of nonlinear least squares with applications to sloppy models and optimization

    Parameter estimation by nonlinear least squares minimization is a common problem with an elegant geometric interpretation: the possible parameter values of a model induce a manifold in the space of data predictions. The minimization problem is then to find the point on the manifold closest to the data. We show that the model manifolds of a large class of models, known as sloppy models, have many universal features; they are characterized by a geometric series of widths, extrinsic curvatures, and parameter-effects curvatures. A number of common difficulties in optimizing least squares problems are due to this common structure. First, algorithms tend to run into the boundaries of the model manifold, causing parameters to diverge or become unphysical. We introduce the model graph as an extension of the model manifold to remedy this problem. We argue that appropriate priors can remove the boundaries and improve convergence rates. We show that typical fits will have many evaporated parameters. Second, bare model parameters are usually ill-suited to describing model behavior; cost contours in parameter space tend to form hierarchies of plateaus and canyons. Geometrically, we understand this inconvenient parametrization as an extremely skewed coordinate basis and show that it induces a large parameter-effects curvature on the manifold. Using coordinates based on geodesic motion, these narrow canyons are transformed in many cases into a single quadratic, isotropic basin. We interpret the modified Gauss-Newton and Levenberg-Marquardt fitting algorithms as an Euler approximation to geodesic motion in these natural coordinates on the model manifold and the model graph respectively. By adding a geodesic acceleration adjustment to these algorithms, we alleviate the difficulties from parameter-effects curvature, improving both efficiency and success rates at finding good fits. Comment: 40 pages, 29 figures.
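
    A minimal sketch of a Levenberg-Marquardt step augmented with the geodesic acceleration correction mentioned above is given here; the callable names, the scalar damping term lam * I, and the finite-difference estimate of the second directional derivative of the residuals are assumptions made for illustration, not the authors' implementation.

        import numpy as np

        def lm_step_with_geodesic_acceleration(residual, jac, theta, lam=1e-3, h=0.1):
            """One illustrative Levenberg-Marquardt step with a geodesic
            (second-order) acceleration correction.  `residual(theta)` returns
            the residual vector and `jac(theta)` its Jacobian; both names are
            assumptions for this sketch."""
            r = residual(theta)
            J = jac(theta)
            A = J.T @ J + lam * np.eye(len(theta))
            # First-order (velocity) part: the usual damped Gauss-Newton step.
            v = np.linalg.solve(A, -J.T @ r)
            # Second directional derivative of the residuals along v,
            # estimated here by a central finite difference.
            rpp = (residual(theta + h * v) - 2.0 * r + residual(theta - h * v)) / h**2
            # Second-order (acceleration) correction, solved from the same
            # normal equations; the step applies half of it.
            a = np.linalg.solve(A, -J.T @ rpp)
            return theta + v + 0.5 * a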