
    Objective acceleration for unconstrained optimization

    Acceleration schemes can dramatically improve existing optimization procedures. In most of the work on these schemes, such as nonlinear Generalized Minimal Residual (N-GMRES), acceleration is based on minimizing the $\ell_2$ norm of some target on subspaces of $\mathbb{R}^n$. There are many numerical examples that show how accelerating general purpose and domain-specific optimizers with N-GMRES results in large improvements. We propose a natural modification to N-GMRES, which significantly improves the performance in a testing environment originally used to advocate N-GMRES. Our proposed approach, which we refer to as O-ACCEL (Objective Acceleration), is novel in that it minimizes an approximation to the objective function on subspaces of $\mathbb{R}^n$. We prove that O-ACCEL reduces to the Full Orthogonalization Method for linear systems when the objective is quadratic, which differentiates our proposed approach from existing acceleration methods. Comparisons with L-BFGS and N-CG indicate the competitiveness of O-ACCEL. As it can be combined with domain-specific optimizers, it may also be beneficial in areas where L-BFGS or N-CG are not suitable. Comment: 18 pages, 6 figures, 5 tables

    Vector instabilities and self-acceleration in the decoupling limit of massive gravity

    We investigate vector contributions to the Lagrangian of $\Lambda_3$-massive gravity in the decoupling limit, the less explored sector of this theory. The main purpose is to understand the stability of maximally symmetric self-accelerating vacuum solutions. Around self-accelerating configurations, vector degrees of freedom become strongly coupled since their kinetic terms vanish, so their dynamics is controlled by higher order interactions. Even in the decoupling limit, the vector Lagrangian contains an infinite number of terms. We develop a systematic method to covariantly determine the vector Lagrangian at each order in perturbations, fully manifesting the symmetries of the system. We show that, around self-accelerating solutions, the structure of higher order $p$-form Galileons arises, avoiding the emergence of a sixth BD ghost mode. However, a careful analysis shows that there are directions along which the Hamiltonian is unbounded from below. This instability can be interpreted as one of the available fifth physical modes behaving as a ghost. Therefore, we conclude that self-accelerating configurations, in the decoupling limit of $\Lambda_3$-massive gravity, are generically unstable. Comment: 16 pages, 2 figures

    Adaptive Momentum for Neural Network Optimization

    In this thesis, we develop a novel and efficient algorithm for optimizing neural networks inspired by a recently proposed geodesic optimization algorithm. Our algorithm, which we call Stochastic Geodesic Optimization (SGeO), applies an adaptive coefficient on top of Polyak's Heavy Ball method, controlling how much weight is put on the previous parameter update based on the change of direction in the optimization path. Experimental results on strongly convex functions with Lipschitz gradients and deep Autoencoder benchmarks show that SGeO reaches lower errors than established first-order methods and competes well, with lower or similar errors, against a recent second-order method called K-FAC (Kronecker-Factored Approximate Curvature). We also incorporate a Nesterov-style lookahead gradient into our algorithm (SGeO-N) and observe notable improvements. We believe that our research will open up new directions for high-dimensional neural network optimization, where combining the efficiency of first-order methods and the effectiveness of second-order methods proves a promising avenue to explore.
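
The idea of a direction-dependent momentum coefficient on top of Heavy Ball can be illustrated with a minimal sketch. The abstract does not state the SGeO coefficient, so the rule used here (momentum scaled by the cosine between the negative gradient and the previous update, clipped at zero) is a hypothetical stand-in, and the names `adaptive_heavy_ball`, `lr`, and `mu_max` are assumptions made for this example only.

```python
import math


def adaptive_heavy_ball(grad, x0, lr=0.1, mu_max=0.9, steps=200):
    """Heavy Ball iteration with a direction-dependent momentum coefficient.

    NOTE: the coefficient rule is a hypothetical illustration, not the
    SGeO formula, which the abstract does not give.
    """
    x = list(x0)
    d = [0.0] * len(x)  # previous update (zero before the first step)
    for _ in range(steps):
        g = grad(x)
        g_norm = math.sqrt(sum(gi * gi for gi in g))
        d_norm = math.sqrt(sum(di * di for di in d))
        if g_norm == 0.0:
            break  # stationary point reached
        if d_norm == 0.0:
            mu = 0.0  # no previous direction to reuse
        else:
            # Cosine between the descent direction (-g) and the previous update.
            cos = -sum(gi * di for gi, di in zip(g, d)) / (g_norm * d_norm)
            # Assumed rule: keep momentum when directions agree, drop it otherwise.
            mu = mu_max * max(0.0, cos)
        d = [mu * di - lr * gi for di, gi in zip(d, g)]
        x = [xi + di for xi, di in zip(x, d)]
    return x


# Toy strongly convex quadratic with unequal curvatures (illustrative data).
CURVATURES = [1.0, 10.0]


def quad(x):
    return 0.5 * sum(c * xi * xi for c, xi in zip(CURVATURES, x))


def quad_grad(x):
    return [c * xi for c, xi in zip(CURVATURES, x)]


x_start = [5.0, 5.0]
x_final = adaptive_heavy_ball(quad_grad, x_start)
```

Setting the coefficient to zero whenever the gradient opposes the previous update acts like a momentum restart; other rules based on the same angle are equally plausible readings of the abstract.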

    Chronological Inversion Method for the Dirac Matrix in Hybrid Monte Carlo

    In Hybrid Monte Carlo simulations for full QCD, the gauge fields evolve smoothly as a function of Molecular Dynamics time. Here we investigate improved methods of estimating the trial or starting solutions for the Dirac matrix inversion as superpositions of a chronological sequence of solutions in the recent past. By taking as the trial solution the vector which minimizes the residual in the linear space spanned by the past solutions, the number of conjugate gradient iterations per unit MD time is decreased by at least a factor of 2. Extensions of this basic approach to precondition the conjugate gradient iterations are also discussed. Comment: 35 pages, 18 EPS figures. A new "preconditioning" method, derived from the Chronological Inversion, is described. Some new figures are appended. Some reorganization of the material has taken place.
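
The residual-minimizing trial solution described above can be sketched as a small least-squares problem: given past solutions $v_1, \dots, v_k$, pick coefficients $c$ that minimize $\|b - A \sum_k c_k v_k\|_2$. The sketch below assumes a small dense real matrix and solves the normal equations directly; the function names and the toy data are assumptions for illustration, and a production code would work with the actual Dirac operator and guard against nearly dependent past solutions.

```python
def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve_small(M, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    sol = [0.0] * n
    for r in reversed(range(n)):
        s = aug[r][n] - sum(aug[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = s / aug[r][r]
    return sol


def chronological_guess(A, b, past):
    """Trial solution in span(past) that minimizes ||b - A x||_2.

    Assumes the past solutions are linearly independent and A is nonsingular,
    so the Gram matrix of the vectors A v_k is invertible.
    """
    W = [matvec(A, v) for v in past]                 # images A v_k
    gram = [[dot(wi, wj) for wj in W] for wi in W]   # normal-equation matrix
    rhs = [dot(wi, b) for wi in W]
    coeffs = solve_small(gram, rhs)
    return [sum(ck * v[i] for ck, v in zip(coeffs, past)) for i in range(len(b))]


def residual_norm(A, b, x):
    r = [bi - yi for bi, yi in zip(b, matvec(A, x))]
    return dot(r, r) ** 0.5


# Toy data: a small nonsingular system and two "past" approximate solutions.
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
past = [[0.0, 0.4, 1.2], [0.05, 0.35, 1.3]]
guess = chronological_guess(A, b, past)
```

Because each past solution lies in the search space, the trial vector's residual can be no larger than that of the most recent solution, which is why it makes a better starting point for the conjugate gradient iteration.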