Objective acceleration for unconstrained optimization
Acceleration schemes can dramatically improve existing optimization
procedures. In most of the work on these schemes, such as nonlinear Generalized
Minimal Residual (N-GMRES), acceleration is based on minimizing the
norm of some target on subspaces of $\mathbb{R}^n$. There are many numerical
examples that show how accelerating general purpose and domain-specific
optimizers with N-GMRES results in large improvements. We propose a natural
modification to N-GMRES, which significantly improves the performance in a
testing environment originally used to advocate N-GMRES. Our proposed approach,
which we refer to as O-ACCEL (Objective Acceleration), is novel in that it
minimizes an approximation to the \emph{objective function} on subspaces of
$\mathbb{R}^n$. We prove that O-ACCEL reduces to the Full Orthogonalization
Method for linear systems when the objective is quadratic, which differentiates
our proposed approach from existing acceleration methods. Comparisons with
L-BFGS and N-CG indicate the competitiveness of O-ACCEL. As it can be combined
with domain-specific optimizers, it may also be beneficial in areas where
L-BFGS or N-CG are not suitable.
Comment: 18 pages, 6 figures, 5 tables
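To illustrate the norm-minimizing acceleration the abstract contrasts itself with, here is a minimal Anderson/N-GMRES-flavored sketch: combine recent iterates so that the (linearized) residual norm is minimized via a least-squares problem over residual differences. This is our own generic illustration, not the paper's O-ACCEL scheme, and all function names are ours.

```python
import numpy as np

def norm_accelerate(xs, rs):
    """Anderson/N-GMRES-style step: pick the affine combination of the
    stored iterates whose (linearized) residual has minimal norm."""
    x_last, r_last = xs[-1], rs[-1]
    dX = np.column_stack([x_last - x for x in xs[:-1]])
    dR = np.column_stack([r_last - r for r in rs[:-1]])
    # Least squares handles rank-deficient histories gracefully.
    gamma, *_ = np.linalg.lstsq(dR, r_last, rcond=None)
    return x_last - dX @ gamma

# Toy strongly convex quadratic f(x) = 0.5 x^T Q x; residual = gradient.
Q = np.diag([1.0, 10.0])
grad = lambda x: Q @ x
x = np.array([1.0, 1.0])
xs, rs = [], []
for _ in range(5):
    xs.append(x.copy())
    rs.append(grad(x))
    x = x - 0.05 * grad(x)   # plain gradient steps build the history
x_acc = norm_accelerate(xs, rs)
print(np.linalg.norm(grad(x_acc)))  # far smaller than for the plain iterate
```

Because the residual is linear in $x$ for a quadratic objective, the accelerated point here solves the system essentially exactly once the stored residual differences span the space, which is the same mechanism that makes norm-based acceleration coincide with Krylov methods in the linear case.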
Vector instabilities and self-acceleration in the decoupling limit of massive gravity
We investigate vector contributions to the Lagrangian of massive
gravity in the decoupling limit, the less explored sector of this theory. The
main purpose is to understand the stability of maximally symmetric
self-accelerating vacuum solutions. Around self-accelerating configurations,
vector degrees of freedom become strongly coupled since their kinetic terms
vanish, so their dynamics is controlled by higher order interactions. Even in
the decoupling limit, the vector Lagrangian contains an infinite number of
terms. We develop a systematic method to covariantly determine the vector
Lagrangian at each order in perturbations, fully manifesting the symmetries of
the system. We show that, around self-accelerating solutions, the structure of
higher-order $p$-form Galileons arises, avoiding the emergence of a sixth BD
ghost mode. However, a careful analysis shows that there are directions along
which the Hamiltonian is unbounded from below. This instability can be
interpreted as one of the available fifth physical modes behaving as a ghost.
Therefore, we conclude that self-accelerating configurations, in the decoupling
limit of massive gravity, are generically unstable.
Comment: 16 pages, 2 figures
Adaptive Momentum for Neural Network Optimization
In this thesis, we develop a novel and efficient algorithm for optimizing neural networks, inspired by a recently proposed geodesic optimization algorithm. Our algorithm, which we call Stochastic Geodesic Optimization (SGeO), adds an adaptive coefficient on top of Polyak's Heavy Ball method, effectively controlling the weight placed on the previous parameter update according to the change of direction in the optimization path. Experimental results on strongly convex functions with Lipschitz gradients and on deep autoencoder benchmarks show that SGeO reaches lower errors than established first-order methods and competes well, with lower or similar errors, with a recent second-order method called K-FAC (Kronecker-Factored Approximate Curvature). We also incorporate a Nesterov-style lookahead gradient into our algorithm (SGeO-N) and observe notable improvements. We believe that our research will open up new directions for high-dimensional neural network optimization, where combining the efficiency of first-order methods and the effectiveness of second-order methods is a promising avenue to explore.
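The idea of scaling the momentum coefficient by the change of direction can be sketched as follows. This is a hypothetical adaptation rule in the spirit of the abstract (shrinking momentum when the new gradient disagrees in direction with the previous update), not the SGeO algorithm itself; the cosine-based rule and all names are our own assumptions.

```python
import numpy as np

def adaptive_heavy_ball(grad, x0, lr=0.04, beta_max=0.9, steps=200):
    """Heavy-ball iteration whose momentum coefficient shrinks when the
    descent direction -g disagrees with the previous update v
    (illustrative rule, not the thesis's exact coefficient)."""
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    for _ in range(steps):
        g = grad(x)
        nv = np.linalg.norm(v)
        if nv > 0:
            # cosine between the descent direction and the last update
            cos = -g @ v / (np.linalg.norm(g) * nv)
            beta = beta_max * max(cos, 0.0)   # no momentum on reversals
        else:
            beta = 0.0
        v = beta * v - lr * g                 # Polyak heavy-ball update
        x = x + v
    return x

# Strongly convex quadratic test problem; the minimizer is the origin.
Q = np.diag([1.0, 25.0])
x_star = adaptive_heavy_ball(lambda x: Q @ x, [1.0, 1.0])
print(np.linalg.norm(x_star))  # close to zero
```

The design choice mirrors the abstract's description: when consecutive update directions agree, the coefficient approaches its maximum and the method behaves like classical heavy ball; when the path turns sharply, momentum is damped toward plain gradient descent.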
Chronological Inversion Method for the Dirac Matrix in Hybrid Monte Carlo
In Hybrid Monte Carlo simulations for full QCD, the gauge fields evolve
smoothly as a function of Molecular Dynamics time. Here we investigate improved
methods of estimating the trial or starting solutions for the Dirac matrix
inversion as superpositions of a chronological sequence of solutions in the
recent past. By taking as the trial solution the vector which minimizes the
residual in the linear space spanned by the past solutions, the number of
conjugate gradient iterations per unit MD time is decreased by at least a
factor of 2. Extensions of this basic approach to precondition the conjugate
gradient iterations are also discussed.
Comment: 35 pages, 18 EPS figures. A new "preconditioning" method, derived from
the Chronological Inversion, is described. Some new figures are appended. Some
reorganization of the material has taken place.
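The basic trial-vector construction can be sketched as follows: given solutions of the same matrix with slowly drifting right-hand sides, take as the starting guess the combination of past solutions that minimizes the residual norm over their span. This is our own illustration with a small dense SPD system standing in for the Dirac matrix; names and problem sizes are assumptions.

```python
import numpy as np

def chronological_guess(A, b, past_solutions):
    """Trial vector for A x = b: the linear combination of past
    solutions that minimizes ||A x - b|| over their span."""
    S = np.column_stack(past_solutions)
    c, *_ = np.linalg.lstsq(A @ S, b, rcond=None)
    return S @ c

# SPD stand-in for the Dirac matrix; the right-hand side drifts slowly,
# mimicking the smooth Molecular Dynamics evolution of the gauge fields.
rng = np.random.default_rng(0)
M = rng.standard_normal((20, 20))
A = M @ M.T + 20 * np.eye(20)
b0 = rng.standard_normal(20)
past = [np.linalg.solve(A, b0 + 0.01 * t * np.ones(20)) for t in range(4)]
b_new = b0 + 0.04 * np.ones(20)

x0 = chronological_guess(A, b_new, past)
res_guess = np.linalg.norm(b_new - A @ x0)
res_zero = np.linalg.norm(b_new)       # residual of a cold (zero) start
print(res_guess, res_zero)
```

A starting residual that is already small in this way directly reduces the number of conjugate gradient iterations needed to reach a fixed tolerance, which is the effect the abstract quantifies.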