Series of Hessian-Vector Products for Tractable Saddle-Free Newton Optimisation of Neural Networks
Despite their popularity in the field of continuous optimisation,
second-order quasi-Newton methods are challenging to apply in machine learning,
as the Hessian matrix is intractably large. This computational burden is
exacerbated by the need to address non-convexity, for instance by modifying the
Hessian's eigenvalues as in Saddle-Free Newton methods. We propose an
optimisation algorithm which addresses both of these concerns - to our
knowledge, the first efficiently-scalable optimisation algorithm to
asymptotically use the exact (eigenvalue-modified) inverse Hessian. Our method
frames the problem as a series which principally square-roots and inverts the
squared Hessian, then uses it to precondition a gradient vector, all without
explicitly computing or eigendecomposing the Hessian. A truncation of this
infinite series provides a new optimisation algorithm which is scalable and
comparable to other first- and second-order optimisation methods in both
runtime and optimisation performance. We demonstrate this in a variety of
settings, including a ResNet-18 trained on CIFAR-10.
Comment: 36 pages, 10 figures, 5 tables. Submitted to TMLR. First two authors' order randomised.
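The series idea can be sketched in a toy setting. The sketch below is illustrative only: the explicit 2x2 Hessian, the scaling constant c, and the truncation length are assumptions (the actual method obtains Hessian-vector products from automatic differentiation and never forms H). It approximates |H|^{-1} g, the eigenvalue-modified inverse Hessian applied to a gradient, via a truncated binomial series for (H^2/c)^{-1/2} built from Hessian-vector products alone:

```python
import numpy as np

# Hypothetical 2-D non-convex quadratic: H has a negative eigenvalue, so a
# plain Newton step is attracted to the saddle, while the eigenvalue-
# modified inverse |H|^{-1} repels from it.
H = np.diag([2.0, -1.0])
g = np.array([1.0, 1.0])

def hvp(v):
    # Stand-in for a Hessian-vector product from double back-propagation
    return H @ v

def abs_inv_hvp(g, c=5.0, n_terms=200):
    """Approximate |H|^{-1} g with a truncated binomial series for
    (H^2/c)^{-1/2}, using Hessian-vector products only.
    Needs c > lambda_max(H^2) for convergence; c and n_terms are
    illustrative choices, not the paper's."""
    t = g.copy()              # current term (M - I)^k g, with M = H^2 / c
    coeff = 1.0               # binomial coefficient C(-1/2, k)
    acc = coeff * t
    for k in range(1, n_terms):
        t = hvp(hvp(t)) / c - t         # multiply the term by (M - I)
        coeff *= (-0.5 - (k - 1)) / k   # recurrence for C(-1/2, k)
        acc += coeff * t
    return acc / np.sqrt(c)             # |H|^{-1} = c^{-1/2} (H^2/c)^{-1/2}

step = abs_inv_hvp(g)   # exact answer: |H|^{-1} g = [0.5, 1.0]
```

Each loop iteration costs only two Hessian-vector products, which is what makes a truncation of the series tractable at neural-network scale.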
Improved monotone polynomial fitting with applications and variable selection
We investigate existing and new isotonic parameterisations for monotone polynomials, the latter previously unconsidered in the statistical literature. We show that the new parameterisation is faster and more flexible than its alternatives, enabling polynomials to be constrained to be monotone over a compact interval, over a semi-compact interval of the form [a, ∞), or over the whole real line. The speed and efficiency of algorithms based on our new parameterisation make the use of standard bootstrap methodology feasible. We investigate the use of the bootstrap under monotonicity constraints to obtain confidence and prediction bands for the fitted curves, and show that an adjustment, using either the 'm out of n' bootstrap or a post hoc symmetrisation of the confidence bands, is necessary to achieve more uniform coverage probabilities. The same adjustments appear unwarranted for prediction bands.

Furthermore, we examine the model selection problem, not only for monotone polynomials but also in a general sense, with a focus on graphical methods. Specifically, we describe how to visualise measures of description loss and of model complexity to facilitate model selection. We advocate the use of the bootstrap to assess the stability of selected models and to enhance our graphical tools, and we demonstrate which variables are important using variable inclusion plots, showing that these can be invaluable for the model-building process. We also describe methods for using the 'm out of n' bootstrap to select the degree of the fitted monotone polynomial and demonstrate its effectiveness in this constrained regression scenario.

We demonstrate the effectiveness of all of these methods in numerous case studies, which highlight their necessity and usefulness. All algorithms discussed in this thesis are available in the R package MonoPoly (version 0.3-6 or later).
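The central idea, parameterising a polynomial so that monotonicity holds by construction, can be sketched as follows. This toy example is in Python rather than the thesis's R, and the data, the squared-derivative parameterisation of a cubic, and the band levels are illustrative assumptions rather than the thesis's exact algorithms. It fits a cubic that is monotone on the whole real line and builds pointwise 'm out of n' bootstrap bands:

```python
import numpy as np
from scipy.optimize import least_squares

# Toy data from a monotone trend (illustrative; not from the thesis)
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 60)
y = x**3 + x + rng.normal(scale=0.2, size=x.size)

def mono_poly(theta, xs):
    # Cubic monotone on the whole real line by construction:
    # p'(x) = (a + b*x)^2 >= 0, hence
    # p(x) = c + a^2*x + a*b*x^2 + (b^2/3)*x^3
    a, b, c = theta
    return c + a**2 * xs + a * b * xs**2 + (b**2 / 3.0) * xs**3

def fit(xs, ys):
    # Unconstrained least squares in (a, b, c); monotonicity is automatic
    return least_squares(lambda th: mono_poly(th, xs) - ys,
                         x0=[1.0, 1.0, 0.0]).x

fitted = mono_poly(fit(x, y), x)

# Pointwise 'm out of n' bootstrap band (m < n), a rough sketch of the
# resampling adjustment discussed in the abstract
B, m = 200, 40
boot = np.empty((B, x.size))
for b in range(B):
    idx = rng.choice(x.size, size=m, replace=True)
    boot[b] = mono_poly(fit(x[idx], y[idx]), x)
lo, hi = np.percentile(boot, [2.5, 97.5], axis=0)
```

Because monotonicity is encoded in the parameterisation rather than imposed as a constraint, each bootstrap refit is an ordinary unconstrained optimisation, which is what makes resampling cheap enough to be practical.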
Imfit: A Fast, Flexible New Program for Astronomical Image Fitting
I describe a new, open-source astronomical image-fitting program called
Imfit, specialized for galaxies but potentially useful for other sources, which
is fast, flexible, and highly extensible. A key characteristic of the program
is an object-oriented design which allows new types of image components (2D
surface-brightness functions) to be easily written and added to the program.
Image functions provided with Imfit include the usual suspects for galaxy
decompositions (Sersic, exponential, Gaussian), along with Core-Sersic and
broken-exponential profiles, elliptical rings, and three components which
perform line-of-sight integration through 3D luminosity-density models of disks
and rings seen at arbitrary inclinations.
Available minimization algorithms include Levenberg-Marquardt, Nelder-Mead
simplex, and Differential Evolution, allowing trade-offs between speed and
decreased sensitivity to local minima in the fit landscape. Minimization can be
done using the standard chi^2 statistic (using either data or model values to
estimate per-pixel Gaussian errors, or else user-supplied error images) or
Poisson-based maximum-likelihood statistics; the latter approach is
particularly appropriate for cases of Poisson data in the low-count regime. I
show that fitting low-S/N galaxy images using chi^2 minimization and
individual-pixel Gaussian uncertainties can lead to significant biases in
fitted parameter values, which are avoided if a Poisson-based statistic is
used; this is true even when Gaussian read noise is present.
Comment: pdflatex, 27 pages, 19 figures. Revised version, accepted by ApJ.
Programs, source code, and documentation available at:
http://www.mpe.mpg.de/~erwin/code/imfit
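The bias described in the abstract is easy to reproduce. The sketch below is not Imfit's code; it uses a hypothetical flat "background" image with an illustrative true rate of 5 counts per pixel. Weighted least squares with data-based per-pixel Gaussian errors comes out biased low, while the Poisson maximum-likelihood estimate (which for a constant model is just the sample mean) does not:

```python
import numpy as np

rng = np.random.default_rng(1)
mu_true = 5.0                                 # illustrative low-count rate
data = rng.poisson(mu_true, size=100_000)     # "pixels" of a flat image

# chi^2 with data-based per-pixel Gaussian errors, sigma_i^2 = max(d_i, 1):
# the weighted-least-squares estimate of a constant model
w = 1.0 / np.maximum(data, 1)
mu_chi2 = np.sum(w * data) / np.sum(w)

# Poisson maximum likelihood (Cash statistic): for a constant model the
# minimiser is simply the sample mean
mu_poisson = data.mean()
# mu_chi2 lands well below mu_true; mu_poisson is consistent with it
```

The mechanism is that downward-fluctuating pixels get smaller assumed variances and therefore larger weights, dragging the data-weighted chi^2 estimate below the true rate.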
The application of neural networks in active suspension
This thesis considers the application of neural networks to automotive suspension
systems, in particular their ability to learn non-linear feedback control
relationships. Their speed of processing, once trained, means that neural networks
open up new opportunities and allow increased complexity in the control
strategies employed.
The suitability of neural networks for this task is demonstrated here using
multilayer perceptron (MLP) feed-forward neural networks applied to a quarter-vehicle
simulation model. Initially, neural networks are trained on a data set
created using a non-linear optimal control strategy whose complexity
prohibits its direct use. They are shown to be successful in learning the
relationship between the current system states and the optimal control. [Continues.]
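The supervised set-up described above, training a network on (state, optimal-control) pairs produced offline by an expensive controller, can be sketched as follows. Everything here is an illustrative stand-in: a 2-D slice of the state (suspension deflection and body velocity), a made-up non-linear teacher control law, and a tiny hand-rolled MLP, not the thesis's quarter-vehicle model or optimal controller:

```python
import numpy as np

rng = np.random.default_rng(2)

# Training pairs: states sampled over the operating range, targets from a
# hypothetical non-linear "optimal" control law standing in for the
# expensive offline optimiser
S = rng.uniform(-1, 1, size=(500, 2))
u = (-2.0 * S[:, 0] - 1.0 * S[:, 1] + 0.5 * S[:, 0] ** 3)[:, None]

# One-hidden-layer MLP trained by full-batch gradient descent on MSE
n_h = 16
W1 = rng.normal(scale=0.5, size=(2, n_h)); b1 = np.zeros(n_h)
W2 = rng.normal(scale=0.5, size=(n_h, 1)); b2 = np.zeros(1)
lr = 0.05

def forward(S):
    h = np.tanh(S @ W1 + b1)
    return h, h @ W2 + b2

_, pred0 = forward(S)
loss0 = np.mean((pred0 - u) ** 2)   # loss before training

for _ in range(3000):
    h, pred = forward(S)
    err = pred - u                          # dLoss/dpred (up to a constant)
    gW2 = h.T @ err / len(S); gb2 = err.mean(0)
    dh = (err @ W2.T) * (1 - h ** 2)        # back-propagate through tanh
    gW1 = S.T @ dh / len(S); gb1 = dh.mean(0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

_, pred = forward(S)
loss = np.mean((pred - u) ** 2)   # trained network mimics the teacher law
```

Once trained, evaluating the network is a handful of matrix products, which is the speed advantage the thesis relies on: the expensive optimal-control computation is paid once, offline, and the cheap learned mapping runs in the loop.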