
    Series of Hessian-Vector Products for Tractable Saddle-Free Newton Optimisation of Neural Networks

    Despite their popularity in the field of continuous optimisation, second-order quasi-Newton methods are challenging to apply in machine learning, as the Hessian matrix is intractably large. This computational burden is exacerbated by the need to address non-convexity, for instance by modifying the Hessian's eigenvalues as in Saddle-Free Newton methods. We propose an optimisation algorithm which addresses both of these concerns - to our knowledge, the first efficiently-scalable optimisation algorithm to asymptotically use the exact (eigenvalue-modified) inverse Hessian. Our method frames the problem as a series which principally square-roots and inverts the squared Hessian, then uses it to precondition a gradient vector, all without explicitly computing or eigendecomposing the Hessian. A truncation of this infinite series yields a new optimisation algorithm which is scalable and comparable to other first- and second-order optimisation methods in both runtime and optimisation performance. We demonstrate this in a variety of settings, including a ResNet-18 trained on CIFAR-10.
    Comment: 36 pages, 10 figures, 5 tables. Submitted to TMLR. First two authors' order randomised.
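    The core idea - applying the eigenvalue-modified inverse Hessian |H|^{-1} = (H^2)^{-1/2} using only Hessian-vector products - can be illustrated with a truncated binomial series. The sketch below is not the paper's algorithm; it is a minimal NumPy illustration, assuming a scaling constant larger than the largest squared eigenvalue so the series converges, and a toy diagonal Hessian in place of a network's HVP oracle.

    ```python
    import numpy as np

    def abs_inv_hvp(hvp, g, scale, n_terms=100):
        # Approximate |H|^{-1} g = (H^2)^{-1/2} g via the binomial series
        # (1 - x)^{-1/2} = sum_k C(2k, k) (x/4)^k, applied with x = I - H^2/scale.
        # Requires scale > lambda_max(H^2) and a non-singular H for convergence.
        term = g.copy()            # (I - H^2/scale)^k g, starting at k = 0
        total = g.copy()
        coeff = 1.0
        for k in range(n_terms):
            term = term - hvp(hvp(term)) / scale   # one series step: two HVPs
            coeff *= (2 * k + 1) / (2 * (k + 1))   # ratio of successive coefficients
            total = total + coeff * term
        return total / np.sqrt(scale)

    # toy indefinite "Hessian" H = diag(2, -1); its HVP is a plain matrix-vector product
    H = np.diag([2.0, -1.0])
    g = np.array([1.0, 1.0])
    precond = abs_inv_hvp(lambda v: H @ v, g, scale=1.1 * 4.0)
    # the exact saddle-free direction here is |H|^{-1} g = [0.5, 1.0]
    ```

    Note how the negative eigenvalue is handled: squaring the Hessian makes it positive, and the square-root-and-invert series recovers the magnitude, which is precisely the saddle-free modification.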

    Improved monotone polynomial fitting with applications and variable selection

    We investigate existing and new isotonic parameterisations for monotone polynomials, the latter previously unconsidered in the statistical literature. We show that this new parameterisation is faster and more flexible than its alternatives, enabling polynomials to be constrained to be monotone over either a compact interval or a semi-compact interval of the form [a, ∞), in addition to over the whole real line. Due to the speed and efficiency of algorithms based on our new parameterisation, the use of standard bootstrap methodology becomes feasible. We investigate the use of the bootstrap under monotonicity constraints to obtain confidence and prediction bands for the fitted curves, and show that an adjustment using either the ‘m out of n’ bootstrap or a post hoc symmetrisation of the confidence bands is necessary to achieve more uniform coverage probabilities. However, the same adjustments appear unwarranted for prediction bands. Furthermore, we examine the model selection problem, not only for monotone polynomials but also in a general sense, with a focus on graphical methods. Specifically, we describe how to visualise measures of description loss and of model complexity to facilitate model selection. We advocate the use of the bootstrap to assess the stability of selected models and to enhance our graphical tools, and demonstrate which variables are important using variable inclusion plots, showing that these can be invaluable in the model-building process. We also describe methods for using the ‘m out of n’ bootstrap to select the degree of the fitted monotone polynomial and demonstrate its effectiveness in this specific constrained regression scenario. We demonstrate the effectiveness of all of these methods using numerous case studies, which highlight the necessity and usefulness of our techniques. All algorithms discussed in this thesis are available in the R package MonoPoly (version 0.3-6 or later).
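    The ‘m out of n’ bootstrap mentioned above differs from the standard bootstrap only in that each replicate resamples m < n observations. A minimal sketch of percentile confidence bands built this way follows; it uses an unconstrained NumPy polynomial fit as a stand-in for the thesis's monotone fit (the real work uses the R package MonoPoly), and all data and parameter choices here are illustrative.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # synthetic monotone data (a stand-in for a real case study)
    x = np.linspace(0.0, 1.0, 60)
    y = 2.0 * x + x**3 + rng.normal(scale=0.1, size=x.size)

    def m_out_of_n_band(x, y, m, n_boot=500, degree=3, n_grid=100):
        # Percentile confidence band from an 'm out of n' bootstrap: each
        # replicate resamples only m < n observations with replacement.
        # An unconstrained polynomial fit stands in for the monotone fit.
        grid = np.linspace(x.min(), x.max(), n_grid)
        fits = np.empty((n_boot, n_grid))
        n = x.size
        for b in range(n_boot):
            idx = rng.choice(n, size=m, replace=True)
            coef = np.polyfit(x[idx], y[idx], degree)
            fits[b] = np.polyval(coef, grid)
        lo, hi = np.percentile(fits, [2.5, 97.5], axis=0)
        return grid, lo, hi

    grid, lo, hi = m_out_of_n_band(x, y, m=40)   # m = 40 out of n = 60
    ```

    Choosing m < n widens the resampling variability in a way that, per the abstract, helps restore more uniform coverage when the fit is constrained.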

    Imfit: A Fast, Flexible New Program for Astronomical Image Fitting

    I describe a new, open-source astronomical image-fitting program called Imfit, specialized for galaxies but potentially useful for other sources, which is fast, flexible, and highly extensible. A key characteristic of the program is an object-oriented design which allows new types of image components (2D surface-brightness functions) to be easily written and added to the program. Image functions provided with Imfit include the usual suspects for galaxy decompositions (Sersic, exponential, Gaussian), along with Core-Sersic and broken-exponential profiles, elliptical rings, and three components which perform line-of-sight integration through 3D luminosity-density models of disks and rings seen at arbitrary inclinations. Available minimization algorithms include Levenberg-Marquardt, Nelder-Mead simplex, and Differential Evolution, allowing trade-offs between speed and decreased sensitivity to local minima in the fit landscape. Minimization can be done using the standard chi^2 statistic (using either data or model values to estimate per-pixel Gaussian errors, or else user-supplied error images) or Poisson-based maximum-likelihood statistics; the latter approach is particularly appropriate for cases of Poisson data in the low-count regime. I show that fitting low-S/N galaxy images using chi^2 minimization and individual-pixel Gaussian uncertainties can lead to significant biases in fitted parameter values, which are avoided if a Poisson-based statistic is used; this is true even when Gaussian read noise is present.
    Comment: pdflatex, 27 pages, 19 figures. Revised version, accepted by ApJ. Programs, source code, and documentation available at: http://www.mpe.mpg.de/~erwin/code/imfit
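    The bias from chi^2 fitting with data-estimated errors can be seen in the simplest possible case: fitting a constant level to low-count Poisson data. The sketch below is not Imfit itself, just a NumPy illustration of the effect; the variance estimate sigma_i^2 = max(d_i, 1) is a common data-based convention and an assumption here, not necessarily Imfit's.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    true_rate = 2.0                              # low-count Poisson regime
    data = rng.poisson(true_rate, size=10_000).astype(float)

    # chi^2 fit of a constant model with per-pixel Gaussian errors estimated
    # from the data (sigma_i^2 = max(d_i, 1)); its minimiser is a weighted
    # mean that up-weights the low-count pixels, biasing the fit downward
    w = 1.0 / np.maximum(data, 1.0)
    chi2_fit = np.sum(w * data) / np.sum(w)

    # the Poisson maximum-likelihood fit of a constant model is the plain mean
    poisson_fit = data.mean()
    ```

    Pixels that fluctuate low receive small error estimates and hence large weights, dragging the chi^2 solution below the true rate, while the Poisson likelihood estimate remains unbiased. This is the same mechanism behind the parameter biases reported in the abstract.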

    The application of neural networks in active suspension

    This thesis considers the application of neural networks to automotive suspension systems, in particular their ability to learn non-linear feedback control relationships. Their speed of processing, once trained, means that neural networks open up new opportunities and allow increased complexity in the control strategies employed. The suitability of neural networks for this task is demonstrated here using multilayer perceptron (MLP) feed-forward neural networks applied to a quarter-vehicle simulation model. Initially, neural networks are trained from a training data set created using a non-linear optimal control strategy, the complexity of which prohibits its direct use. They are shown to be successful in learning the relationship between the current system states and the optimal control. [Continues.]
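    The training setup described - fitting an MLP to (state, control) pairs generated offline by an expensive controller - is plain supervised regression. The sketch below is an illustrative stand-in, not the thesis's model: the four states and the tanh "target controller" are hypothetical, chosen only to show a small feed-forward MLP learning a non-linear state-feedback mapping.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # hypothetical quarter-vehicle states: [body displacement, body velocity,
    # wheel displacement, wheel velocity]; the target is a stand-in non-linear
    # feedback law, not the thesis's optimal controller
    X = rng.normal(size=(2000, 4))
    u = np.tanh(X @ np.array([-1.5, -2.0, 0.5, 0.3]))[:, None]

    # one-hidden-layer MLP trained by full-batch gradient descent on squared error
    W1 = rng.normal(scale=0.5, size=(4, 16)); b1 = np.zeros(16)
    W2 = rng.normal(scale=0.5, size=(16, 1)); b2 = np.zeros(1)
    lr = 0.05
    for _ in range(2000):
        h = np.tanh(X @ W1 + b1)                 # hidden activations
        err = h @ W2 + b2 - u                    # prediction error
        dh = (err @ W2.T) * (1.0 - h**2)         # backprop through tanh
        W2 -= lr * h.T @ err / len(X); b2 -= lr * err.mean(0)
        W1 -= lr * X.T @ dh / len(X); b1 -= lr * dh.mean(0)

    mse = float(np.mean((np.tanh(X @ W1 + b1) @ W2 + b2 - u) ** 2))
    ```

    Once the weights are fixed, evaluating the control is two small matrix products, which is the cheap inference the abstract relies on for real-time use.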