
    Analysis of Natural Gradient Descent for Multilayer Neural Networks

    Natural gradient descent is a principled method for adapting the parameters of a statistical model on-line, using an underlying Riemannian parameter space to redefine the direction of steepest descent. The algorithm is examined via methods of statistical physics which accurately characterize both transient and asymptotic behavior. A solution of the learning dynamics is obtained for the case of multilayer neural network training in the limit of large input dimension. We find that natural gradient learning leads to optimal asymptotic performance and outperforms gradient descent in the transient, significantly shortening or even removing the plateaus in transient generalization performance which typically hamper gradient descent training. (14 pages including figures. To appear in Physical Review.)
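
    The update rule analysed above can be illustrated with a minimal numerical sketch. The toy quadratic loss, the damping constant, and the choice of taking the Fisher matrix equal to the curvature are illustrative assumptions, not the paper's formalism:

```python
import numpy as np

def natural_gradient_step(w, grad, fisher, lr=0.1, damping=1e-4):
    # Precondition the ordinary gradient with the inverse Fisher matrix,
    # i.e. follow the direction of steepest descent in the Riemannian
    # parameter space rather than in the naive Euclidean one.
    F = fisher + damping * np.eye(fisher.shape[0])  # damping keeps F invertible
    return w - lr * np.linalg.solve(F, grad)

# Toy quadratic loss 0.5 * w^T A w with badly conditioned curvature A;
# for this toy model we take the Fisher matrix equal to A, so the
# preconditioning removes the conditioning problem entirely.
A = np.diag([100.0, 1.0])
w = np.array([1.0, 1.0])
for _ in range(50):
    w = natural_gradient_step(w, A @ w, A, lr=0.5)
```

    On this example both eigendirections contract at the same rate, which is the intuition behind the shortened transients reported above; plain gradient descent with a single scalar learning rate cannot do this when the curvature is anisotropic.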

    Learning with regularizers in multilayer neural networks

    We study the effect of regularization in an on-line gradient-descent learning scenario for a general two-layer student network with an arbitrary number of hidden units. Training examples are randomly drawn input vectors labelled by a two-layer teacher network, also with an arbitrary number of hidden units, whose outputs may be corrupted by Gaussian noise. We examine the effect of weight-decay regularization on the dynamical evolution of the order parameters and the generalization error in various phases of the learning process, in both noiseless and noisy scenarios.
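
    The on-line dynamics studied here can also be simulated directly. The sketch below, with illustrative sizes and constants of our own choosing (tanh soft-committee student and teacher, learning rate `eta`, weight-decay strength `gamma`, output-noise level `sigma`), follows a single student trained on noisy teacher-labelled examples:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, M = 100, 2, 2                      # input dim, student / teacher hidden units
B = rng.standard_normal((M, N)) / np.sqrt(N)   # teacher weights
J = 0.01 * rng.standard_normal((K, N))         # student weights, small init
eta, gamma, sigma = 0.5, 1e-3, 0.1       # learning rate, weight decay, output noise

def gen_error(J, n_test=2000):
    # Monte Carlo estimate of the generalization error on clean teacher labels.
    X = rng.standard_normal((n_test, N))
    y = np.tanh(X @ B.T).sum(axis=1)
    s = np.tanh(X @ J.T).sum(axis=1)
    return 0.5 * np.mean((s - y) ** 2)

eg_before = gen_error(J)
for _ in range(20000):
    x = rng.standard_normal(N)
    y = np.tanh(B @ x).sum() + sigma * rng.standard_normal()  # noisy label
    h = np.tanh(J @ x)
    delta = (h.sum() - y) * (1.0 - h ** 2)   # per-hidden-unit error signal
    # On-line gradient step with weight-decay regularization gamma.
    J -= (eta / N) * (np.outer(delta, x) + gamma * J)
eg_after = gen_error(J)
```

    The theoretical analysis tracks exactly such runs through a small set of order parameters (overlaps between student and teacher weight vectors) rather than the weights themselves.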

    Natural gradient matrix momentum

    Natural gradient learning is an efficient and principled method for improving on-line learning. In practical applications, however, estimating and inverting the Fisher information matrix incurs a significant computational cost. We propose to use the matrix momentum algorithm to carry out the inversion efficiently, and study the efficacy of a single-step estimation of the Fisher information matrix. We analyse the proposed algorithm in a two-layer network, using a statistical mechanics framework which allows us to describe the learning dynamics analytically, and compare its performance with true natural gradient learning and standard gradient descent.
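
    A toy sketch of a matrix-momentum-style update on a quadratic loss may help fix ideas. It assumes the common form Δw ← −η∇E + (I − εF)Δw, in which the momentum coefficient is a matrix built from the Fisher estimate, so F is never explicitly inverted; the constants and the quadratic setting are illustrative, not the values analysed in the paper:

```python
import numpy as np

A = np.diag([100.0, 1.0])     # curvature; also serves as the Fisher matrix here
w = np.array([1.0, 1.0])
dw = np.zeros(2)
eta, eps = 0.001, 0.005       # illustrative step size and momentum scale

for _ in range(5000):
    grad = A @ w
    # Matrix-momentum update: the momentum coefficient (I - eps * F) is a
    # matrix built from the Fisher estimate, so the preconditioning effect
    # is obtained without ever inverting F explicitly.
    dw = -eta * grad + (np.eye(2) - eps * A) @ dw
    w = w + dw
```

    The attraction of this form is that each step costs only a matrix-vector product, in contrast to the explicit inversion required by true natural gradient learning.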

    Globally optimal learning rates in multilayer neural networks

    A method for calculating the globally optimal learning rate in on-line gradient-descent training of multilayer neural networks is presented. The method is based on a variational approach which maximizes the decrease in generalization error over a given time frame. We demonstrate the method by computing optimal learning rates in typical learning scenarios. The method can also be employed when different learning rates are allowed for different parameter vectors, as well as to determine the relevance of related training algorithms based on modifications to the basic gradient-descent rule.
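
    The variational calculation itself is analytical, but its goal can be mimicked numerically: choose the learning rate that yields the largest drop in estimated generalization error over a fixed training window. The brute-force grid search below, for a single tanh unit with illustrative sizes, is a crude stand-in for the paper's method, not a reproduction of it:

```python
import numpy as np

def final_error(eta, T=2000, N=50, seed=1):
    # Train a single tanh unit on-line for T examples at fixed learning
    # rate eta, then estimate the generalization error on a fresh test set.
    rng = np.random.default_rng(seed)
    B = rng.standard_normal(N) / np.sqrt(N)   # teacher weights
    J = 0.01 * rng.standard_normal(N)         # student weights
    for _ in range(T):
        x = rng.standard_normal(N)
        delta = np.tanh(J @ x) - np.tanh(B @ x)
        J -= (eta / N) * delta * (1.0 - np.tanh(J @ x) ** 2) * x
    X = rng.standard_normal((2000, N))
    return 0.5 * np.mean((np.tanh(X @ J) - np.tanh(X @ B)) ** 2)

# Pick the constant learning rate that minimizes the error at the end
# of the training window -- a discrete proxy for the variational problem.
etas = [0.1, 0.5, 1.0, 2.0, 5.0]
errors = [final_error(e) for e in etas]
best = etas[int(np.argmin(errors))]
```

    The variational method goes further by allowing the learning rate to vary in time and by optimizing the full trajectory analytically rather than by search.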

    Globally optimal on-line learning rules

    We present a method for determining the globally optimal on-line learning rule for a soft committee machine under a statistical mechanics framework. This work complements previous results on locally optimal rules, where only the rate of change in generalization error was considered. We maximize the total reduction in generalization error over the whole learning process and show how the resulting rule can significantly outperform the locally optimal rule.

    Noise, regularizers, and unrealizable scenarios in online learning from restricted training sets

    We study the dynamics of on-line learning in multilayer neural networks where training examples are sampled with repetition and where the number of examples scales with the number of network weights. The analysis is carried out using the dynamical replica method, aimed at obtaining a closed set of coupled equations for a set of macroscopic variables from which both training and generalization errors can be calculated. We focus on scenarios whereby training examples are corrupted by additive Gaussian output noise and regularizers are introduced to improve the network performance. The dependence of the dynamics on the noise level, with and without regularizers, is examined, as well as that of the asymptotic values obtained for both training and generalization errors. We also demonstrate the ability of the method to approximate the learning dynamics in structurally unrealizable scenarios. The theoretical results show good agreement with those obtained by computer simulations.

    An Analysis of Pharmacogenomic-Guided Pathways and Their Effect on Medication Changes and Hospital Admissions: A Systematic Review and Meta-Analysis

    Ninety-five percent of the population are estimated to carry at least one genetic variant that is discordant with at least one medication. Pharmacogenomic (PGx) testing has the potential to identify patients with genetic variants that put them at risk of adverse drug reactions and sub-optimal therapy. Predicting a patient's response to medications could support the safe management of medications and reduce hospitalization. These benefits can only be realized if prescribing clinicians make the medication changes prompted by PGx test results. This review examines the current evidence on the impact PGx testing has on hospital admissions and whether it prompts medication changes. A systematic search was performed in three databases (Medline, CINAHL and EMBASE) for all relevant studies published up to the year 2020 comparing hospitalization rates and medication changes between PGx-tested patients and patients receiving treatment-as-usual (TAU). Data extracted from full texts were narratively synthesized using a process model developed from the included studies, to derive themes associated with a suggested workflow for PGx-guided care and its expected benefit for medication optimization and hospitalization. A meta-analysis was undertaken on all the studies that report the number of PGx-tested patients with medication change(s) and the number of PGx-tested patients who were hospitalized, compared to participants who received TAU. The search strategy identified 5 hospitalization-themed studies and 5 medication-change-themed studies for analysis. The meta-analysis showed that medication changes occurred significantly more frequently in the PGx-tested arm in 4 of the 5 studies, and that all-cause hospitalization occurred significantly less frequently in the PGx-tested arm than in the TAU arm. The results show proof of concept for the use of PGx in prescribing that produces patient benefit.
    However, the review also highlights the opportunities and evidence gaps that are important when considering the introduction of PGx into health systems; in particular, patient involvement in PGx-informed prescribing decisions calls for a better understanding of the perspectives of both patients and prescribers.

    Globally optimal on-line learning rules for multi-layer neural networks

    We present a method for determining the globally optimal on-line learning rule for a soft committee machine under a statistical mechanics framework. This rule maximizes the total reduction in generalization error over the whole learning process. A simple example demonstrates that the locally optimal rule, which maximizes the rate of decrease in generalization error, may perform poorly in comparison.