
    Limitations of the Empirical Fisher Approximation for Natural Gradient Descent

    Natural gradient descent, which preconditions a gradient descent update with the Fisher information matrix of the underlying statistical model, is a way to capture partial second-order information. Several highly visible works have advocated an approximation known as the empirical Fisher, drawing connections between approximate second-order methods and heuristics like Adam. We dispute this argument by showing that the empirical Fisher, unlike the Fisher, does not generally capture second-order information. We further argue that the conditions under which the empirical Fisher approaches the Fisher (and the Hessian) are unlikely to be met in practice, and that, even on simple optimization problems, the pathologies of the empirical Fisher can have undesirable effects.
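
    To make the distinction concrete, the standard definitions are sketched below in LaTeX. The notation (inputs x_n, observed labels y_n, model log-likelihood log p_theta) is assumed here for illustration and is not taken verbatim from the paper.

    ```latex
    % Natural gradient step: precondition the gradient with the inverse Fisher,
    %   \theta_{t+1} = \theta_t - \eta \, F(\theta_t)^{-1} \nabla_\theta L(\theta_t).
    % True Fisher: expectation over labels drawn from the model's own predictive
    % distribution. Empirical Fisher: the observed labels y_n are plugged in instead.
    \[
    F(\theta) = \sum_{n=1}^{N} \mathbb{E}_{y \sim p_\theta(y \mid x_n)}
      \!\left[ \nabla_\theta \log p_\theta(y \mid x_n)\,
               \nabla_\theta \log p_\theta(y \mid x_n)^{\top} \right],
    \qquad
    \tilde{F}(\theta) = \sum_{n=1}^{N}
      \nabla_\theta \log p_\theta(y_n \mid x_n)\,
      \nabla_\theta \log p_\theta(y_n \mid x_n)^{\top}.
    \]
    ```

    The two matrices differ only in where the labels come from, yet, as the abstract argues, only the first generally carries the second-order information that motivates natural gradient methods.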

    Estimating Model Uncertainty of Neural Networks in Sparse Information Form

    We present a sparse representation of model uncertainty for Deep Neural Networks (DNNs), where the parameter posterior is approximated with an inverse formulation of the Multivariate Normal Distribution (MND), also known as the information form. The key insight of our work is that the information matrix, i.e. the inverse of the covariance matrix, tends to be sparse in its spectrum. Therefore, dimensionality reduction techniques such as low-rank approximations (LRA) can be exploited effectively. To achieve this, we develop a novel sparsification algorithm and derive a cost-effective analytical sampler. As a result, we show that the information form can be scalably applied to represent model uncertainty in DNNs. Our exhaustive theoretical analysis and empirical evaluations on various benchmarks show the competitiveness of our approach over current methods.
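
    As a rough illustration of the low-rank-plus-diagonal idea, the NumPy sketch below keeps only the top-k eigenpairs of an information (precision) matrix and draws Gaussian weight samples from the resulting posterior. All names are hypothetical, and the dense inversion is for clarity only; it is not the paper's scalable sampler.

    ```python
    import numpy as np

    def sample_information_form(mean, Lambda, k, n_samples=10, prior_prec=1.0, rng=None):
        """Sketch: truncate the information (precision) matrix Lambda to its top-k
        eigenpairs, add an isotropic prior-precision term, and sample weights.
        Illustrative only; a scalable method would avoid the dense inverse below."""
        rng = np.random.default_rng() if rng is None else rng
        # The spectrum of the information matrix is assumed to be effectively
        # sparse, so a rank-k truncation retains most of the curvature information.
        eigvals, eigvecs = np.linalg.eigh(Lambda)
        idx = np.argsort(eigvals)[-k:]
        U, s = eigvecs[:, idx], eigvals[idx]
        # Low-rank-plus-diagonal surrogate for the precision matrix; the diagonal
        # prior term keeps it positive definite and bounds variance elsewhere.
        Lambda_k = U @ np.diag(s) @ U.T + prior_prec * np.eye(Lambda.shape[0])
        # Posterior covariance is the inverse precision; sample via its Cholesky factor.
        cov = np.linalg.inv(Lambda_k)
        L = np.linalg.cholesky(cov)
        eps = rng.standard_normal((n_samples, mean.size))
        return mean + eps @ L.T

    # Toy usage: a random positive-definite "information matrix" over 50 weights.
    d = 50
    A = np.random.default_rng(0).standard_normal((d, d))
    samples = sample_information_form(np.zeros(d), A @ A.T + np.eye(d), k=10)
    ```

    In practice the inverse of a low-rank-plus-diagonal matrix would be applied via the Woodbury identity rather than formed densely, which is what makes the information form attractive at DNN scale.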