A Rigorous Link between Deep Ensembles and (Variational) Bayesian Methods
We establish the first mathematically rigorous link between Bayesian,
variational Bayesian, and ensemble methods. A key step towards this is to
reformulate the non-convex optimisation problem typically encountered in deep
learning as a convex optimisation in the space of probability measures. On a
technical level, our contribution amounts to studying generalised variational
inference through the lens of Wasserstein gradient flows. The result is a
unified theory of various seemingly disconnected approaches that are commonly
used for uncertainty quantification in deep learning -- including deep
ensembles and (variational) Bayesian methods. This offers a fresh perspective
on the reasons behind the success of deep ensembles over procedures based on
parameterised variational inference, and allows the derivation of new
ensembling schemes with convergence guarantees. We showcase this by proposing a
family of interacting deep ensembles with direct parallels to the interactions
of particle systems in thermodynamics, and use our theory to prove the
convergence of these algorithms to a well-defined global minimiser on the space
of probability measures.
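To make the convex reformulation concrete, here is a minimal sketch in LaTeX of the standard lifting of a non-convex loss to the space of probability measures; the notation (loss ell, regulariser R) is ours and may differ from the paper's.

```latex
% Lifting a non-convex loss \ell(\theta) to measures (our notation, a sketch).
\[
  \min_{\mu \in \mathcal{P}(\Theta)} \; F(\mu)
  := \int_{\Theta} \ell(\theta)\,\mathrm{d}\mu(\theta) + \lambda\, R(\mu).
\]
% The first term is linear, hence convex, in \mu; if R is convex, so is F.
% The associated Wasserstein gradient flow is the continuity equation
\[
  \partial_t \mu_t
  = \nabla \cdot \Big( \mu_t \,\nabla \frac{\delta F}{\delta \mu}(\mu_t) \Big),
\]
% whose particle discretisations yield interacting-ensemble update rules.
```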
Some Open Problems in Random Matrix Theory and the Theory of Integrable Systems. II
We describe a list of open problems in random matrix theory and the theory of
integrable systems that was presented at the conference Asymptotics in
Integrable Systems, Random Matrices and Random Processes and Universality,
Centre de Recherches Mathématiques, Montréal, June 7-11, 2015. We also describe
progress that has been made on problems in an earlier list presented by the
author on the occasion of his 60th birthday in 2005 (see [Deift P., Contemp.
Math., Vol. 458, Amer. Math. Soc., Providence, RI, 2008, 419-430,
arXiv:0712.0849]). For Part I, see arXiv:0712.0849.
Repulsive Deep Ensembles are Bayesian
Deep ensembles have recently gained popularity in the deep learning community
for their conceptual simplicity and efficiency. However, maintaining functional
diversity between ensemble members that are independently trained with gradient
descent is challenging. This can lead to pathologies when adding more ensemble
members, such as a saturation of the ensemble performance, which converges to
the performance of a single model. Moreover, this affects not only the quality
of the ensemble's predictions but, even more so, its uncertainty estimates, and
thus its performance on out-of-distribution data. We hypothesize
that this limitation can be overcome by discouraging different ensemble members
from collapsing to the same function. To this end, we introduce a kernelized
repulsive term in the update rule of the deep ensembles. We show that this
simple modification not only enforces and maintains diversity among the members
but, even more importantly, transforms the maximum a posteriori inference into
proper Bayesian inference. Namely, we show that the training dynamics of our
proposed repulsive ensembles follow a Wasserstein gradient flow of the KL
divergence to the true posterior. We study repulsive terms in weight and
function space and empirically compare their performance to standard ensembles
and Bayesian baselines on synthetic and real-world prediction tasks.
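As a rough illustration of such a kernelized repulsive update rule, the NumPy sketch below implements one SVGD-style step in weight space: each ensemble member follows the posterior gradient while a normalised sum of kernel gradients pushes members apart. The RBF kernel, the normalisation, and all function names are assumptions for illustration, not the paper's exact algorithm.

```python
import numpy as np

def rbf_kernel(particles, bandwidth=1.0):
    """Pairwise RBF kernel values and their gradients w.r.t. the first argument."""
    diffs = particles[:, None, :] - particles[None, :, :]    # (n, n, d), x_i - x_j
    sq_dists = np.sum(diffs ** 2, axis=-1)                   # (n, n)
    K = np.exp(-sq_dists / (2.0 * bandwidth ** 2))           # (n, n)
    grad_K = -diffs / bandwidth ** 2 * K[:, :, None]         # grad_{x_i} k(x_i, x_j)
    return K, grad_K

def repulsive_step(particles, grad_log_posterior, step=1e-2, bandwidth=1.0):
    """One repulsive-ensemble update: posterior drift minus kernel repulsion."""
    K, grad_K = rbf_kernel(particles, bandwidth)
    drift = grad_log_posterior(particles)                    # (n, d)
    # Normalised repulsion term; subtracting it drives members apart.
    repulsion = grad_K.sum(axis=1) / K.sum(axis=1, keepdims=True)
    return particles + step * (drift - repulsion)

# Toy usage: five 2-D "networks" targeting a standard normal posterior,
# for which grad log p(theta) = -theta (hypothetical stand-in for a real model).
theta = np.random.randn(5, 2)
for _ in range(1000):
    theta = repulsive_step(theta, lambda t: -t)
```

Dropping the repulsion term recovers independently trained gradient-ascent members, which is exactly the collapse mode the abstract describes.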
A study of blow-ups in the Keller-Segel model of chemotaxis
We study the Keller-Segel model of chemotaxis and develop a composite
particle-grid numerical method with adaptive time stepping which allows us to
accurately resolve singular solutions. The numerical findings (in two
dimensions) are then compared with analytical predictions regarding formation
and interaction of singularities obtained via analysis of the stochastic
differential equations associated with the Keller-Segel model.
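For reference, a standard parabolic-elliptic form of the Keller-Segel system in two dimensions is sketched below in LaTeX; the paper's exact variant (parabolic-parabolic coupling, bounded domain, boundary conditions) may differ.

```latex
% A common parabolic--elliptic Keller--Segel system (one standard form;
% the paper's variant may differ). \rho = cell density, c = chemoattractant.
\[
  \partial_t \rho = \Delta \rho - \chi\, \nabla \cdot \big( \rho\, \nabla c \big),
  \qquad -\Delta c = \rho, \qquad x \in \mathbb{R}^2.
\]
% For this form in 2D, solutions blow up in finite time once the total mass
% \int \rho_0 \,\mathrm{d}x exceeds the critical value 8\pi/\chi.
```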