2,065 research outputs found

    A Rigorous Link between Deep Ensembles and (Variational) Bayesian Methods

    We establish the first mathematically rigorous link between Bayesian, variational Bayesian, and ensemble methods. A key step towards this is to reformulate the non-convex optimisation problem typically encountered in deep learning as a convex optimisation problem in the space of probability measures. On a technical level, our contribution amounts to studying generalised variational inference through the lens of Wasserstein gradient flows. The result is a unified theory of various seemingly disconnected approaches that are commonly used for uncertainty quantification in deep learning -- including deep ensembles and (variational) Bayesian methods. This offers a fresh perspective on the reasons behind the success of deep ensembles over procedures based on parameterised variational inference, and allows the derivation of new ensembling schemes with convergence guarantees. We showcase this by proposing a family of interacting deep ensembles with direct parallels to the interactions of particle systems in thermodynamics, and use our theory to prove the convergence of these algorithms to a well-defined global minimiser on the space of probability measures.
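
    To make the convexification idea concrete (a schematic sketch in generic notation, not the paper's exact functional), the non-convex per-network loss \ell can be lifted to an objective over probability measures \mu on the parameter space \Theta,

        F(\mu) \;=\; \int_{\Theta} \ell(\theta)\,\mathrm{d}\mu(\theta) \;+\; \lambda\,\mathrm{KL}(\mu \,\|\, \pi_0),

    which is linear in \mu plus a convex regulariser, and hence convex over measures even though \ell is non-convex in \theta. When the regulariser is the KL divergence to a prior \pi_0, the Wasserstein gradient flow of F is realised by Langevin-type dynamics

        \mathrm{d}\theta_t \;=\; -\nabla \ell(\theta_t)\,\mathrm{d}t \;+\; \lambda\,\nabla \log \pi_0(\theta_t)\,\mathrm{d}t \;+\; \sqrt{2\lambda}\,\mathrm{d}W_t,

    and discretising such flows over an ensemble \{\theta^1,\ldots,\theta^n\} is what yields interacting ensemble schemes of the kind the abstract describes.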

    Some Open Problems in Random Matrix Theory and the Theory of Integrable Systems. II

    We describe a list of open problems in random matrix theory and the theory of integrable systems that was presented at the conference Asymptotics in Integrable Systems, Random Matrices and Random Processes and Universality, Centre de Recherches Mathématiques, Montreal, June 7-11, 2015. We also describe progress that has been made on problems in an earlier list presented by the author on the occasion of his 60th birthday in 2005 (see [Deift P., Contemp. Math., Vol. 458, Amer. Math. Soc., Providence, RI, 2008, 419-430, arXiv:0712.0849]). For Part I, see arXiv:0712.0849.

    Repulsive Deep Ensembles are Bayesian

    Deep ensembles have recently gained popularity in the deep learning community for their conceptual simplicity and efficiency. However, maintaining functional diversity between ensemble members that are independently trained with gradient descent is challenging. This can lead to pathologies when adding more ensemble members, such as saturation of the ensemble performance, which converges towards that of a single model. Moreover, this not only affects the quality of the ensemble's predictions but, even more so, its uncertainty estimates, and thus its performance on out-of-distribution data. We hypothesize that this limitation can be overcome by discouraging different ensemble members from collapsing to the same function. To this end, we introduce a kernelized repulsive term in the update rule of the deep ensembles. We show that this simple modification not only enforces and maintains diversity among the members but, even more importantly, transforms maximum a posteriori inference into proper Bayesian inference. Namely, we show that the training dynamics of our proposed repulsive ensembles follow a Wasserstein gradient flow of the KL divergence with the true posterior. We study repulsive terms in weight and function space and empirically compare their performance to standard ensembles and Bayesian baselines on synthetic and real-world prediction tasks.
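
    As a rough illustration of the kernelized repulsive update (a minimal NumPy sketch of the generic recipe, not the authors' code; the RBF kernel, fixed bandwidth, and gradient estimator are assumptions made here), the repulsive term can be read as a kernel-density estimate of the gradient of the log ensemble density, subtracted from the posterior gradient:

        import numpy as np

        def rbf_kernel(W, bandwidth):
            """Pairwise RBF kernel and its gradients for an ensemble of weight vectors W of shape (n, d)."""
            diffs = W[:, None, :] - W[None, :, :]            # (n, n, d) pairwise differences w_i - w_j
            sq_dists = np.sum(diffs ** 2, axis=-1)           # (n, n) squared distances
            K = np.exp(-sq_dists / (2.0 * bandwidth ** 2))   # (n, n) kernel matrix k(w_i, w_j)
            grad_K = -diffs / bandwidth ** 2 * K[..., None]  # (n, n, d): d k(w_i, w_j) / d w_i
            return K, grad_K

        def repulsive_step(W, grad_log_post, lr=1e-3, bandwidth=1.0):
            """One update: the posterior gradient drives members, kernel gradients push them apart."""
            K, grad_K = rbf_kernel(W, bandwidth)
            drive = grad_log_post(W)                                        # (n, d) per-member grad log posterior
            repulsion = grad_K.sum(axis=1) / K.sum(axis=1, keepdims=True)   # KDE estimate of grad log density
            return W + lr * (drive - repulsion)

    Without the repulsion term this reduces to n independent gradient-ascent (MAP) trajectories, i.e. a standard deep ensemble; the extra term is what moves the particle system towards the posterior rather than towards its modes.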

    A study of blow-ups in the Keller-Segel model of chemotaxis

    We study the Keller-Segel model of chemotaxis and develop a composite particle-grid numerical method with adaptive time stepping, which allows us to accurately resolve singular solutions. The numerical findings (in two dimensions) are then compared with analytical predictions regarding the formation and interaction of singularities obtained via analysis of the stochastic differential equations associated with the Keller-Segel model.
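
    For orientation (a standard parabolic-elliptic form of the model in generic notation; the paper's exact variant and scaling are not given in the abstract), the Keller-Segel system couples the cell density \rho to the chemoattractant concentration c via

        \partial_t \rho \;=\; \nabla \cdot \big( \nabla \rho - \chi\, \rho\, \nabla c \big), \qquad -\Delta c \;=\; \rho,

    with chemotactic sensitivity \chi > 0. The associated stochastic differential equations describe particles drifting up the chemical gradient,

        \mathrm{d}X_t^i \;=\; \chi\, \nabla c(X_t^i)\,\mathrm{d}t \;+\; \sqrt{2}\,\mathrm{d}W_t^i,

    with c reconstructed from the particle density on the grid; blow-up corresponds to the particles concentrating at a point in finite time, which in two dimensions occurs above a critical total mass.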