Nonreversible MCMC from conditional invertible transforms: a complete recipe with convergence guarantees
Markov Chain Monte Carlo (MCMC) is a class of algorithms to sample from complex
and high-dimensional probability distributions. The Metropolis-Hastings (MH)
algorithm, the workhorse of MCMC, provides a simple recipe to construct
reversible Markov kernels. Reversibility is a tractable property that implies a
less tractable but essential property here, invariance. Reversibility is
however not necessarily desirable when considering performance. This has
prompted recent interest in designing kernels breaking this property. At the
same time, an active stream of research has focused on the design of novel
versions of the MH kernel, some nonreversible, relying on the use of complex
invertible deterministic transforms. While standard implementations of the MH
kernel are well understood, the aforementioned developments have not received
the same systematic treatment to ensure their validity. This paper fills the
gap by developing general tools to ensure that a class of nonreversible Markov
kernels, possibly relying on complex transforms, has the desired invariance
property and leads to convergent algorithms. This leads to a set of simple and
practically verifiable conditions.
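As a point of reference for the reversible baseline this abstract mentions, a minimal random-walk Metropolis-Hastings sketch in Python follows. The standard normal target, the Gaussian proposal, and the names `log_target` and `metropolis_hastings` are illustrative assumptions, not taken from the paper. The nonreversible kernels the paper studies are not shown here.

```python
import math
import random


def log_target(x):
    # Unnormalized log-density of a standard normal (illustrative target only).
    return -0.5 * x * x


def metropolis_hastings(n_steps, step_size=1.0, x0=0.0, seed=0):
    """Random-walk MH: symmetric Gaussian proposal, accept with prob min(1, ratio)."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step_size)
        # The proposal is symmetric, so the acceptance ratio reduces to the
        # ratio of target densities.
        log_alpha = log_target(proposal) - log_target(x)
        # 1 - random() lies in (0, 1], which keeps the logarithm finite.
        if math.log(1.0 - rng.random()) < log_alpha:
            x = proposal
        samples.append(x)
    return samples


if __name__ == "__main__":
    draws = metropolis_hastings(20000)
    mean = sum(draws) / len(draws)
    var = sum((d - mean) ** 2 for d in draws) / len(draws)
    print(f"mean={mean:.3f} var={var:.3f}")
```

Because the proposal is symmetric, this kernel satisfies detailed balance with respect to the target, which is the reversibility property the abstract refers to.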
Monte Carlo Variational Auto-Encoders
Variational auto-encoders (VAE) are popular deep latent variable models which
are trained by maximizing an Evidence Lower Bound (ELBO). To obtain a tighter
ELBO and hence better variational approximations, it has been proposed to use
importance sampling to get a lower variance estimate of the evidence. However,
importance sampling is known to perform poorly in high dimensions. While it has
been suggested many times in the literature to use more sophisticated
algorithms such as Annealed Importance Sampling (AIS) and its Sequential
Importance Sampling (SIS) extensions, the potential benefits brought by these
advanced techniques have never been realized for VAE: the AIS estimate cannot
be easily differentiated, while SIS requires the specification of carefully
chosen backward Markov kernels. In this paper, we address both issues and
demonstrate the performance of the resulting Monte Carlo VAEs on a variety of
applications.
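To illustrate why importance sampling tightens the bound, here is a small Python sketch of the importance-sampling estimate of the log-evidence on a toy model where the evidence is known in closed form. The model (z ~ N(0, 1), x | z ~ N(z, 1)), the proposal N(x/2, 1), and all function names are assumptions made for illustration. This is plain importance sampling, not the AIS or SIS schemes the paper develops.

```python
import math
import random


def log_normal_pdf(x, mean, var):
    # Log-density of a univariate normal with the given mean and variance.
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def log_evidence_estimate(x, num_samples, rng):
    """Log of the averaged importance weights p(x, z) / q(z), with z ~ q."""
    log_w = []
    for _ in range(num_samples):
        z = rng.gauss(x / 2.0, 1.0)  # draw from the assumed proposal q
        log_joint = log_normal_pdf(z, 0.0, 1.0) + log_normal_pdf(x, z, 1.0)
        log_w.append(log_joint - log_normal_pdf(z, x / 2.0, 1.0))
    m = max(log_w)  # log-mean-exp for numerical stability
    return m + math.log(sum(math.exp(lw - m) for lw in log_w) / num_samples)


def true_log_evidence(x):
    # Marginally, x ~ N(0, 2) in this toy model.
    return log_normal_pdf(x, 0.0, 2.0)


if __name__ == "__main__":
    rng = random.Random(0)
    x_obs = 1.0
    single = sum(log_evidence_estimate(x_obs, 1, rng) for _ in range(2000)) / 2000
    many = log_evidence_estimate(x_obs, 20000, rng)
    print(f"K=1 average: {single:.3f}")
    print(f"K=20000:     {many:.3f}")
    print(f"exact:       {true_log_evidence(x_obs):.3f}")
```

With a single sample the average estimate is the standard ELBO and sits below the exact log-evidence. Averaging more weights inside the logarithm moves the estimate toward the exact value.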
Nonparametric Uncertainty Quantification for Single Deterministic Neural Network
This paper proposes a fast and scalable method for uncertainty quantification
of machine learning models' predictions. First, we show a principled way to
measure the uncertainty of predictions for a classifier based on
Nadaraya-Watson's nonparametric estimate of the conditional label distribution.
Importantly, the proposed approach allows one to explicitly disentangle aleatoric
and epistemic uncertainties. The resulting method works directly in the feature
space. However, one can apply it to any neural network by considering an
embedding of the data induced by the network. We demonstrate the strong
performance of the method in uncertainty estimation tasks on text
classification problems and a variety of real-world image datasets, such as
MNIST, SVHN, CIFAR-100 and several versions of ImageNet.
Comment: NeurIPS 2022 pape
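For readers unfamiliar with the Nadaraya-Watson estimate named in this abstract, the following Python sketch computes the classical kernel-weighted estimate of the conditional label distribution at a query point. The Gaussian kernel, the `bandwidth` parameter, and the function name are assumptions for illustration. The paper's actual uncertainty measures and its aleatoric/epistemic decomposition are not reproduced here.

```python
import math


def nadaraya_watson_probs(query, features, labels, num_classes, bandwidth=1.0):
    """Kernel-weighted class frequencies at `query` (Gaussian kernel assumed)."""
    weights = []
    for f in features:
        sq_dist = sum((a - b) ** 2 for a, b in zip(query, f))
        weights.append(math.exp(-sq_dist / (2.0 * bandwidth ** 2)))
    total = sum(weights)
    if total == 0.0:
        # No training point carries any weight: fall back to a uniform guess.
        return [1.0 / num_classes] * num_classes
    probs = [0.0] * num_classes
    for w, y in zip(weights, labels):
        probs[y] += w
    return [p / total for p in probs]


if __name__ == "__main__":
    # Toy feature vectors; in practice these could be network embeddings.
    feats = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
    labs = [0, 0, 1, 1]
    print(nadaraya_watson_probs((0.0, 0.5), feats, labs, num_classes=2))
    print(nadaraya_watson_probs((2.5, 3.0), feats, labs, num_classes=2))
```

A query close to one cluster receives a sharply peaked distribution, while a query between the clusters receives a flatter one.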
FedPop: A Bayesian Approach for Personalised Federated Learning
Personalised federated learning (FL) aims at collaboratively learning a machine learning model tailored for each client. Although promising advances have been made in this direction, most existing approaches do not allow for uncertainty quantification, which is crucial in many applications. In addition, personalisation in the cross-device setting still involves important issues, especially for new clients or those with a small number of observations. This paper aims at filling these gaps. To this end, we propose a novel methodology coined FedPop by recasting personalised FL into the population modeling paradigm, where clients' models involve fixed common population parameters and random effects, aiming at explaining data heterogeneity. To derive convergence guarantees for our scheme, we introduce a new class of federated stochastic optimisation algorithms that rely on Markov chain Monte Carlo methods. Compared to existing personalised FL methods, the proposed methodology has important benefits: it is robust to client drift, practical for inference on new clients, and above all, enables uncertainty quantification under mild computational and memory overheads. We provide non-asymptotic convergence guarantees for the proposed algorithms and illustrate their performance on various personalised federated learning tasks.