
    Ab-initio solution of the many-electron Schrödinger equation with deep neural networks

    Given access to accurate solutions of the many-electron Schrödinger equation, nearly all chemistry could be derived from first principles. Exact wavefunctions of interesting chemical systems are out of reach because they are NP-hard to compute in general, but approximations can be found using polynomially-scaling algorithms. The key challenge for many of these algorithms is the choice of wavefunction approximation, or Ansatz, which must trade off between efficiency and accuracy. Neural networks have shown impressive power as accurate practical function approximators and promise as a compact wavefunction Ansatz for spin systems, but problems in electronic structure require wavefunctions that obey Fermi-Dirac statistics. Here we introduce a novel deep learning architecture, the Fermionic Neural Network, as a powerful wavefunction Ansatz for many-electron systems. The Fermionic Neural Network is able to achieve accuracy beyond other variational quantum Monte Carlo Ansätze on a variety of atoms and small molecules. Using no data other than atomic positions and charges, we predict the dissociation curves of the nitrogen molecule and hydrogen chain, two challenging strongly-correlated systems, to significantly higher accuracy than the coupled cluster method, widely considered the most accurate scalable method for quantum chemistry at equilibrium geometry. This demonstrates that deep neural networks can improve the accuracy of variational quantum Monte Carlo to the point where it outperforms other ab-initio quantum chemistry methods, opening the possibility of accurate direct optimisation of wavefunctions for previously intractable molecules and solids.
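
    The core construction can be illustrated with a minimal sketch: neural-network orbitals are collected into a matrix whose determinant changes sign when two electrons are exchanged, which is exactly the Fermi-Dirac antisymmetry requirement. The single-layer orbital network below is a hypothetical stand-in; the actual Fermionic Neural Network uses permutation-equivariant layers and multiple determinants.

```python
import numpy as np

rng = np.random.default_rng(0)
n_electrons, dim = 4, 3

# Hypothetical single-layer "orbital network": maps one electron position to
# n_electrons orbital values. (The real FermiNet orbitals also depend on all
# other electrons through permutation-equivariant features.)
W1 = rng.normal(size=(dim, 16))
W2 = rng.normal(size=(16, n_electrons))

def toy_orbitals(r):
    """Evaluate n_electrons orbital values at one electron position r."""
    return np.tanh(r @ W1) @ W2

def psi(positions):
    """Antisymmetric wavefunction: determinant of the orbital matrix."""
    phi = np.stack([toy_orbitals(r) for r in positions])  # (n_electrons, n_electrons)
    return np.linalg.det(phi)

r = rng.normal(size=(n_electrons, dim))
r_swapped = r.copy()
r_swapped[[0, 1]] = r_swapped[[1, 0]]   # exchange two electrons

# Fermi-Dirac statistics: the wavefunction flips sign under exchange.
print(psi(r), psi(r_swapped))           # equal magnitude, opposite sign
```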

    Variational Bayesian dropout: pitfalls and fixes

    Dropout, a stochastic regularisation technique for training of neural networks, has recently been reinterpreted as a specific type of approximate inference algorithm for Bayesian neural networks. The main contribution of the reinterpretation is in providing a theoretical framework useful for analysing and extending the algorithm. We show that the proposed framework suffers from several issues: from undefined or pathological behaviour of the true posterior related to the use of improper priors, to an ill-defined variational objective due to singularity of the approximating distribution relative to the true posterior. Our analysis of the improper log-uniform prior used in variational Gaussian dropout suggests the pathologies are generally irredeemable, and that the algorithm still works only because the variational formulation annuls some of the pathologies. To address the singularity issue, we proffer Quasi-KL (QKL) divergence, a new approximate inference objective for approximation of high-dimensional distributions. We show that motivations for variational Bernoulli dropout based on discretisation and noise have QKL as a limit. Properties of QKL are studied both theoretically and on a simple practical example, which shows that the QKL-optimal approximation of a full-rank Gaussian with a degenerate one naturally leads to the Principal Component Analysis solution.
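
    The singularity problem mentioned above can be made concrete with a toy calculation: the standard KL divergence from a degenerate (rank-deficient) Gaussian approximation to a full-rank Gaussian grows without bound, so the usual variational objective breaks down. The closed-form Gaussian KL below only illustrates that failure mode; it is not the QKL objective itself, and the dimensions and covariances are hypothetical.

```python
import numpy as np

def kl_gauss(mu_q, cov_q, mu_p, cov_p):
    """KL(q || p) between multivariate Gaussians, in closed form."""
    k = len(mu_q)
    cov_p_inv = np.linalg.inv(cov_p)
    diff = mu_p - mu_q
    return 0.5 * (np.trace(cov_p_inv @ cov_q)
                  + diff @ cov_p_inv @ diff
                  - k
                  + np.log(np.linalg.det(cov_p) / np.linalg.det(cov_q)))

mu = np.zeros(2)
cov_p = np.eye(2)                       # full-rank "true posterior"
for eps in [1e-1, 1e-4, 1e-8]:
    cov_q = np.diag([1.0, eps])         # approximation collapsing onto a line
    print(eps, kl_gauss(mu, cov_q, mu, cov_p))
# The KL grows without bound as eps -> 0: a singular approximating
# distribution makes the standard variational objective ill-defined.
```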

    Classification using log Gaussian Cox processes

    McCullagh and Yang (2006) suggest a family of classification algorithms based on Cox processes. We further investigate the log Gaussian variant, which has a number of appealing properties. Conditioned on the covariates, the distribution over labels is given by a type of conditional Markov random field. In the supervised case, computation of the predictive probability of a single test point scales linearly with the number of training points, and the multiclass generalization is straightforward. We show new links between the supervised method and classical nonparametric methods. We give a detailed analysis of the pairwise graph-representable Markov random field, which we use to extend the model to semi-supervised learning problems, and propose an inference method based on graph min-cuts. We give the first experimental analysis on supervised and semi-supervised datasets and show good empirical performance.
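
    The basic classification rule behind Cox-process classifiers is that, for a superposition of independent class-specific point processes, a point at location x carries label k with probability proportional to class k's intensity at x. The sketch below plugs in kernel-smoothed intensity estimates as a crude stand-in for the posterior log Gaussian Cox intensities; the bandwidth, kernel, and data are hypothetical, and this is not the paper's inference procedure. Note that the prediction costs one pass over the training points, consistent with the linear scaling claimed above.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical 1-D training data for two classes.
x_class = {0: rng.normal(-2.0, 1.0, size=50), 1: rng.normal(2.0, 1.0, size=50)}

def intensity(x, points, bandwidth=0.5):
    """Kernel-smoothed intensity estimate (a stand-in for the posterior
    intensity of a log Gaussian Cox process)."""
    return np.exp(-0.5 * ((x - points) / bandwidth) ** 2).sum() / bandwidth

def class_probs(x):
    """Label probabilities proportional to the class-specific intensities at x."""
    lam = np.array([intensity(x, pts) for pts in x_class.values()])
    return lam / lam.sum()

for x_test in [-2.0, 0.0, 2.0]:
    print(x_test, class_probs(x_test))
```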

    Annealed Flow Transport Monte Carlo

    Annealed Importance Sampling (AIS) and its Sequential Monte Carlo (SMC) extensions are state-of-the-art methods for estimating normalizing constants of probability distributions. We propose here a novel Monte Carlo algorithm, Annealed Flow Transport (AFT), that builds upon AIS and SMC and combines them with normalizing flows (NFs) for improved performance. This method transports a set of particles using not only importance sampling (IS), Markov chain Monte Carlo (MCMC) and resampling steps, as in SMC, but also NFs, which are learned sequentially to push particles towards the successive annealed targets. We provide limit theorems for the resulting Monte Carlo estimates of the normalizing constant and expectations with respect to the target distribution. Additionally, we show that a continuous-time scaling limit of the population version of AFT is given by a Feynman–Kac measure, which simplifies to the law of a controlled diffusion for expressive NFs. We demonstrate experimentally the benefits and limitations of our methodology on a variety of applications.
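
    A minimal annealed importance sampling skeleton shows the scaffolding AFT builds on: particles drawn from a simple base distribution are reweighted and moved through a sequence of annealed targets, and the normalizing constant is accumulated from the weights. AFT additionally learns a normalizing flow at each temperature to transport the particles before the MCMC step; the sketch below only marks where that transport would go, and the targets, schedule, and step size are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_gamma0(x):       # normalized base density: N(0, 1)
    return -0.5 * x**2 - 0.5 * np.log(2 * np.pi)

def log_gammaK(x):       # unnormalized final target: N(3, 0.5^2) without its constant
    return -0.5 * ((x - 3.0) / 0.5) ** 2

def log_gamma(x, beta):  # geometric annealing path between the two
    return (1 - beta) * log_gamma0(x) + beta * log_gammaK(x)

betas = np.linspace(0.0, 1.0, 50)   # annealing schedule (hypothetical)
n_particles = 2000
x = rng.normal(size=n_particles)    # samples from the base
log_w = np.zeros(n_particles)

for beta_prev, beta in zip(betas[:-1], betas[1:]):
    # AIS/SMC importance weight update at the new temperature.
    log_w += log_gamma(x, beta) - log_gamma(x, beta_prev)
    # (AFT would apply a learned normalizing flow transport here.)
    # One Metropolis-Hastings step targeting the current annealed density.
    prop = x + 0.5 * rng.normal(size=n_particles)
    accept = np.log(rng.uniform(size=n_particles)) < log_gamma(prop, beta) - log_gamma(x, beta)
    x = np.where(accept, prop, x)

log_Z = np.logaddexp.reduce(log_w) - np.log(n_particles)
print(np.exp(log_Z), np.sqrt(2 * np.pi) * 0.5)   # estimate vs. exact normalizer
```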

    Adversarial Examples, Uncertainty, and Transfer Testing Robustness in Gaussian Process Hybrid Deep Networks

    Deep neural networks (DNNs) have excellent representative power and are state-of-the-art classifiers on many tasks. However, they often do not capture their own uncertainties well, making them less robust in the real world, as they overconfidently extrapolate and do not notice domain shift. Gaussian processes (GPs) with RBF kernels, on the other hand, have better-calibrated uncertainties and do not overconfidently extrapolate far from data in their training set. However, GPs have poor representational power and do not perform as well as DNNs on complex domains. In this paper we show that GP hybrid deep networks, GPDNNs (GPs on top of DNNs, trained end-to-end), inherit the nice properties of both GPs and DNNs and are much more robust to adversarial examples. When extrapolating to adversarial examples and testing in domain shift settings, GPDNNs frequently output high-entropy class probabilities corresponding to essentially "don't know". GPDNNs are therefore promising as deep architectures that know when they don't know.
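
    The architecture itself is easy to sketch: a neural network maps inputs to features and a GP with an RBF kernel is placed on those features. The toy below uses GP regression with a fixed, untrained ReLU feature map (the real GPDNNs are classifiers trained end-to-end), purely to show how the predictive variance behaves as a test point moves away from the training data in feature space; the feature map, data, and hyperparameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical fixed "DNN" feature map (the real model trains this end-to-end).
W = rng.normal(size=(1, 8))
def features(x):
    return np.maximum(x @ W, 0.0)

def rbf(a, b, lengthscale=1.0):
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

# Toy 1-D regression data.
X = rng.uniform(-2, 2, size=(20, 1))
y = np.sin(2 * X[:, 0]) + 0.1 * rng.normal(size=20)

Phi = features(X)
K = rbf(Phi, Phi) + 1e-2 * np.eye(len(X))      # kernel on DNN features + noise
K_inv = np.linalg.inv(K)

def predict(x_star):
    phi = features(x_star)
    k_star = rbf(phi, Phi)
    mean = k_star @ K_inv @ y
    var = rbf(phi, phi).diagonal() - np.einsum('ij,jk,ik->i', k_star, K_inv, k_star)
    return mean, var

for x in [0.0, 10.0]:                          # in-distribution vs. far away
    m, v = predict(np.array([[x]]))
    print(x, m, v)
# Far from the training data (in feature space) the predictive variance
# reverts towards the prior variance, i.e. the model effectively says
# "don't know".
```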

    Variational Gaussian Dropout is not Bayesian

    Gaussian multiplicative noise is commonly used as a stochastic regularisation technique in training of deterministic neural networks. A recent paper reinterpreted the technique as a specific algorithm for approximate inference in Bayesian neural networks; several extensions ensued. We show that the log-uniform prior used in all the above publications does not generally induce a proper posterior, and thus Bayesian inference in such models is ill-posed. Independent of the log-uniform prior, the correlated weight noise approximation has further issues, leading to either an infinite objective or a high risk of overfitting. The above implies that the reported sparsity of obtained solutions cannot be explained by Bayesian or the related minimum description length arguments. We thus study the objective from a non-Bayesian perspective, provide its previously unknown analytical form, which allows exact gradient evaluation, and show that the later proposed additive reparametrisation introduces minima not present in the original multiplicative parametrisation. Implications and future research directions are discussed.
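
    The technique in question is simple to state: during training, each weight is multiplied by noise drawn from N(1, alpha). The sketch below shows such a forward pass, followed by a quick check that the log-uniform density p(w) proportional to 1/|w| has unbounded mass and therefore cannot be normalised, which is the root of the ill-posedness discussed above. The layer sizes and the noise level alpha are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_dropout_layer(x, W, alpha=0.5, training=True):
    """Linear layer with Gaussian multiplicative noise on the weights:
    W_noisy = W * (1 + sqrt(alpha) * eps), eps ~ N(0, 1)."""
    if training:
        W = W * (1.0 + np.sqrt(alpha) * rng.normal(size=W.shape))
    return x @ W

x = rng.normal(size=(4, 10))
W = rng.normal(size=(10, 3))
print(gaussian_dropout_layer(x, W).shape)

# The log-uniform prior p(w) ~ 1/|w| is improper: its mass on [a, b] is
# log(b) - log(a), which diverges as a -> 0 or b -> infinity.
for a, b in [(1e-3, 1e3), (1e-6, 1e6), (1e-12, 1e12)]:
    print(np.log(b) - np.log(a))   # grows without bound
```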

    Continual repeated annealed flow transport Monte Carlo

    We propose Continual Repeated Annealed Flow Transport Monte Carlo (CRAFT), a method that combines a sequential Monte Carlo (SMC) sampler (itself a generalization of Annealed Importance Sampling) with variational inference using normalizing flows. The normalizing flows are directly trained to transport between annealing temperatures using a KL divergence for each transition. This optimization objective is itself estimated using the normalizing flow/SMC approximation. We show conceptually and using multiple empirical examples that CRAFT improves on Annealed Flow Transport Monte Carlo (Arbel et al., 2021), on which it builds, and also on Markov chain Monte Carlo (MCMC)-based Stochastic Normalizing Flows (Wu et al., 2020). By incorporating CRAFT within particle MCMC, we show that such learnt samplers can achieve impressively accurate results on a challenging lattice field theory example.
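
    The per-transition training signal can be written down exactly when everything is Gaussian: an affine flow is fitted by gradient descent to minimise the KL divergence between the pushforward of the current annealed distribution and the next annealed target. In CRAFT this KL is estimated from the SMC particle approximation rather than computed in closed form; the two Gaussian targets, learning rate, and flow family below are hypothetical stand-ins for one annealing transition.

```python
import numpy as np

# Two successive annealed targets, both Gaussian for illustration (hypothetical).
m0, s0 = 0.0, 1.0        # current annealed distribution
m1, s1 = 2.0, 0.5        # next annealed target

def kl_after_flow(a, b):
    """KL( flow_# N(m0, s0^2) || N(m1, s1^2) ) for the affine flow x -> a*x + b.
    The pushforward is N(a*m0 + b, (a*s0)^2), so the Gaussian KL is closed form."""
    m, s = a * m0 + b, abs(a) * s0
    return np.log(s1 / s) + (s**2 + (m - m1) ** 2) / (2 * s1**2) - 0.5

a, b, lr, eps = 1.0, 0.0, 0.1, 1e-5
for _ in range(200):
    # Finite-difference gradients of the per-transition KL (CRAFT instead
    # estimates this objective from the SMC particle approximation).
    ga = (kl_after_flow(a + eps, b) - kl_after_flow(a - eps, b)) / (2 * eps)
    gb = (kl_after_flow(a, b + eps) - kl_after_flow(a, b - eps)) / (2 * eps)
    a, b = a - lr * ga, b - lr * gb

print(a, b, kl_after_flow(a, b))   # converges towards a = s1/s0 = 0.5, b = 2.0
```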