Transferable neural networks for enhanced sampling of protein dynamics
Variational auto-encoder frameworks have demonstrated success in reducing
complex non-linear dynamics in molecular simulation to a single non-linear
embedding. In this work, we illustrate how this non-linear latent embedding can
be used as a collective variable for enhanced sampling, and present a simple
modification that allows us to rapidly perform sampling in multiple related
systems. We first demonstrate our method is able to describe the effects of
force field changes in capped alanine dipeptide after learning a model using
AMBER99. We further provide a simple extension to variational dynamics encoders
that allows the model to be trained in a more efficient manner on larger
systems by encoding the outputs of a linear transformation using time-structure
based independent component analysis (tICA). Using this technique, we show how
such a model trained for one protein, the WW domain, can efficiently be
transferred to perform enhanced sampling on a related mutant protein, the GTT
mutation. This method shows promise for its ability to rapidly sample related
systems using a single transferable collective variable and is generally
applicable to sets of related simulations, enabling us to probe the effects of
variation in increasingly large systems of biophysical interest.
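The linear tICA step that the abstract describes feeding into the encoder can be sketched compactly. Below is a minimal, illustrative version assuming featurized trajectories as input; the function name, lag, and dimensions are our own choices rather than the authors' code, and the non-linear encoder trained on top is only indicated in the comments.

```python
import numpy as np
from scipy.linalg import eigh

def tica_components(X, lag=10, n_components=2):
    # X: (T, d) trajectory of features (e.g., backbone dihedrals).
    X = X - X.mean(axis=0)
    X0, Xt = X[:-lag], X[lag:]
    C0 = X0.T @ X0 / len(X0)         # instantaneous covariance
    Ct = X0.T @ Xt / len(X0)         # time-lagged covariance
    Ct = 0.5 * (Ct + Ct.T)           # symmetrize for the eigenproblem
    evals, evecs = eigh(Ct, C0)      # generalized eigenproblem Ct v = l C0 v
    order = np.argsort(evals)[::-1]  # slowest processes first
    return evecs[:, order[:n_components]]

# The slow linear subspace z = X @ W would then be the input to the
# (non-linear) variational encoder; because W is fixed after learning, the
# same projection can be reapplied to a related mutant's trajectory.
W = tica_components(np.random.randn(5000, 20))  # random stand-in data
z = np.random.randn(5000, 20) @ W
```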
Lightweight Probabilistic Deep Networks
Even though probabilistic treatments of neural networks have a long history,
they have not found widespread use in practice. Sampling approaches are often
already too slow for simple networks. The size of the inputs and the depth of
typical CNN architectures in computer vision only compound this problem.
Uncertainty in neural networks has thus been largely ignored in practice,
despite the fact that it may provide important information about the
reliability of predictions and the inner workings of the network. In this
paper, we introduce two lightweight approaches to making supervised learning
with probabilistic deep networks practical: First, we suggest probabilistic
output layers for classification and regression that require only minimal
changes to existing networks. Second, we employ assumed density filtering and
show that activation uncertainties can be propagated in a practical fashion
through the entire network, again with minor changes. Both probabilistic
networks retain the predictive power of their deterministic counterparts, but
yield uncertainties that correlate well with the empirical error induced by
their predictions. Moreover, the robustness to adversarial examples is
significantly increased.
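As a concrete illustration of the first idea, a probabilistic regression head that predicts a variance alongside the mean needs only a second linear layer and a likelihood-based loss. This is a minimal sketch in the spirit of the paper's probabilistic output layers, not its exact architecture.

```python
import torch
import torch.nn as nn

class GaussianOutput(nn.Module):
    """Drop-in probabilistic regression head: a mean and a log-variance
    per output dimension instead of a point estimate."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.mean = nn.Linear(in_features, out_features)
        self.log_var = nn.Linear(in_features, out_features)

    def forward(self, h):
        return self.mean(h), self.log_var(h)

def gaussian_nll(mean, log_var, target):
    # Negative log-likelihood of target under N(mean, exp(log_var)),
    # up to an additive constant; the variance term lets the network
    # report its own uncertainty and penalises over-confidence.
    return 0.5 * (log_var + (target - mean) ** 2 / log_var.exp()).mean()

# Usage: swap the final nn.Linear of an existing regressor for
# GaussianOutput and train with gaussian_nll instead of MSE.
```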
Deep Variational Reinforcement Learning for POMDPs
Many real-world sequential decision making problems are partially observable
by nature, and the environment model is typically unknown. Consequently, there
is great need for reinforcement learning methods that can tackle such problems
given only a stream of incomplete and noisy observations. In this paper, we
propose deep variational reinforcement learning (DVRL), which introduces an
inductive bias that allows an agent to learn a generative model of the
environment and perform inference in that model to effectively aggregate the
available information. We develop an n-step approximation to the evidence lower
bound (ELBO), allowing the model to be trained jointly with the policy. This
ensures that the latent state representation is suitable for the control task.
In experiments on Mountain Hike and flickering Atari we show that our method
outperforms previous approaches relying on recurrent neural networks to encode
the past.
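A heavily simplified sketch of the joint objective: per-step reconstruction log-likelihoods and KL terms form an n-step ELBO that is combined with the policy's loss, so gradients shape the latent state for both prediction and control. DVRL's actual estimator is particle-based; this sketch omits that entirely, and all names are illustrative.

```python
def n_step_elbo(recon_log_probs, kls):
    # recon_log_probs[t] = log p(o_t | z_t); kls[t] = KL(q(z_t|.) || p(z_t|.)).
    # Both are lists of scalar tensors over an n-step window.
    return sum(recon_log_probs) - sum(kls)

def joint_loss(policy_loss, recon_log_probs, kls, elbo_weight=1.0):
    # Maximising the ELBO while minimising the RL loss trains the
    # generative model jointly with the policy, keeping the latent
    # state representation useful for the control task.
    return policy_loss - elbo_weight * n_step_elbo(recon_log_probs, kls)
```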
Matrix Completion With Variational Graph Autoencoders: Application in Hyperlocal Air Quality Inference
Inferring air quality from a limited number of observations is an essential
task for monitoring and controlling air pollution. Existing inference methods
typically use low spatial resolution data collected by fixed monitoring
stations and infer the concentration of air pollutants using additional types
of data, e.g., meteorological and traffic information. In this work, we focus
on street-level air quality inference by utilizing data collected by mobile
stations. We formulate air quality inference in this setting as a graph-based
matrix completion problem and propose a novel variational model based on graph
convolutional autoencoders. Our model effectively captures the spatio-temporal
correlation of the measurements and does not depend on the availability of
additional information apart from the street-network topology. Experiments on a
real air quality dataset, collected with mobile stations, show that the
proposed model outperforms state-of-the-art approaches.
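A deterministic simplification of the idea, sketched below: a graph-convolutional encoder over the street network feeds a decoder that reconstructs the full node-by-time measurement matrix, trained only on observed entries. The variational machinery of the actual model is omitted, and all names and sizes are illustrative.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, A_hat, X):
        # A_hat: normalized street-network adjacency; X: node features.
        return torch.relu(self.lin(A_hat @ X))

class GraphCompletion(nn.Module):
    def __init__(self, n_features, hidden, n_timesteps):
        super().__init__()
        self.enc = GCNLayer(n_features, hidden)
        self.dec = nn.Linear(hidden, n_timesteps)

    def forward(self, A_hat, X):
        # Returns a dense (n_nodes, n_timesteps) reconstruction; unobserved
        # cells of the measurement matrix are read off as imputations.
        return self.dec(self.enc(A_hat, X))

def masked_mse(pred, target, mask):
    # Only measured (node, time) cells contribute to the loss.
    return ((pred - target) ** 2 * mask).sum() / mask.sum()
```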
Decomposing feature-level variation with Covariate Gaussian Process Latent Variable Models
The interpretation of complex high-dimensional data typically requires the
use of dimensionality reduction techniques to extract explanatory
low-dimensional representations. However, in many real-world problems these
representations may not be sufficient to aid interpretation on their own, and
it would be desirable to interpret the model in terms of the original features
themselves. Our goal is to characterise how feature-level variation depends on
latent low-dimensional representations, external covariates, and non-linear
interactions between the two. In this paper, we propose to achieve this through
a structured kernel decomposition in a hybrid Gaussian Process model which we
call the Covariate Gaussian Process Latent Variable Model (c-GPLVM). We
demonstrate the utility of our model on simulated examples and applications in
disease progression modelling from high-dimensional gene expression data in the
presence of additional phenotypes. In each setting we show how the c-GPLVM can
extract low-dimensional structures from high-dimensional data sets whilst
allowing a breakdown of feature-level variability that is not present in other
commonly used dimensionality reduction approaches.
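The structured kernel decomposition can be illustrated as additive latent and covariate terms plus their product for the non-linear interaction. This is a sketch consistent with the description above, with RBF components and hyperparameters chosen for brevity rather than taken from the paper.

```python
import numpy as np

def rbf(a, b, lengthscale=1.0):
    # Squared-exponential kernel between row-vectors of a (N, D) and b (M, D).
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def cgplvm_kernel(Z, Zp, C, Cp):
    # Additive effects of latent Z and covariates C, plus a product term
    # capturing their non-linear interaction.
    k_latent = rbf(Z, Zp)
    k_covariate = rbf(C, Cp)
    return k_latent + k_covariate + k_latent * k_covariate
```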
Advances in Probabilistic Modelling: Sparse Gaussian Processes, Autoencoders, and Few-shot Learning
Learning is the ability to generalise beyond training examples; but because many generalisations are consistent with a given set of observations, all machine learning methods rely on inductive biases to select certain generalisations over others. This thesis explores how the model structure
and priors affect the inductive biases of probabilistic models, and our ability to learn and make inferences from data.
Specifically, we present theoretical analyses alongside algorithmic and modelling advances in three areas of probabilistic machine learning: sparse Gaussian process approximations and invariant covariance functions, learning flexible priors for variational autoencoders, and probabilistic approaches for few-shot learning. As inference is rarely tractable, we discuss variational inference methods as a secondary theme.
First, we disentangle the theoretical properties and optimisation behaviour
of two widely used sparse Gaussian process approximations. We conclude that a variational free energy approximation is more principled and extensible and should be used in practice despite
potential optimisation difficulties. We then discuss how general symmetries and invariances can be integrated into Gaussian process priors and can be learned using the marginal likelihood. To make inference tractable, we develop a variational inference scheme that uses unbiased estimates of intractable covariance functions.
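For reference, the collapsed variational free energy bound (Titsias-style) that the thesis argues for can be written in a few lines; this numerically naive sketch assumes precomputed kernel matrices and a Gaussian likelihood.

```python
import numpy as np
from scipy.stats import multivariate_normal

def variational_free_energy(y, Knn_diag, Knm, Kmm, noise_var):
    # Qnn = Knm Kmm^{-1} Kmn: the inducing-point approximation to Knn.
    Qnn = Knm @ np.linalg.solve(Kmm, Knm.T)
    n = len(y)
    log_marginal = multivariate_normal.logpdf(
        y, mean=np.zeros(n), cov=Qnn + noise_var * np.eye(n))
    # Trace penalty: punishes inducing points that summarise the data badly,
    # which is what makes the bound a principled objective to optimise.
    trace_term = (Knn_diag.sum() - np.trace(Qnn)) / (2.0 * noise_var)
    return log_marginal - trace_term
```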
We then address the mismatch between aggregate posteriors and priors in variational autoencoders and propose a mechanism to define flexible distributions using a form of rejection sampling. We use this approach to define a more flexible prior distribution on the latent space of a variational autoencoder, which generalises to unseen test data and reduces the number of low quality samples from the model in a practical way.
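One way to realise such a rejection-sampling prior, sketched under the assumption of a standard normal proposal and a learned acceptance function a(z) in (0, 1); the truncation after a fixed number of tries keeps sampling cheap, though the thesis' exact parameterisation may differ.

```python
import numpy as np

def sample_resampled_prior(accept_fn, dim, n_samples, max_tries=100):
    # Samples from p(z) proportional to N(z; 0, I) * accept_fn(z).
    out = []
    for _ in range(n_samples):
        z = np.random.randn(dim)
        for _ in range(max_tries - 1):
            if np.random.rand() < accept_fn(z):
                break
            z = np.random.randn(dim)   # rejected: redraw the proposal
        out.append(z)                  # after max_tries, keep the last draw
    return np.array(out)
```

In a VAE, accept_fn would be a small network (e.g., ending in a sigmoid) trained so that the resulting prior matches the aggregate posterior more closely than a fixed Gaussian does.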
Finally, we propose two probabilistic approaches to few-shot learning that achieve state-of-the-art results on benchmarks, building on multi-task probabilistic models with adaptive classifier heads. Our first approach combines a pre-trained deep feature extractor with a simple probabilistic model for the head, and can be linked to automatically regularised softmax regression. The second employs an amortised head model; it can be viewed as meta-learning probabilistic inference for prediction, and can be generalised to other contexts such as few-shot regression.
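The first approach can be sketched as a class-conditional Gaussian head fitted on frozen deep features from the few support examples. The shared-covariance choice and the regulariser below are our assumptions, used to show why the resulting decision rule is linear, i.e., softmax-regression-like.

```python
import numpy as np

def fit_gaussian_head(features, labels, n_classes, reg=1e-3):
    # features: (N, D) frozen deep features; labels: (N,) ints,
    # with at least one support example per class.
    mus = np.stack([features[labels == c].mean(0) for c in range(n_classes)])
    centred = features - mus[labels]
    cov = centred.T @ centred / len(features) + reg * np.eye(features.shape[1])
    return mus, np.linalg.inv(cov)

def predict(query, mus, prec):
    # With a shared covariance the class-independent quadratic term drops
    # out of the comparison, so the class scores are linear in the features.
    diffs = mus - query
    scores = -0.5 * np.einsum('cd,de,ce->c', diffs, prec, diffs)
    return int(scores.argmax())
```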