Variational learning for inverse problems
Machine learning methods for solving inverse problems require uncertainty estimation to be reliable in real settings. While deep variational models offer a computationally tractable way of recovering complex uncertainties, they need large volumes of supervised data to be trained, which in many practical applications requires prohibitively expensive collection campaigns with specific instruments. This thesis introduces two novel frameworks to train variational inference models for inverse problems, in semi-supervised and unsupervised settings respectively. In the former, a realistic scenario is considered in which only a few experimentally collected supervised examples are available, and analytical models from domain expertise, together with existing unsupervised data sets, are additionally leveraged to solve inverse problems in a semi-supervised fashion. This minimises the supervised data collection requirements and allows effective probabilistic recovery models to be trained relatively inexpensively. The method is first evaluated in quantitative simulated experiments, testing performance in various controlled settings and comparing it to alternative techniques. The framework is then implemented in several real-world applications, spanning imaging, astronomy and human-computer interaction. In each real-world setting, the technique makes use of all available information for training, whether this is simulations, data or both, depending on the task. In each experimental scenario, state-of-the-art recovery and uncertainty estimation are demonstrated with reasonably limited experimental collection effort for training. The second framework presented in this thesis instead addresses the challenging unsupervised situation, where no examples of ground truths are available. This type of inverse problem is commonly encountered in data pre-processing and information retrieval. A variational framework is designed to capture the solution space of an inverse problem using solely an estimate of the observation process and large ensembles of observed examples. The unsupervised framework is tested on data recovery tasks under the common setting of missing values and noise, demonstrating superior performance to existing variational methods for imputation and de-noising with different real data sets. Furthermore, higher classification accuracy after imputation is shown, demonstrating the advantage of propagating uncertainty to downstream tasks with the new model.
Variational Sparse Coding
Unsupervised discovery of interpretable features and controllable generation with high-dimensional data are currently major challenges in machine learning, with applications in data visualisation, clustering and artificial data synthesis. We propose a model based on variational auto-encoders (VAEs) in which interpretation is induced through latent space sparsity with a mixture of Spike and Slab distributions as prior. We derive an evidence lower bound for this model and propose a specific training method for recovering disentangled features as sparse elements in latent vectors. In our experiments, we demonstrate superior disentanglement performance to standard VAE approaches when an estimate of the number of true sources of variation is not available and objects display different combinations of attributes. Furthermore, the new model provides unique capabilities, such as recovering feature exploitation, synthesising samples that share attributes with a given input object and controlling both discrete and continuous features upon generation.
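To make the Spike-and-Slab construction concrete, the following is a minimal sketch of a VAE encoder head that produces a sparse latent code, assuming a PyTorch-style implementation; the layer sizes, the relaxed-Bernoulli gate used to keep sampling differentiable, and all names are illustrative assumptions rather than the authors' code.

# Hypothetical sketch: a VAE encoder head with a Spike-and-Slab latent.
# The relaxed-Bernoulli (binary Concrete) gate is an illustrative choice,
# not necessarily the construction used in the paper.
import torch
import torch.nn as nn

class SpikeSlabEncoder(nn.Module):
    def __init__(self, x_dim, h_dim, z_dim):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.mu = nn.Linear(h_dim, z_dim)           # slab mean
        self.logvar = nn.Linear(h_dim, z_dim)       # slab log-variance
        self.spike_logit = nn.Linear(h_dim, z_dim)  # per-dimension "on" probability

    def forward(self, x, temperature=0.5):
        h = self.body(x)
        mu, logvar = self.mu(h), self.logvar(h)
        slab = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # Gaussian slab sample
        # Relaxed Bernoulli sample for the spike, so most latent dimensions
        # can be switched off while the sampling stays differentiable.
        logits = self.spike_logit(h)
        u = torch.rand_like(logits)
        gate = torch.sigmoid((logits + torch.log(u) - torch.log1p(-u)) / temperature)
        z = gate * slab  # sparse latent code
        return z, mu, logvar, torch.sigmoid(logits)

enc = SpikeSlabEncoder(x_dim=784, h_dim=256, z_dim=64)
z, *_ = enc(torch.randn(8, 784))

A full training objective would additionally include KL terms for both the Gaussian slab and the spike probabilities against the Spike-and-Slab prior, which is where the derived evidence lower bound comes in.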
The role of late photons in diffuse optical imaging
The ability to image through turbid media, such as organic tissues, is a highly attractive prospect for biological and medical imaging. This is challenging, however, due to the highly scattering properties of tissues, which scramble the image information. The earliest photons that arrive at the detector are often associated with ballistic transmission, whilst the later photons are associated with complex paths due to multiple independent scattering events and are therefore typically considered to be detrimental to the final image formation process. In this work we report on the importance of these highly diffuse, "late" photons for computational time-of-flight diffuse optical imaging. In thick scattering materials, >80 transport mean free paths, we provide evidence that including late photons in the inverse retrieval enhances the image reconstruction quality. We also show that the late photons alone have sufficient information to retrieve images of a similar quality to early-photon-gated data. This result emphasises the importance, in the strongly diffusive regime discussed here, of fully time-resolved imaging techniques.
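As a rough illustration of what separating early from late photons involves, the sketch below gates a per-pixel time-of-flight histogram into two temporal windows before any reconstruction; the window positions, bin widths and function names are arbitrary assumptions, not values from the paper.

# Illustrative sketch of time-gating photon time-of-flight histograms into
# "early" and "late" windows; the gate positions are arbitrary examples.
import numpy as np

def gate_histogram(counts, bin_edges_ps, early_window_ps=(0, 500),
                   late_window_ps=(2000, 10000)):
    """Split per-pixel photon-arrival histograms into early and late counts."""
    centres = 0.5 * (bin_edges_ps[:-1] + bin_edges_ps[1:])
    early_mask = (centres >= early_window_ps[0]) & (centres < early_window_ps[1])
    late_mask = (centres >= late_window_ps[0]) & (centres < late_window_ps[1])
    return counts[..., early_mask].sum(-1), counts[..., late_mask].sum(-1)

# Toy example: a 32x32 grid of histograms with 256 time bins of 50 ps each.
rng = np.random.default_rng(0)
counts = rng.poisson(2.0, size=(32, 32, 256))
edges = np.arange(257) * 50.0  # picoseconds
early_img, late_img = gate_histogram(counts, edges)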
Detection and tracking of moving objects hidden from view
The ability to detect motion and track a moving object hidden around a corner or behind a wall provides a crucial advantage when physically going around the obstacle is impossible or dangerous. Previous methods have demonstrated that it is possible to reconstruct the shape of an object hidden from view. However, these methods do not enable the tracking of movement in real time. We demonstrate a compact non-line-of-sight laser ranging technology that relies on the ability to send light around an obstacle using a scattering floor and then detect the return signal from a hidden object within only a few seconds of acquisition time. By detecting this signal with a single-photon avalanche diode (SPAD) camera, we follow the movement of an object located a metre away from the camera with centimetre precision. We discuss the possibility of applying this technology to a variety of real-life situations in the near future.
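For intuition only, the toy sketch below shows how round-trip photon times measured at a few points on a scattering surface can be turned into a position estimate by trilateration; the 2D geometry, grid search and all names are hypothetical simplifications and not the tracking algorithm used with the SPAD camera.

# Toy sketch (not the paper's method): locate a hidden object from round-trip
# photon times at several floor points, via a coarse grid-search trilateration.
import numpy as np

C = 0.2998  # speed of light in metres per nanosecond

def path_length(floor_point, target):
    """Total path: floor spot -> hidden target -> back to the same floor spot."""
    return 2.0 * np.linalg.norm(target - floor_point)

def locate(floor_points, arrival_times_ns, extent=2.0, step=0.02):
    """Grid search for the 2D target position that best explains the times."""
    grid = np.arange(-extent, extent, step)
    best, best_err = None, np.inf
    for x in grid:
        for y in grid:
            cand = np.array([x, y])
            pred = np.array([path_length(p, cand) / C for p in floor_points])
            err = np.sum((pred - arrival_times_ns) ** 2)
            if err < best_err:
                best, best_err = cand, err
    return best

floor_points = np.array([[0.0, 0.0], [0.3, 0.0], [0.0, 0.3]])
true_target = np.array([1.0, 0.4])
times = np.array([path_length(p, true_target) / C for p in floor_points])
print(locate(floor_points, times))  # approximately [1.0, 0.4]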
Tomographic Auto-Encoder: Unsupervised Bayesian Recovery of Corrupted Data
We propose a new probabilistic method for unsupervised recovery of corrupted data. Given a large ensemble of degraded samples, our method recovers accurate posteriors of clean values, allowing the exploration of the manifold of possible reconstructed data and hence characterising the underlying uncertainty. In this setting, direct application of classical variational methods often gives rise to collapsed densities that do not adequately explore the solution space. Instead, we derive our novel reduced entropy condition approximate inference method that results in rich posteriors. We test our model in a data recovery task under the common setting of missing values and noise, demonstrating superior performance to existing variational methods for imputation and de-noising with different real data sets. We further show higher classification accuracy after imputation, proving the advantage of propagating uncertainty to downstream tasks with our model.
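A minimal sketch of the general idea of fitting a variational model to corrupted observations alone, assuming a known masking observation model and a PyTorch-style implementation, is given below; it uses a plain ELBO and omits the reduced entropy condition derived in the paper, so it should be read as a generic baseline rather than the proposed method.

# Generic sketch: unsupervised training on corrupted data only, where the
# decoder output is pushed through the known observation model (masking)
# before the likelihood is evaluated. Names and sizes are illustrative.
import torch
import torch.nn as nn

class Imputer(nn.Module):
    def __init__(self, d=32, z=8, h=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(2 * d, h), nn.ReLU(), nn.Linear(h, 2 * z))
        self.dec = nn.Sequential(nn.Linear(z, h), nn.ReLU(), nn.Linear(h, d))

    def forward(self, y, mask):
        mu, logvar = self.enc(torch.cat([y * mask, mask], dim=-1)).chunk(2, dim=-1)
        zs = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        x_hat = self.dec(zs)  # posterior sample of the clean data
        # Reconstruction term only where the observation model says we observed.
        recon = ((x_hat - y) ** 2 * mask).sum(-1).mean()
        kl = -0.5 * (1 + logvar - mu ** 2 - logvar.exp()).sum(-1).mean()
        return recon + kl

model = Imputer()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
y = torch.randn(16, 32)
mask = (torch.rand(16, 32) > 0.5).float()  # 50% missing values
loss = model(y, mask)
loss.backward()
opt.step()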
Variational inference for computational imaging inverse problems
Machine learning methods for computational imaging require uncertainty estimation to be reliable in real settings. While Bayesian models offer a computationally tractable way of recovering uncertainty, they need large data volumes to be trained, which in imaging applications implies prohibitively expensive collections with specific imaging instruments. This paper introduces a novel framework to train variational inference models for inverse problems by exploiting, in combination, a few experimentally collected data, domain expertise and existing image data sets. In this way, Bayesian machine learning models can solve imaging inverse problems with minimal data collection effort. Extensive simulated experiments show the advantages of the proposed framework. The approach is then applied to two real experimental optics settings: holographic image reconstruction and imaging through highly scattering media. In both settings, state-of-the-art reconstructions are achieved with little collection of training data.
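The data-mixing idea can be sketched as follows: each training batch combines the few experimentally collected measurement/image pairs with pairs synthesised by pushing images from an existing data set through an analytical forward model. The forward model, array shapes and function names below are placeholders chosen for illustration, not the paper's setup.

# Hedged sketch of combining few real pairs with simulator-generated pairs.
import numpy as np

def forward_model(image, rng):
    """Placeholder analytical observation model: subsampling plus noise."""
    return image[::2] + 0.05 * rng.standard_normal(image.shape[0] // 2)

def make_batch(real_pairs, unlabelled_images, n_sim, rng):
    # Draw images from the existing (unlabelled) data set and synthesise
    # measurements for them with the analytical forward model.
    sim_images = unlabelled_images[rng.integers(len(unlabelled_images), size=n_sim)]
    sim_pairs = [(forward_model(x, rng), x) for x in sim_images]
    return real_pairs + sim_pairs  # train the recovery model on both sources

rng = np.random.default_rng(0)
unlabelled = rng.random((1000, 64))  # existing image data set, no measurements
# In practice these few pairs come from the imaging instrument; here they are
# simulated only so the example runs end to end.
real = [(forward_model(x, rng), x) for x in rng.random((10, 64))]
batch = make_batch(real, unlabelled, n_sim=22, rng=rng)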
Bayesian parameter estimation using conditional variational autoencoders for gravitational-wave astronomy
Gravitational wave (GW) detection is now commonplace and, as the sensitivity of the global network of GW detectors improves, we will observe of order 100s of transient GW events per year. The current methods used to estimate their source parameters employ optimally sensitive but computationally costly Bayesian inference approaches, where typical analyses have taken between 6 hours and 5 days. For binary neutron star and neutron star-black hole systems, prompt counterpart electromagnetic (EM) signatures are expected on timescales of 1 second to 1 minute, and the current fastest method for alerting EM follow-up observers can provide estimates in approximately 1 minute, on a limited range of key source parameters. Here we show that a conditional variational autoencoder pre-trained on binary black hole signals can return Bayesian posterior probability estimates. The training procedure need only be performed once for a given prior parameter space, and the resulting trained machine can then generate samples describing the posterior distribution orders of magnitude faster than existing techniques.
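As a rough sketch of how a trained conditional variational autoencoder can return posterior samples at test time, the code below draws latent samples from a data-conditioned prior network and decodes them into source-parameter samples; the architecture, dimensions and names are illustrative assumptions and not the network described in the paper.

# Illustrative conditional-VAE sampler: given a detector time series y, draw
# many latent samples from the y-conditioned prior and decode each into a
# source-parameter vector, giving approximate posterior samples.
import torch
import torch.nn as nn

class CVAESampler(nn.Module):
    def __init__(self, y_dim=512, theta_dim=8, z_dim=16, h=128):
        super().__init__()
        self.prior = nn.Sequential(nn.Linear(y_dim, h), nn.ReLU(), nn.Linear(h, 2 * z_dim))
        self.decoder = nn.Sequential(nn.Linear(z_dim + y_dim, h), nn.ReLU(),
                                     nn.Linear(h, theta_dim))

    @torch.no_grad()
    def sample_posterior(self, y, n_samples=5000):
        mu, logvar = self.prior(y).chunk(2, dim=-1)
        z = mu + torch.exp(0.5 * logvar) * torch.randn(n_samples, mu.shape[-1])
        return self.decoder(torch.cat([z, y.expand(n_samples, -1)], dim=-1))

model = CVAESampler()                      # in practice: load pre-trained weights
y = torch.randn(1, 512)                    # toy stand-in for a whitened strain segment
theta_samples = model.sample_posterior(y)  # (5000, 8) approximate posterior draws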
Rethinking Semi-supervised Learning with Language Models
Semi-supervised learning (SSL) is a popular setting aiming to effectively utilize unlabelled data to improve model performance in downstream natural language processing (NLP) tasks. Currently, there are two popular approaches to make use of unlabelled data: Self-training (ST) and Task-adaptive pre-training (TAPT). ST uses a teacher model to assign pseudo-labels to the unlabelled data, while TAPT continues pre-training on the unlabelled data before fine-tuning. To the best of our knowledge, the effectiveness of TAPT in SSL tasks has not been systematically studied, and no previous work has directly compared TAPT and ST in terms of their ability to utilize the pool of unlabelled data. In this paper, we provide an extensive empirical study comparing five state-of-the-art ST approaches and TAPT across various NLP tasks and data sizes, including in- and out-of-domain settings. Surprisingly, we find that TAPT is a strong and more robust SSL learner, even when using just a few hundred unlabelled samples or in the presence of domain shifts, compared to more sophisticated ST approaches, and tends to bring greater improvements in SSL than in fully-supervised settings. Our further analysis demonstrates the risks of using ST approaches when the size of labelled or unlabelled data is small or when domain shifts exist. We offer a fresh perspective for future SSL research, suggesting the use of unsupervised pre-training objectives over dependency on pseudo labels.
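For readers unfamiliar with the ST setup being compared, the sketch below shows a generic self-training loop in which a teacher pseudo-labels unlabelled examples above a confidence threshold and the model is refit on the union; a toy scikit-learn classifier stands in for a fine-tuned language model, and the threshold and round count are arbitrary. TAPT, by contrast, would simply continue the unsupervised pre-training objective (e.g. masked language modelling) on the unlabelled pool before fine-tuning on the labelled set.

# Generic self-training (ST) loop for illustration only; a real SSL-for-NLP
# setup would fine-tune a pre-trained language model rather than this toy model.
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_lab, y_lab, X_unlab, rounds=3, threshold=0.9):
    X_train, y_train = X_lab, y_lab
    for _ in range(rounds):
        teacher = LogisticRegression(max_iter=1000).fit(X_train, y_train)
        probs = teacher.predict_proba(X_unlab)
        confident = probs.max(axis=1) >= threshold      # keep confident pseudo-labels
        if not confident.any():
            break
        X_train = np.vstack([X_lab, X_unlab[confident]])
        y_train = np.concatenate([y_lab, probs[confident].argmax(axis=1)])
    return LogisticRegression(max_iter=1000).fit(X_train, y_train)

rng = np.random.default_rng(0)
X_lab, y_lab = rng.normal(size=(40, 16)), rng.integers(0, 2, size=40)
X_unlab = rng.normal(size=(400, 16))
model = self_train(X_lab, y_lab, X_unlab)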