8 research outputs found
Objective Evaluation of Deep Uncertainty Predictions for COVID-19 Detection
Deep neural networks (DNNs) have been widely applied for detecting COVID-19
in medical images. Existing studies mainly apply transfer learning and other
data representation strategies to generate accurate point estimates. The
generalization power of these networks is questionable because they are
developed using small datasets and fail to report their predictive
confidence. Quantifying uncertainties associated with DNN predictions is a
prerequisite for their trusted deployment in medical settings. Here we apply
and evaluate three uncertainty quantification techniques for COVID-19 detection
using chest X-ray (CXR) images. The novel concept of an uncertainty confusion
matrix is proposed, and new performance metrics for the objective evaluation of
uncertainty estimates are introduced. Through comprehensive experiments, it is
shown that networks pretrained on CXR images outperform networks pretrained on
natural image datasets such as ImageNet. Qualitative and quantitative
evaluations also reveal that the predictive uncertainty estimates are
statistically higher for erroneous predictions than for correct predictions.
Accordingly, uncertainty quantification methods are capable of flagging risky
predictions with high uncertainty estimates. We also observe that ensemble
methods more reliably capture uncertainties during inference.
Comment: 7 pages, 6 figures, 1 table, 36 references
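The paper's exact metric definitions are not reproduced here; the following minimal sketch (hypothetical names, NumPy only) illustrates how an uncertainty confusion matrix of the kind proposed above could be tabulated by crossing prediction correctness with a thresholded uncertainty score.

```python
import numpy as np

def uncertainty_confusion_matrix(y_true, y_pred, uncertainty, threshold):
    """Split predictions into the four cells of an uncertainty confusion matrix.

    Correct/incorrect comes from comparing y_pred with y_true; certain/uncertain
    comes from thresholding the per-sample uncertainty estimate.
    """
    correct = (y_true == y_pred)
    certain = (uncertainty < threshold)

    cc = np.sum(correct & certain)    # correct and certain (desirable)
    cu = np.sum(correct & ~certain)   # correct but flagged as uncertain
    ic = np.sum(~correct & certain)   # incorrect yet certain (risky)
    iu = np.sum(~correct & ~certain)  # incorrect and flagged as uncertain (desirable)
    return cc, cu, ic, iu

# Example: uncertainty should be higher for the erroneous predictions.
y_true = np.array([0, 1, 1, 0, 1])
y_pred = np.array([0, 1, 0, 0, 0])
unc    = np.array([0.05, 0.10, 0.80, 0.20, 0.70])
cc, cu, ic, iu = uncertainty_confusion_matrix(y_true, y_pred, unc, threshold=0.5)

# One simple summary metric (an assumption, not the paper's definition): the
# fraction of predictions that are either correct-and-certain or incorrect-and-uncertain.
uncertainty_accuracy = (cc + iu) / len(y_true)
```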
An Uncertainty-aware Transfer Learning-based Framework for Covid-19 Diagnosis
The early and reliable detection of COVID-19 infected patients is essential
to prevent and limit its outbreak. PCR tests for COVID-19 detection are not
available in many countries, and there are also genuine concerns about their
reliability and performance. Motivated by these shortcomings, this paper
proposes a deep uncertainty-aware transfer learning framework for COVID-19
detection using medical images. Four popular convolutional neural networks
(CNNs) including VGG16, ResNet50, DenseNet121, and InceptionResNetV2 are first
applied to extract deep features from chest X-ray and computed tomography (CT)
images. Extracted features are then processed by different machine learning and
statistical modelling techniques to identify COVID-19 cases. We also calculate
and report the epistemic uncertainty of classification results to identify
regions where the trained models are not confident about their decisions (the
out-of-distribution problem). Comprehensive simulation results for X-ray and CT
image datasets indicate that linear support vector machine and neural network
models achieve the best results as measured by accuracy, sensitivity,
specificity, and AUC. It is also found that predictive uncertainty estimates
are much higher for CT images than for X-ray images.
Comment: 9 pages, 9 figures, 3 tables
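As a rough illustration of the feature-extraction stage described above, the sketch below pairs a pretrained VGG16 backbone with a linear SVM. The random placeholder images and the choice of a single classifier are assumptions for brevity, and the epistemic-uncertainty step is omitted.

```python
import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input
from sklearn.svm import LinearSVC

# Pretrained VGG16 as a fixed feature extractor (global average pooled features).
extractor = VGG16(weights="imagenet", include_top=False, pooling="avg")

# Placeholder batch standing in for preprocessed chest X-ray / CT images.
images = np.random.rand(8, 224, 224, 3).astype("float32") * 255.0
labels = np.array([0, 1, 0, 1, 0, 1, 0, 1])

features = extractor.predict(preprocess_input(images))  # shape (8, 512)

# A linear classifier on the extracted deep features, echoing the pipeline above.
clf = LinearSVC(C=1.0).fit(features, labels)
predictions = clf.predict(features)
```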
A case for new neural network smoothness constraints
How sensitive should machine learning models be to input changes? We tackle
the question of model smoothness and show that it is a useful inductive bias
which aids generalization, adversarial robustness, generative modeling and
reinforcement learning. We explore current methods of imposing smoothness
constraints and observe that they lack the flexibility to adapt to new tasks,
do not account for data modalities, and interact with losses, architectures,
and optimization in ways not yet fully understood. We conclude that new
advances in the field hinge on finding ways to incorporate data, tasks, and
learning into our definitions of smoothness.
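The paper surveys smoothness constraints rather than proposing a specific one. As a concrete illustration of one common constraint of this kind, the PyTorch sketch below adds an input-gradient penalty to a task loss (a soft Lipschitz-style regularizer chosen for this example, not a method from the paper).

```python
import torch
import torch.nn as nn

# One common way to encourage smoothness: penalise the norm of the model's
# input gradient, softly limiting how sharply outputs respond to input changes.
model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 1))

x = torch.randn(32, 16, requires_grad=True)
y = torch.randn(32, 1)

pred = model(x)
task_loss = nn.functional.mse_loss(pred, y)

# Gradient of the predictions with respect to the inputs (kept in the graph so
# the penalty itself can be backpropagated through).
grads = torch.autograd.grad(pred.sum(), x, create_graph=True)[0]
smoothness_penalty = grads.pow(2).sum(dim=1).mean()

loss = task_loss + 0.1 * smoothness_penalty  # 0.1 is an arbitrary weight
loss.backward()
```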
Bayesian Deep Ensembles via the Neural Tangent Kernel
We explore the link between deep ensembles and Gaussian processes (GPs)
through the lens of the Neural Tangent Kernel (NTK): a recent development in
understanding the training dynamics of wide neural networks (NNs). Previous
work has shown that even in the infinite width limit, when NNs become GPs,
there is no GP posterior interpretation of a deep ensemble trained with squared
error loss. We introduce a simple modification to standard deep ensemble
training, through the addition of a computationally tractable, randomised, and
untrainable function to each ensemble member, that enables a posterior
interpretation in the infinite width limit. When ensembled together, our
trained NNs give an approximation to a posterior predictive distribution, and
we prove that our Bayesian deep ensembles make more conservative predictions
than standard deep ensembles in the infinite width limit. Finally, using finite
width NNs we demonstrate that our Bayesian deep ensembles faithfully emulate
the analytic posterior predictive when available, and can outperform standard
deep ensembles in various out-of-distribution settings, for both regression and
classification tasks.
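A minimal PyTorch sketch in the spirit of the modification described above: each ensemble member adds a fixed, randomly initialised, untrainable function to its trainable network. The paper's exact NTK-based construction differs, and the architectures and sizes here are placeholders.

```python
import torch
import torch.nn as nn

def make_net():
    return nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 1))

class MemberWithRandomFunction(nn.Module):
    """Ensemble member: a trainable network plus a fixed, randomly
    initialised, untrainable additive function."""
    def __init__(self):
        super().__init__()
        self.trainable = make_net()
        self.fixed = make_net()
        for p in self.fixed.parameters():  # never updated during training
            p.requires_grad_(False)

    def forward(self, x):
        return self.trainable(x) + self.fixed(x)

# Train each member independently on the same data; at test time, the spread of
# the ensemble's outputs approximates a posterior predictive distribution.
ensemble = [MemberWithRandomFunction() for _ in range(5)]
x = torch.linspace(-1, 1, 50).unsqueeze(-1)
with torch.no_grad():
    preds = torch.stack([m(x) for m in ensemble])  # (5, 50, 1)
mean, std = preds.mean(dim=0), preds.std(dim=0)
```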
SAMBA: Safe Model-Based & Active Reinforcement Learning
In this paper, we propose SAMBA, a novel framework for safe reinforcement
learning that combines aspects from probabilistic modelling, information
theory, and statistics. Our method builds upon PILCO to enable active
exploration using novel (semi-)metrics for out-of-sample Gaussian process
evaluation, optimised through a multi-objective problem that supports
conditional value-at-risk constraints. We evaluate our algorithm on a variety
of safe dynamical-system benchmarks involving both low- and high-dimensional
state representations. Our results show orders of magnitude reductions in
samples and violations compared to state-of-the-art methods. Lastly, we provide
intuition as to the effectiveness of the framework through a detailed analysis
of our active metrics and safety constraints.
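As a small illustration of the conditional value-at-risk quantity used in such constraints (not SAMBA's optimisation procedure), the sketch below estimates CVaR from sampled rollout costs; the cost distribution is a placeholder.

```python
import numpy as np

def empirical_cvar(costs, alpha=0.95):
    """Conditional value-at-risk: the mean of the worst (1 - alpha) fraction of costs."""
    costs = np.asarray(costs)
    var = np.quantile(costs, alpha)   # value-at-risk threshold
    tail = costs[costs >= var]        # the high-cost tail beyond the threshold
    return tail.mean()

# A constraint such as CVaR_0.95(cost) <= budget can be checked against
# sampled rollout costs (simulated here with an arbitrary distribution).
rollout_costs = np.random.exponential(scale=1.0, size=1000)
print(empirical_cvar(rollout_costs, alpha=0.95))
```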
Simple and Principled Uncertainty Estimation with Deterministic Deep Learning via Distance Awareness
Bayesian neural networks (BNNs) and deep ensembles are principled approaches
to estimating the predictive uncertainty of a deep learning model. However,
their practicality in real-time, industrial-scale applications is limited by
their heavy memory and inference cost. This motivates us to study principled
approaches to high-quality uncertainty estimation that require only a single
deep neural network (DNN). By formalizing uncertainty quantification as a
minimax learning problem, we first identify input distance awareness, i.e., the
model's ability to quantify the distance of a testing example from the training
data in the input space, as a necessary condition for a DNN to achieve
high-quality (i.e., minimax optimal) uncertainty estimation. We then propose
Spectral-normalized Neural Gaussian Process (SNGP), a simple method that
improves the distance-awareness ability of modern DNNs, by adding a weight
normalization step during training and replacing the output layer with a
Gaussian process. On a suite of vision and language understanding tasks and on
modern architectures (Wide-ResNet and BERT), SNGP is competitive with deep
ensembles in prediction, calibration and out-of-domain detection, and
outperforms the other single-model approaches.
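A stripped-down sketch of the two ingredients named above: spectral normalisation on hidden layers and a random-feature Gaussian-process output layer. Dimensions are arbitrary, and the Laplace-approximated posterior covariance used by SNGP is omitted; this is an illustrative assumption-laden simplification, not the paper's implementation.

```python
import math
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

class TinySNGP(nn.Module):
    """Distance-aware classifier: spectral-normalised hidden layers followed by
    a random-Fourier-feature approximation of a Gaussian process output layer."""
    def __init__(self, in_dim=32, hidden=128, num_features=256, num_classes=10):
        super().__init__()
        self.body = nn.Sequential(
            spectral_norm(nn.Linear(in_dim, hidden)), nn.ReLU(),
            spectral_norm(nn.Linear(hidden, hidden)), nn.ReLU(),
        )
        # Fixed random projection for the random-feature approximation of an RBF kernel.
        self.register_buffer("W", torch.randn(hidden, num_features))
        self.register_buffer("b", 2 * math.pi * torch.rand(num_features))
        self.out = nn.Linear(num_features, num_classes)
        self.num_features = num_features

    def forward(self, x):
        h = self.body(x)
        phi = math.sqrt(2.0 / self.num_features) * torch.cos(h @ self.W + self.b)
        return self.out(phi)

logits = TinySNGP()(torch.randn(4, 32))
```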
Understanding Softmax Confidence and Uncertainty
It is often remarked that neural networks fail to increase their uncertainty
when predicting on data far from the training distribution. Yet naively using
softmax confidence as a proxy for uncertainty achieves modest success in tasks
exclusively testing for this, e.g., out-of-distribution (OOD) detection. This
paper investigates this contradiction, identifying two implicit biases that do
encourage softmax confidence to correlate with epistemic uncertainty: 1)
Approximately optimal decision boundary structure, and 2) Filtering effects of
deep networks. It describes why low-dimensional intuitions about softmax
confidence are misleading. Diagnostic experiments quantify reasons softmax
confidence can fail, finding that extrapolations are less to blame than overlap
between training and OOD data in final-layer representations.
Pre-trained/fine-tuned networks reduce this overlap.
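A minimal sketch of the baseline behaviour the paper analyses: using the maximum softmax probability as a confidence score and thresholding it for OOD detection. The threshold value is a hypothetical placeholder that would normally be chosen on validation data.

```python
import torch
import torch.nn.functional as F

def max_softmax_confidence(logits):
    """Maximum softmax probability, used as a crude proxy for certainty:
    lower values are treated as more likely to be out-of-distribution."""
    return F.softmax(logits, dim=-1).max(dim=-1).values

# Score a batch of logits and flag the least confident inputs as OOD.
logits = torch.randn(16, 10)
scores = max_softmax_confidence(logits)
is_ood = scores < 0.5  # hypothetical threshold
```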
Advances in Electron Microscopy with Deep Learning
This doctoral thesis covers some of my advances in electron microscopy with
deep learning. Highlights include a comprehensive review of deep learning in
electron microscopy; large new electron microscopy datasets for machine
learning, dataset search engines based on variational autoencoders, and
automatic data clustering by t-distributed stochastic neighbour embedding;
adaptive learning rate clipping to stabilize learning; generative adversarial
networks for compressed sensing with spiral, uniformly spaced and other fixed
sparse scan paths; recurrent neural networks trained to piecewise adapt sparse
scan paths to specimens by reinforcement learning; improving signal-to-noise
ratios;
and conditional generative adversarial networks for exit wavefunction
reconstruction from single transmission electron micrographs. This thesis adds
to my publications by presenting their relationships, reflections, and holistic
conclusions. This copy of my thesis is typeset for online dissemination to
improve readability, whereas the thesis submitted to the University of Warwick
in support of my application for the degree of Doctor of Philosophy in Physics
will be typeset for physical printing and binding.
Comment: 295 pages, PhD thesis, 100 figures + 12 tables, papers are compressed
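As a small illustration of the automatic clustering step mentioned among the highlights, the sketch below projects placeholder embeddings with t-SNE via scikit-learn. It is not the thesis's pipeline; the VAE encodings are simulated with random data.

```python
import numpy as np
from sklearn.manifold import TSNE

# Placeholder embeddings standing in for latent codes of electron micrographs.
embeddings = np.random.rand(500, 64)

# Project to 2-D with t-SNE; clusters in this map can then be inspected or
# grouped automatically to help organise large image datasets.
tsne = TSNE(n_components=2, perplexity=30, init="pca", random_state=0)
coords = tsne.fit_transform(embeddings)  # shape (500, 2)
```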