Uncertainty in multitask learning: joint representations for probabilistic MR-only radiotherapy planning
Multi-task neural network architectures provide a mechanism that jointly
integrates information from distinct sources. This makes them well suited to
MR-only radiotherapy planning, where a single network can jointly regress a
synthetic CT (synCT) scan and segment organs-at-risk (OAR) from MRI. We propose a probabilistic
multi-task network that estimates: 1) intrinsic uncertainty through a
heteroscedastic noise model for spatially-adaptive task loss weighting and 2)
parameter uncertainty through approximate Bayesian inference. This allows
sampling of multiple segmentations and synCTs that share their network
representation. We test our model on prostate cancer scans and show that it
produces more accurate and consistent synCTs with better-calibrated error
variances, state-of-the-art results in OAR segmentation, and a methodology for
quality assurance in radiotherapy treatment planning.
Comment: Early-accept at MICCAI 2018, 8 pages, 4 figures
Mind the nuisance: Gaussian process classification using privileged noise
The learning with privileged information setting has recently attracted a lot of attention within the machine learning community, as it allows the integration of additional knowledge into the training process of a classifier, even when this knowledge comes in the form of a data modality that is not available at test time. Here, we show that privileged information can naturally be treated as noise in the latent function of a Gaussian process classifier (GPC). That is, in contrast to the standard GPC setting, the latent-function noise is not just a nuisance but a feature: it becomes a natural measure of confidence about the training data by modulating the slope of the GPC probit likelihood function. Extensive experiments on public datasets show that the proposed GPC method using privileged noise, called GPC+, improves over a standard GPC without privileged knowledge, and also over the current state-of-the-art SVM-based method, SVM+. Moreover, we show that advanced neural networks and deep learning methods can be compressed as privileged information.
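The slope modulation described above has a simple closed form. The sketch below is an illustrative assumption (the function name and the way `sigma_star` is obtained are not from the abstract): a probit likelihood whose scale is an input-dependent noise level derived from the privileged modality, so a noisy example gets a flatter, less confident likelihood.

```python
import math

def privileged_probit(f, sigma_star):
    """Probit likelihood p(y=1 | f) = Phi(f / sigma_star).

    sigma_star is a per-example noise level inferred from the privileged
    modality: a large sigma_star flattens the likelihood slope, expressing
    low confidence in that training example, while a small sigma_star
    recovers a sharp, near-deterministic decision.
    """
    return 0.5 * (1.0 + math.erf(f / (sigma_star * math.sqrt(2.0))))
```

At `f = 0` the likelihood is 0.5 regardless of the noise; away from zero, increasing `sigma_star` pulls the likelihood back toward the uninformative 0.5.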
Understanding and Comparing Scalable Gaussian Process Regression for Big Data
As a non-parametric Bayesian model that produces an informative predictive
distribution, the Gaussian process (GP) has been widely used in various fields,
such as regression, classification and optimization. The cubic complexity of
the standard GP, however, leads to poor scalability, which poses challenges in the
era of big data. Hence, various scalable GPs have been developed in the
literature in order to improve the scalability while retaining desirable
prediction accuracy. This paper investigates the methodological
characteristics and performance of representative global and local scalable GPs
including sparse approximations and local aggregations from four main
perspectives: scalability, capability, controllability and robustness. The
numerical experiments on two toy examples and five real-world datasets with up
to 250K points offer the following findings. In terms of scalability, most
scalable GPs have a time complexity that is linear in the training size. In
terms of capability, sparse approximations capture long-term spatial
correlations, while local aggregations capture local patterns but can
over-fit in some scenarios. In terms of controllability, we can improve
the performance of sparse approximations by simply increasing the inducing
size. But this is not the case for local aggregations. In terms of robustness,
local aggregations are robust to various initializations of hyperparameters due
to the local attention mechanism. Finally, we highlight that the proper hybrid
of global and local scalable GPs may be a promising way to improve both the
model capability and scalability for big data.
Comment: 25 pages, 15 figures, preprint submitted to KB
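The linear-in-n complexity of the sparse approximations surveyed above comes from working through a small set of inducing points. As a minimal sketch (the paper compares several families; the Subset-of-Regressors form, the RBF kernel, and all names below are illustrative choices), the predictive mean costs O(n m^2) for m inducing points instead of the O(n^3) of exact GP regression:

```python
import numpy as np

def rbf(a, b, ls=1.0):
    """Squared-exponential kernel matrix between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def sor_predict(x, y, z, xs, noise=0.1):
    """Subset-of-Regressors sparse GP predictive mean with inducing inputs z.

    Only an m x m system is solved (m = len(z)), so the cost is O(n m^2)
    rather than the O(n^3) of exact GP regression. With z equal to the full
    training set, the exact GP mean is recovered.
    """
    Kzx = rbf(z, x)
    A = rbf(z, z) * noise**2 + Kzx @ Kzx.T  # m x m system
    alpha = np.linalg.solve(A, Kzx @ y)
    return rbf(xs, z) @ alpha
```

Choosing how many inducing points to use is exactly the controllability knob the abstract mentions: increasing m trades compute for a tighter approximation.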
Probably Unknown: Deep Inverse Sensor Modelling In Radar
Radar presents a promising alternative to lidar and vision in autonomous
vehicle applications, able to detect objects at long range under a variety of
weather conditions. However, distinguishing between occupied and free space
from raw radar power returns is challenging due to complex interactions between
sensor noise and occlusion.
To counter this, we propose to learn an Inverse Sensor Model (ISM) that converts
a raw radar scan to a grid map of occupancy probabilities using a deep neural
network. Our network is self-supervised using partial occupancy labels
generated by lidar, allowing a robot to learn about world occupancy from past
experience without human supervision. We evaluate our approach on five hours of
data recorded in a dynamic urban environment. By accounting for the scene
context of each grid cell our model is able to successfully segment the world
into occupied and free space, outperforming standard CFAR filtering approaches.
Additionally by incorporating heteroscedastic uncertainty into our model
formulation, we are able to quantify the variance in the uncertainty throughout
the sensor observation. Through this mechanism we are able to successfully
identify regions of space that are likely to be occluded.
Comment: 6 full pages, 1 page of references
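The heteroscedastic formulation above can be sketched with a simple Monte Carlo estimate. This is an illustrative assumption rather than the paper's pipeline: suppose the network predicts a mean and variance for each grid cell's occupancy logit; sampling logits and averaging through a sigmoid yields both an occupancy probability and a per-cell variance, and occluded cells should show up with large variance.

```python
import numpy as np

def occupancy_with_uncertainty(logit_mean, logit_var, n_samples=1000, seed=0):
    """Monte Carlo occupancy probability and its variance per grid cell.

    A heteroscedastic model predicts a Gaussian over each cell's occupancy
    logit. Cells the sensor cannot observe directly (e.g. occluded regions)
    should receive a large predicted logit variance, which surfaces here as
    a large variance in the occupancy probability.
    """
    rng = np.random.default_rng(seed)
    shape = (n_samples,) + np.shape(logit_mean)
    logits = rng.normal(logit_mean, np.sqrt(logit_var), size=shape)
    probs = 1.0 / (1.0 + np.exp(-logits))
    return probs.mean(axis=0), probs.var(axis=0)
```

A cell with zero logit variance collapses to a deterministic sigmoid, while a cell with a wide logit distribution hovers near 0.5 occupancy with large variance, which is what flags it as unknown rather than confidently free or occupied.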
Deterministic variational inference for robust Bayesian neural networks
Bayesian neural networks (BNNs) hold great promise as a flexible and
principled solution to deal with uncertainty when learning from finite data.
Among approaches to realize probabilistic inference in deep neural networks,
variational Bayes (VB) is theoretically grounded, generally applicable, and
computationally efficient. Given these widely recognized advantages, why has
variational Bayes seen such limited practical use for BNNs in real
applications? We argue that variational inference in neural networks is
fragile: successful implementations require careful initialization and tuning
of prior variances, as well as controlling the variance of Monte Carlo gradient
estimates. We provide two innovations that aim to turn VB into a robust
inference tool for Bayesian neural networks: first, we introduce a novel
deterministic method to approximate moments in neural networks, eliminating
gradient variance; second, we introduce a hierarchical prior for parameters and
a novel Empirical Bayes procedure for automatically selecting prior variances.
Combining these two innovations, the resulting method is highly efficient and
robust. On heteroscedastic regression applications we demonstrate good
predictive performance compared with alternative approaches.
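Deterministic moment approximation, as opposed to Monte Carlo sampling, is possible because common activations admit closed-form Gaussian moments. As a hedged sketch of the idea (the paper's actual propagation scheme covers full layers; this shows only the well-known single-unit ReLU case), the mean and variance of ReLU(X) for X ~ N(mu, var) are exact:

```python
import math

def relu_moments(mu, var):
    """Closed-form mean and variance of ReLU(X) for X ~ N(mu, var).

    Propagating moments like this deterministically, layer by layer,
    eliminates the Monte Carlo gradient variance that makes variational
    inference in BNNs fragile.
    """
    s = math.sqrt(var)
    z = mu / s
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # N(0,1) pdf
    Phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # N(0,1) cdf
    mean = mu * Phi + s * phi
    second = (mu * mu + var) * Phi + mu * s * phi             # E[ReLU(X)^2]
    return mean, second - mean * mean
```

For a unit far into the linear regime (large positive mu), the output moments approach (mu, var), while a unit centred at zero keeps only the positive half-Gaussian mass, so no sampling is ever needed.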