Variational aleatoric uncertainty calibration in neural regression
Calibrated and reliable confidence measures are a prerequisite for most robotics perception systems since they are needed by sensor fusion and planning components downstream. This is particularly true in the case of safety-critical applications such as self-driving cars. In the context of deep learning, the sources of predictive uncertainty are categorized into epistemic and aleatoric uncertainty. There is also distributional uncertainty associated with out-of-distribution data. Epistemic uncertainty, also known as knowledge uncertainty, arises because of noise in the model structure and parameters, and can be reduced with more labeled data.
Aleatoric uncertainty represents the inherent ambiguity in the input data and is generally irreducible in nature. Several methods exist for estimating aleatoric uncertainty through modified network structures or loss functions. However, in general, these methods lack calibration, meaning that the estimated uncertainties do not accurately represent the empirical data uncertainty. Current approaches to calibrating aleatoric uncertainty either require a held-out calibration dataset or modify the model parameters post-training. Moreover, many approaches add extra computation at inference time. To alleviate these issues, this thesis proposes a simple and effective method for training a calibrated neural regressor, designed from the first principles of calibration. Our key insight is that calibration can be achieved by imposing constraints across multiple examples, such as those in a mini-batch, as opposed to existing approaches that only impose constraints on a per-sample basis. By enforcing the distribution of outputs of the neural regressor (the proposal distribution) to resemble a target distribution by minimizing an f-divergence, we obtain significantly better-calibrated models compared to prior approaches. Our approach, f-Cal, is simple to implement or add to existing models and outperforms existing calibration methods on the large-scale real-world tasks of object detection and depth estimation. f-Cal can be implemented in 10-15 lines of PyTorch code and can be integrated with any probabilistic neural regressor in a minimally invasive way. This thesis also explores the estimation of distributional uncertainty for object detection, employing methods designed for classification setups. In particular, we attempt to detect out-of-distribution (OOD) samples, examples that are not part of the training data distribution. We establish a background-OOD problem that hampers the applicability of distributional uncertainty methods in object detection specifically.
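Since the abstract notes that f-Cal fits in 10-15 lines of PyTorch, a minimal sketch of the batch-level idea may help. It illustrates the principle only, not the thesis's exact loss: the z-scores of a calibrated Gaussian regressor should look like draws from N(0, 1) across a mini-batch, so a moment-matched KL divergence (one member of the f-divergence family) is added to the usual negative log-likelihood. The weight lam, the Gaussian target, and the moment-matching step are assumptions made here for illustration.

    import torch

    def f_cal_style_loss(mu, sigma, y, lam=1.0):
        # Gaussian negative log-likelihood (per-sample aleatoric term).
        nll = 0.5 * (torch.log(sigma ** 2) + (y - mu) ** 2 / sigma ** 2).mean()
        # Batch-level calibration term: the z-scores of a calibrated
        # regressor should be distributed as N(0, 1) across the mini-batch.
        z = (y - mu) / sigma
        m, v = z.mean(), z.var()
        # Closed-form KL( N(m, v) || N(0, 1) ), a convenient f-divergence choice.
        kl = 0.5 * (v + m ** 2 - 1.0 - torch.log(v))
        return nll + lam * kl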
Thin and Deep Gaussian Processes
Gaussian processes (GPs) can provide a principled approach to uncertainty
quantification with easy-to-interpret kernel hyperparameters, such as the
lengthscale, which controls the correlation distance of function values.
However, selecting an appropriate kernel can be challenging. Deep GPs avoid
manual kernel engineering by successively parameterizing kernels with GP
layers, allowing them to learn low-dimensional embeddings of the inputs that
explain the output data. Following the architecture of deep neural networks,
the most common deep GPs warp the input space layer-by-layer but lose all the
interpretability of shallow GPs. An alternative construction is to successively
parameterize the lengthscale of a kernel, improving the interpretability but
ultimately giving up the notion of learning lower-dimensional embeddings.
Unfortunately, both methods are susceptible to particular pathologies which may
hinder fitting and limit their interpretability. This work proposes a novel
synthesis of both previous approaches: Thin and Deep GP (TDGP). Each TDGP layer
defines locally linear transformations of the original input data maintaining
the concept of latent embeddings while also retaining the interpretation of
lengthscales of a kernel. Moreover, unlike the prior solutions, TDGP induces
non-pathological manifolds that admit learning lower-dimensional
representations. We show with theoretical and experimental results that i) TDGP
is, unlike previous models, tailored to specifically discover lower-dimensional
manifolds in the input data, ii) TDGP behaves well when increasing the number
of layers, and iii) TDGP performs well on standard benchmark datasets.
Comment: Accepted at the Conference on Neural Information Processing Systems (NeurIPS) 2023.
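To make the locally linear transformations concrete, the NumPy sketch below embeds each input through an input-dependent linear map W(x) before applying a unit-lengthscale RBF kernel to the embeddings. In TDGP itself, W is GP-distributed and built layer by layer; the deterministic W_fn and the toy map in the usage lines are stand-ins assumed here purely for illustration.

    import numpy as np

    def tdgp_style_gram(X, W_fn):
        # Embed each input through its own locally linear map W(x) @ x,
        # then evaluate a unit-lengthscale RBF kernel on the embeddings.
        Z = np.stack([W_fn(x) @ x for x in X])
        sq = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * sq)

    X = np.random.randn(5, 3)                                 # 5 inputs in R^3
    A = np.random.randn(2, 3)                                 # base projection to R^2
    K = tdgp_style_gram(X, lambda x: A * np.tanh(x).sum())    # toy input-dependent map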
Doubly Stochastic Variational Inference for Deep Gaussian Processes
Gaussian processes (GPs) are a good choice for function approximation as they
are flexible, robust to over-fitting, and provide well-calibrated predictive
uncertainty. Deep Gaussian processes (DGPs) are multi-layer generalisations of
GPs, but inference in these models has proved challenging. Existing approaches
to inference in DGP models assume approximate posteriors that force
independence between the layers, and do not work well in practice. We present a
doubly stochastic variational inference algorithm, which does not force
independence between layers. With our method of inference we demonstrate that a
DGP model can be used effectively on data ranging in size from hundreds to a
billion points. We provide strong empirical evidence that our inference scheme
for DGPs works well in practice in both classification and regression.
Comment: NIPS 2017.
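The "doubly stochastic" recipe can be stated in a few lines: one source of randomness is mini-batch subsampling, the other is sampling each layer's function values conditioned on the sampled values of the layer below, which is precisely what avoids forcing independence between layers. The layer.sample, layer.kl, and likelihood.log_prob interfaces in this sketch are hypothetical stand-ins for a real sparse variational GP implementation, not the paper's code.

    import torch

    def dgp_elbo_estimate(x_batch, y_batch, layers, likelihood, n_data):
        f = x_batch
        for layer in layers:
            # Reparameterized draw from q(f_l | f_{l-1}): the sample of one
            # layer feeds the next, so the layers stay statistically coupled.
            f = layer.sample(f)
        # Mini-batch estimate of the expected log-likelihood, rescaled to
        # the full dataset size (the second source of stochasticity).
        ell = likelihood.log_prob(f, y_batch).mean() * n_data
        # KL terms of the per-layer inducing-point posteriors.
        kl = sum(layer.kl() for layer in layers)
        return ell - kl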
Bayesian Image Quality Transfer with CNNs: Exploring Uncertainty in dMRI Super-Resolution
In this work, we investigate the value of uncertainty modeling in 3D
super-resolution with convolutional neural networks (CNNs). Deep learning has
shown success in a plethora of medical image transformation problems, such as
super-resolution (SR) and image synthesis. However, the highly ill-posed nature
of such problems results in inevitable ambiguity in the learning of networks.
We propose to account for intrinsic uncertainty through a per-patch
heteroscedastic noise model and for parameter uncertainty through approximate
Bayesian inference in the form of variational dropout. We show that the
combined benefits of both lead to state-of-the-art SR performance on
diffusion MR brain images in terms of errors compared to ground truth. We
further show that the reduced error scores produce tangible benefits in
downstream tractography. In addition, the probabilistic nature of the methods
naturally confers a mechanism to quantify uncertainty over the super-resolved
output. We demonstrate through experiments on both healthy and pathological
brains the potential utility of such an uncertainty measure in the risk
assessment of the super-resolved images for subsequent clinical use.
Comment: Accepted paper at MICCAI 2017.
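As a hedged illustration of how the two uncertainty components could be combined, the toy PyTorch module below pairs a mean head with a log-variance head (a per-voxel heteroscedastic noise model) and keeps dropout stochastic at prediction time so that repeated forward passes sample parameter uncertainty. The paper uses variational dropout proper rather than plain MC dropout, and the architecture and names here are invented for illustration.

    import torch
    import torch.nn as nn

    class HeteroSRNet(nn.Module):
        def __init__(self, c=1):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv3d(c, 16, 3, padding=1), nn.ReLU(), nn.Dropout3d(0.1))
            self.mean_head = nn.Conv3d(16, c, 3, padding=1)    # predicted mean
            self.logvar_head = nn.Conv3d(16, c, 3, padding=1)  # per-voxel log-variance

        def forward(self, x):
            h = self.body(x)
            return self.mean_head(h), self.logvar_head(h)

    def hetero_nll(mu, logvar, y):
        # Gaussian NLL with heteroscedastic (input-dependent) variance.
        return 0.5 * (logvar + (y - mu) ** 2 / logvar.exp()).mean()

    def mc_predict(net, x, T=20):
        net.train()  # keep dropout active (MC stand-in for variational dropout)
        with torch.no_grad():
            mus, logvars = zip(*(net(x) for _ in range(T)))
        mu = torch.stack(mus).mean(0)
        # Total variance = epistemic (spread of means) + aleatoric (mean predicted variance).
        var = torch.stack(mus).var(0) + torch.stack(logvars).exp().mean(0)
        return mu, var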