630 research outputs found

    Variational aleatoric uncertainty calibration in neural regression

    Calibrated and reliable confidence measures are a prerequisite for most robotics perception systems, since they are needed by the sensor fusion and planning components downstream. This is particularly true for safety-critical applications such as self-driving cars. In the context of deep learning, predictive uncertainty is categorized into epistemic and aleatoric uncertainty; there is also distributional uncertainty, associated with out-of-distribution data. Epistemic uncertainty, also known as knowledge uncertainty, arises from noise in the model structure and parameters and can be reduced with more labeled data. Aleatoric uncertainty represents the inherent ambiguity in the input data and is generally irreducible in nature. Several methods exist for estimating aleatoric uncertainty through modified network structures or loss functions. In general, however, these methods lack calibration, meaning that the estimated uncertainties do not accurately represent the empirical data uncertainty. Current approaches to calibrating aleatoric uncertainty either require a held-out calibration dataset or modify the model parameters post-training, and many add extra computation at inference time. To alleviate these issues, this thesis proposes a simple and effective method for training a calibrated neural regressor, designed from the first principles of calibration. The key insight is that calibration can be achieved by imposing constraints across multiple examples, such as those in a mini-batch, as opposed to existing approaches that impose constraints only on a per-sample basis. By enforcing the distribution of outputs of the neural regressor (the proposal distribution) to resemble a target distribution through minimizing an f-divergence, we obtain significantly better-calibrated models than prior approaches. Our approach, f-Cal, is simple to implement or add to existing models and outperforms existing calibration methods on the large-scale real-world tasks of object detection and depth estimation. f-Cal can be implemented in 10-15 lines of PyTorch code and can be integrated with any probabilistic neural regressor in a minimally invasive way. This thesis also explores the estimation of distributional uncertainty for object detection, employing methods designed for classification setups. In particular, we attempt to detect out-of-distribution (OOD) samples, i.e. examples that are not part of the training data distribution, and we establish a background-OOD problem that hampers the applicability of distributional uncertainty methods specifically in object detection.
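    The batch-level constraint described above lends itself to a compact loss. Below is a hypothetical PyTorch sketch, not the thesis's exact implementation: it combines the usual Gaussian negative log-likelihood with a penalty that pushes the mini-batch of normalized residuals toward a standard normal by matching its first two moments, a simple stand-in for the f-divergence term; mu, sigma, y, and lam are illustrative names.

```python
import torch

def fcal_style_loss(mu, sigma, y, lam=1.0):
    """Gaussian NLL plus a batch-level calibration penalty (sketch only)."""
    # Per-sample negative log-likelihood of a heteroscedastic Gaussian regressor.
    nll = 0.5 * (torch.log(sigma ** 2) + (y - mu) ** 2 / sigma ** 2).mean()

    # Batch-level constraint: normalized residuals of a calibrated model should
    # look like draws from N(0, 1); matching mean and variance is a simple
    # stand-in for the f-divergence used in f-Cal.
    z = (y - mu) / sigma
    calib = z.mean() ** 2 + (z.var(unbiased=False) - 1.0) ** 2

    return nll + lam * calib
```

    During training, mu and sigma would be the regressor's predicted means and standard deviations for a mini-batch, and lam trades off data fit against the calibration penalty.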

    Thin and Deep Gaussian Processes

    Gaussian processes (GPs) can provide a principled approach to uncertainty quantification with easy-to-interpret kernel hyperparameters, such as the lengthscale, which controls the correlation distance of function values. However, selecting an appropriate kernel can be challenging. Deep GPs avoid manual kernel engineering by successively parameterizing kernels with GP layers, allowing them to learn low-dimensional embeddings of the inputs that explain the output data. Following the architecture of deep neural networks, the most common deep GPs warp the input space layer by layer but lose all the interpretability of shallow GPs. An alternative construction is to successively parameterize the lengthscale of a kernel, improving interpretability but ultimately giving up the notion of learning lower-dimensional embeddings. Unfortunately, both methods are susceptible to particular pathologies, which may hinder fitting and limit their interpretability. This work proposes a novel synthesis of both previous approaches: Thin and Deep GP (TDGP). Each TDGP layer defines locally linear transformations of the original input data, maintaining the concept of latent embeddings while also retaining the interpretation of the lengthscales of a kernel. Moreover, unlike the prior solutions, TDGP induces non-pathological manifolds that admit learning lower-dimensional representations. We show with theoretical and experimental results that i) TDGP is, unlike previous models, tailored to specifically discover lower-dimensional manifolds in the input data, ii) TDGP behaves well when increasing the number of layers, and iii) TDGP performs well on standard benchmark datasets. Comment: Accepted at the Conference on Neural Information Processing Systems (NeurIPS) 2023.
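    To make the input-dependent-lengthscale idea above concrete, here is a small NumPy sketch of a generic non-stationary (Gibbs-type) RBF kernel whose lengthscale is itself a function of the input; it only illustrates the family of constructions TDGP builds on and is not TDGP's locally linear transformation itself. The function names and the example lengthscale function are illustrative.

```python
import numpy as np

def gibbs_kernel(x1, x2, lengthscale_fn):
    """Non-stationary RBF (Gibbs) kernel with input-dependent lengthscales.

    x1, x2: 1-D arrays of inputs; lengthscale_fn maps inputs to positive
    lengthscales. Illustrative only, not TDGP's construction.
    """
    l1 = lengthscale_fn(x1)[:, None]          # shape (N, 1)
    l2 = lengthscale_fn(x2)[None, :]          # shape (1, M)
    sq_sum = l1 ** 2 + l2 ** 2
    prefactor = np.sqrt(2.0 * l1 * l2 / sq_sum)
    sq_dist = (x1[:, None] - x2[None, :]) ** 2
    return prefactor * np.exp(-sq_dist / sq_sum)

# Example: the lengthscale varies smoothly with the input location.
x = np.linspace(-3.0, 3.0, 50)
K = gibbs_kernel(x, x, lambda t: 0.5 + 0.4 * np.tanh(t))
```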

    Doubly Stochastic Variational Inference for Deep Gaussian Processes

    Gaussian processes (GPs) are a good choice for function approximation as they are flexible, robust to over-fitting, and provide well-calibrated predictive uncertainty. Deep Gaussian processes (DGPs) are multi-layer generalisations of GPs, but inference in these models has proved challenging. Existing approaches to inference in DGP models assume approximate posteriors that force independence between the layers, and do not work well in practice. We present a doubly stochastic variational inference algorithm which does not force independence between layers. With our method of inference we demonstrate that a DGP model can be used effectively on data ranging in size from hundreds to a billion points. We provide strong empirical evidence that our inference scheme for DGPs works well in practice in both classification and regression. Comment: NIPS 2017.
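    The sampling step that avoids forcing independence between layers can be pictured with a short sketch. The NumPy code below, assuming hypothetical per-layer functions that return a predictive mean and variance (standing in for the variational GP posteriors), propagates a sample through the layers so that each layer is conditioned on a draw from the previous one rather than on a fixed mean.

```python
import numpy as np

def sample_through_layers(x, layers, rng=None):
    """Propagate one Monte Carlo sample through DGP layers (sketch only).

    Each element of `layers` is assumed to return the predictive mean and
    variance of its (variational) posterior at the current inputs; these
    stand-ins are hypothetical, not the paper's implementation.
    """
    rng = rng or np.random.default_rng(0)
    f = x
    for predict_mean_var in layers:
        mean, var = predict_mean_var(f)
        # Sample the layer output instead of passing the mean forward,
        # so correlations between layers are retained.
        f = mean + np.sqrt(var) * rng.standard_normal(mean.shape)
    return f

# Toy usage with stand-in layers (not real GP posteriors).
layers = [lambda f: (np.sin(f), 0.01 * np.ones_like(f)),
          lambda f: (0.5 * f, 0.05 * np.ones_like(f))]
print(sample_through_layers(np.array([0.3, 1.2]), layers))
```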

    Bayesian Image Quality Transfer with CNNs: Exploring Uncertainty in dMRI Super-Resolution

    In this work, we investigate the value of uncertainty modeling in 3D super-resolution with convolutional neural networks (CNNs). Deep learning has shown success in a plethora of medical image transformation problems, such as super-resolution (SR) and image synthesis. However, the highly ill-posed nature of such problems results in inevitable ambiguity in the learning of networks. We propose to account for intrinsic uncertainty through a per-patch heteroscedastic noise model and for parameter uncertainty through approximate Bayesian inference in the form of variational dropout. We show that the combined benefits of both lead to state-of-the-art SR performance on diffusion MR brain images in terms of errors compared to ground truth. We further show that the reduced error scores produce tangible benefits in downstream tractography. In addition, the probabilistic nature of the methods naturally confers a mechanism to quantify uncertainty over the super-resolved output. We demonstrate, through experiments on both healthy and pathological brains, the potential utility of such an uncertainty measure in the risk assessment of the super-resolved images for subsequent clinical use. Comment: Accepted paper at MICCAI 2017.
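    As a rough illustration of the two uncertainty components described above, the PyTorch sketch below pairs a per-voxel heteroscedastic head (predicting a mean and a log-variance) with dropout layers that could be kept active at test time; ordinary dropout is used here as a simpler stand-in for the variational dropout in the paper, and the architecture, layer sizes, and names are illustrative rather than the authors'.

```python
import torch
import torch.nn as nn

class HeteroscedasticSRHead(nn.Module):
    """Toy 3D CNN that outputs a per-voxel mean and log-variance (sketch only)."""

    def __init__(self, channels=16):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(1, channels, 3, padding=1), nn.ReLU(),
            nn.Dropout3d(p=0.1),   # kept active at test time for MC sampling
            nn.Conv3d(channels, channels, 3, padding=1), nn.ReLU(),
        )
        self.mean_head = nn.Conv3d(channels, 1, 1)
        self.logvar_head = nn.Conv3d(channels, 1, 1)

    def forward(self, x):
        h = self.body(x)
        return self.mean_head(h), self.logvar_head(h)

def heteroscedastic_nll(mean, logvar, target):
    # Gaussian negative log-likelihood with a per-voxel variance.
    return 0.5 * (logvar + (target - mean) ** 2 * torch.exp(-logvar)).mean()
```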
    • 

    corecore