
    Dropout Distillation for Efficiently Estimating Model Confidence

    We propose an efficient way to output better calibrated uncertainty scores from neural networks. The Distilled Dropout Network (DDN) makes standard (non-Bayesian) neural networks more introspective by adding a new training loss which prevents them from being overconfident. Our method is more efficient than Bayesian neural networks or model ensembles which, despite providing more reliable uncertainty scores, are more cumbersome to train and slower to test. We evaluate DDN on the task of image classification on the CIFAR-10 dataset and show that our calibration results are competitive with 100 Monte Carlo samples from a dropout network while also increasing classification accuracy. We also propose better calibration within the state-of-the-art Faster R-CNN object detection framework and show, using the COCO dataset, that DDN helps train better calibrated object detectors.
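    A minimal, hypothetical sketch of the distillation idea described in this abstract, in PyTorch-style Python: a deterministic student network is trained against the averaged predictive distribution of Monte Carlo dropout samples from a teacher, alongside the usual label loss. The function names, the weighting `alpha`, and the exact combination of terms are illustrative assumptions, not the paper's published formulation.

```python
# Hypothetical sketch of dropout distillation for calibration (not the paper's exact loss).
# A deterministic "student" matches the averaged predictive distribution of
# Monte Carlo dropout samples from a "teacher", in addition to the usual label loss.
import torch
import torch.nn.functional as F

def mc_dropout_targets(teacher, x, n_samples=100):
    """Average softmax over n_samples stochastic forward passes (dropout kept on)."""
    teacher.train()  # keep dropout active at inference time
    with torch.no_grad():
        probs = torch.stack([F.softmax(teacher(x), dim=1) for _ in range(n_samples)])
    return probs.mean(dim=0)

def ddn_loss(student_logits, labels, teacher_probs, alpha=0.5):
    """Cross-entropy on labels plus KL to the MC dropout predictive distribution."""
    ce = F.cross_entropy(student_logits, labels)
    kl = F.kl_div(F.log_softmax(student_logits, dim=1), teacher_probs, reduction="batchmean")
    return (1 - alpha) * ce + alpha * kl
```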

    Predicting and improving perception performance for robotics applications

    Perception systems are often the core component of a robotics framework as their ability to accurately interpret sensor data is essential for autonomy. The goal of this thesis is to estimate and improve the perception performance of a mobile robot across large areas of operation, particularly when there are no guarantees that the testing data distribution will match the training distribution. Such situations are prevalent for autonomous mobile robots operating outdoors under a variety of environmental conditions. This thesis explores the adaptability of vision systems by training place-specific models which outperform generic ones. We show that it is possible to train such models in a self-supervised fashion using geometric scene constraints without relying on costly image annotations. This thesis also explores the awareness that vision systems have of their own capability to make correct predictions at any given moment in time. We approach this problem from two different vantage points: firstly, through performance records which model perception performance as a function of location and appearance and, secondly, through intrinsic model uncertainty, or introspection as introduced by [Grimmett et al., 2016]. Performance records allow an autonomous agent to estimate the likelihood of making a mistake during future traversals of the same place. In a use-case scenario regarding offering or denying autonomy, we show that an agent is able to estimate when its confidence levels are low, deny autonomy, and reduce the number of perception mistakes made. Introspection refers to the ability of a model to associate an appropriate assessment of confidence with any test case. We introduce an efficient way to obtain well-calibrated and reliable uncertainty scores from neural networks. Our method is more computationally efficient than Bayesian neural networks or model ensembles which, despite being well-calibrated, are more cumbersome to train and slower to test. Additionally, we believe that we are the first to propose more introspective detectors within a state-of-the-art object detection framework such as Faster R-CNN. This thesis proposes vision systems that are not only more accurate but also whose failures can be more reliably predicted. In doing so, we advocate practical solutions that often make use of tools specific to robotics such as additional sensing modalities or localisation maps pertaining to an autonomous vehicle, but we also touch upon machine learning techniques such as Bayesian deep learning. While striving for high accuracy remains a crucial endeavour, given the safety-critical nature of robot perception, we believe that estimating reliability, introspection, and diagnosing failure are indispensable when operating in cluttered, complex, and ever-changing environments.
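    As a hedged illustration of the "performance records" idea mentioned in this abstract (not the thesis's actual implementation), the sketch below keeps a per-place tally of past perception successes and denies autonomy when the smoothed success estimate falls below a threshold. The class name, smoothing prior, and threshold value are all assumptions made for the example.

```python
# Illustrative sketch: a per-place performance record that estimates the probability
# of a perception mistake at a given place and gates the offer of autonomy on it.
from collections import defaultdict

class PerformanceRecord:
    def __init__(self):
        self.stats = defaultdict(lambda: {"correct": 0, "total": 0})

    def update(self, place_id, was_correct):
        s = self.stats[place_id]
        s["total"] += 1
        s["correct"] += int(was_correct)

    def success_rate(self, place_id, prior=0.5, prior_weight=2.0):
        # Smoothed estimate so rarely visited places fall back towards the prior.
        s = self.stats[place_id]
        return (s["correct"] + prior * prior_weight) / (s["total"] + prior_weight)

def offer_autonomy(record, place_id, threshold=0.9):
    """Deny autonomy when the estimated perception success rate is below threshold."""
    return record.success_rate(place_id) >= threshold
```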

    Probabilistic Future Prediction for Video Scene Understanding

    We present a novel deep learning architecture for probabilistic future prediction from video. We predict the future semantics, geometry and motion of complex real-world urban scenes and use this representation to control an autonomous vehicle. This work is the first to jointly predict ego-motion, static scene, and the motion of dynamic agents in a probabilistic manner, which allows sampling consistent, highly probable futures from a compact latent space. Our model learns a representation from RGB video with a spatio-temporal convolutional module. The learned representation can be explicitly decoded to future semantic segmentation, depth, and optical flow, in addition to being an input to a learnt driving policy. To model the stochasticity of the future, we introduce a conditional variational approach which minimises the divergence between the present distribution (what could happen given what we have seen) and the future distribution (what we observe actually happens). During inference, diverse futures are generated by sampling from the present distribution.
    Toshiba Europe, grant G10045
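    The conditional variational objective described above can be sketched, under assumptions, as a divergence between a "future" distribution (conditioned on observed and future frames) and a "present" distribution (conditioned on observed frames only), with futures sampled from the present distribution at test time. The Gaussian parameterisation, the KL direction, and the function signature below are illustrative guesses, not the paper's exact formulation.

```python
# Hedged sketch of the divergence term described above (not the paper's code).
# "Present" is conditioned on observed frames only; "future" also sees what happens next.
# Training pulls the present distribution towards observed futures; at test time,
# diverse futures are sampled from the present distribution alone.
import torch.distributions as D

def present_future_divergence(mu_present, sigma_present, mu_future, sigma_future):
    present = D.Normal(mu_present, sigma_present)   # conditioned on observed frames
    future = D.Normal(mu_future, sigma_future)      # conditioned on observed + future frames
    # Sum the per-dimension KL over the latent dimensions, average over the batch.
    return D.kl_divergence(future, present).sum(dim=-1).mean()
```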

    HerbDisc: Towards Lifelong Robotic Object Discovery

    Our long-term goal is to develop a general solution to the Lifelong Robotic Object Discovery (LROD) problem: to discover new objects in the environment while the robot operates, for as long as the robot operates. In this paper, we consider the first step towards LROD: we automatically process the raw data stream of an entire workday of a robotic agent to discover objects. Our key contribution to achieve this goal is to incorporate domain knowledge—robotic metadata—in the discovery process, in addition to visual data. We propose a general graph-based formulation for LROD in which generic domain knowledge is encoded as constraints. To make long-term object discovery feasible, we encode into our formulation the natural constraints and non-visual sensory information in service robotics. A key advantage of our generic formulation is that we can add or modify sources of domain knowledge dynamically, as they become available or as conditions change.

    Exploiting Domain Knowledge for Object Discovery

    In this paper, we consider the problem of Lifelong Robotic Object Discovery (LROD) as the long-term goal of discovering novel objects in the environment while the robot operates, for as long as the robot operates. As a first step towards LROD, we automatically process the raw video stream of an entire workday of a robotic agent to discover objects. We claim that the key to achieving this goal is to incorporate domain knowledge whenever available, in order to detect and adapt to changes in the environment. We propose a general graph-based formulation for LROD in which generic domain knowledge is encoded as constraints. Our formulation enables new sources of domain knowledge—metadata—to be added dynamically to the system, as they become available or as conditions change. By adding domain knowledge, we discover 2.7× more objects and decrease processing time 190 times. Our optimized implementation, HerbDisc, processes 6 h 20 min of RGBD video of real human environments in 18 min 30 s, and discovers 121 correct novel objects with their 3D models.
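    A rough, hypothetical sketch of such a graph-based formulation with metadata constraints might look like the following: candidate object observations become nodes, and an edge is kept only when visual similarity is high and every available domain-knowledge constraint agrees, after which connected components yield object candidates. The candidate fields (`room`, `timestamp`), the constraint functions, and the use of networkx are assumptions for illustration, not HerbDisc's actual encoding.

```python
# Illustrative sketch (not HerbDisc's implementation): candidate object observations become
# graph nodes; an edge is kept only if visual similarity is high AND every available
# metadata constraint (e.g. same room, close in time) is satisfied.
import itertools
import networkx as nx

def build_discovery_graph(candidates, visual_similarity, constraints, sim_threshold=0.7):
    """candidates: list of dicts with the fields used by the constraint functions (assumed)."""
    g = nx.Graph()
    g.add_nodes_from(range(len(candidates)))
    for i, j in itertools.combinations(range(len(candidates)), 2):
        if visual_similarity(candidates[i], candidates[j]) < sim_threshold:
            continue
        if all(c(candidates[i], candidates[j]) for c in constraints):
            g.add_edge(i, j)
    return g

# Example metadata constraints that can be added or removed as conditions change:
same_room = lambda a, b: a["room"] == b["room"]
close_in_time = lambda a, b: abs(a["timestamp"] - b["timestamp"]) < 300.0
# clusters = nx.connected_components(build_discovery_graph(cands, sim, [same_room, close_in_time]))
```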
