
    Guided Curriculum Model Adaptation and Uncertainty-Aware Evaluation for Semantic Nighttime Image Segmentation

    Most progress in semantic segmentation is reported on daytime images taken under favorable illumination conditions. We instead address the problem of semantic segmentation of nighttime images and improve the state of the art by adapting daytime models to nighttime without using nighttime annotations. Moreover, we design a new evaluation framework to address the substantial uncertainty of semantics in nighttime images. Our central contributions are: 1) a curriculum framework to gradually adapt semantic segmentation models from day to night via labeled synthetic images and unlabeled real images, both for progressively darker times of day, which exploits cross-time-of-day correspondences for the real images to guide the inference of their labels; 2) a novel uncertainty-aware annotation and evaluation framework and metric for semantic segmentation, designed for adverse conditions and including image regions beyond human recognition capability in the evaluation in a principled fashion; 3) the Dark Zurich dataset, which comprises 2416 unlabeled nighttime and 2920 unlabeled twilight images with correspondences to their daytime counterparts plus a set of 151 nighttime images with fine pixel-level annotations created with our protocol, which serves as a first benchmark to perform our novel evaluation. Experiments show that our guided curriculum adaptation significantly outperforms state-of-the-art methods on real nighttime sets both for standard metrics and our uncertainty-aware metric. Furthermore, our uncertainty-aware evaluation reveals that selective invalidation of predictions can lead to better results on data with ambiguous content such as our nighttime benchmark and can benefit safety-oriented applications that involve invalid inputs.
    Comment: ICCV 2019 camera-ready
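    The cross-time-of-day guidance in contribution 1) can be illustrated with a small sketch. The snippet below is a minimal, hypothetical illustration (not the authors' code): for each pixel of a real nighttime image, a pseudo-label is taken from the model's prediction on the aligned daytime counterpart whenever that prediction is more confident, and low-confidence pixels are ignored. The function name, the confidence margin, and the 19-class setup are assumptions for illustration.

```python
# Hypothetical sketch of cross-time-of-day label guidance (not the authors' code):
# adopt the class predicted on the aligned daytime counterpart when it is more
# confident than the nighttime prediction; leave low-confidence pixels unlabeled.
import numpy as np

def guided_pseudo_labels(night_probs, day_probs_aligned, ignore_index=255, margin=0.2):
    """night_probs, day_probs_aligned: (C, H, W) softmax maps for the same scene,
    with the daytime map warped to the nighttime view. Returns an (H, W) label map."""
    night_conf, night_lab = night_probs.max(0), night_probs.argmax(0)
    day_conf, day_lab = day_probs_aligned.max(0), day_probs_aligned.argmax(0)
    labels = np.where(day_conf > night_conf, day_lab, night_lab)
    conf = np.maximum(day_conf, night_conf)
    # require confidence above chance plus a margin, otherwise mark pixel as ignored
    labels = np.where(conf > margin + 1.0 / night_probs.shape[0], labels, ignore_index)
    return labels.astype(np.int64)

# toy usage with random softmax maps (19 classes, 4x4 pixels)
rng = np.random.default_rng(0)
night = rng.dirichlet(np.ones(19), size=(4, 4)).transpose(2, 0, 1)
day = rng.dirichlet(np.ones(19), size=(4, 4)).transpose(2, 0, 1)
print(guided_pseudo_labels(night, day).shape)  # (4, 4)
```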

    Map-Guided Curriculum Domain Adaptation and Uncertainty-Aware Evaluation for Semantic Nighttime Image Segmentation

    We address the problem of semantic nighttime image segmentation and improve the state of the art by adapting daytime models to nighttime without using nighttime annotations. Moreover, we design a new evaluation framework to address the substantial uncertainty of semantics in nighttime images. Our central contributions are: 1) a curriculum framework to gradually adapt semantic segmentation models from day to night through progressively darker times of day, exploiting cross-time-of-day correspondences between daytime images from a reference map and dark images to guide the label inference in the dark domains; 2) a novel uncertainty-aware annotation and evaluation framework and metric for semantic segmentation, including image regions beyond human recognition capability in the evaluation in a principled fashion; 3) the Dark Zurich dataset, comprising 2416 unlabeled nighttime and 2920 unlabeled twilight images with correspondences to their daytime counterparts plus a set of 201 nighttime images with fine pixel-level annotations created with our protocol, which serves as a first benchmark for our novel evaluation. Experiments show that our map-guided curriculum adaptation significantly outperforms state-of-the-art methods on nighttime sets both for standard metrics and our uncertainty-aware metric. Furthermore, our uncertainty-aware evaluation reveals that selective invalidation of predictions can improve results on data with ambiguous content such as our benchmark and can benefit safety-oriented applications involving invalid inputs.
    Comment: IEEE T-PAMI 202
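    To make the uncertainty-aware evaluation idea concrete, here is a deliberately simplified sketch under our own assumptions (it is not the benchmark's exact metric): IoU is computed only on pixels with a valid ground-truth label, and pixels annotated as invalid (beyond human recognition) count in the model's favor only when it abstains on them. The INVALID and ABSTAIN encodings are hypothetical.

```python
# Simplified, hypothetical illustration of uncertainty-aware scoring; not the
# benchmark's exact metric definition.
import numpy as np

INVALID = 255   # assumed ground-truth value for unrecognizable regions
ABSTAIN = -1    # assumed prediction value meaning "model invalidates this pixel"

def uncertainty_aware_score(pred, gt, num_classes=19):
    valid = gt != INVALID
    ious = []
    for c in range(num_classes):
        inter = np.sum((pred == c) & (gt == c) & valid)
        union = np.sum(((pred == c) | (gt == c)) & valid)
        if union > 0:
            ious.append(inter / union)
    miou = float(np.mean(ious)) if ious else 0.0
    # fraction of invalid pixels on which the model correctly refuses to predict
    abstain_acc = float(np.mean(pred[~valid] == ABSTAIN)) if (~valid).any() else 1.0
    return miou, abstain_acc

# toy usage on a 2x3 image with 3 classes
gt = np.array([[0, 1, INVALID], [1, 1, INVALID]])
pred = np.array([[0, 1, ABSTAIN], [0, 1, 2]])
print(uncertainty_aware_score(pred, gt, num_classes=3))
```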

    Switching GAN-based Image Filters to Improve Perception for Autonomous Driving

    Autonomous driving holds the potential to increase human productivity, reduce accidents caused by human error, allow better utilization of roads, reduce congestion, free up parking space, and provide many other advantages. Perception for Autonomous Vehicles (AVs) refers to the use of sensors to perceive the world, e.g. using cameras to detect and classify objects. Traffic scene understanding is a key research problem in perception for autonomous driving, and semantic segmentation is a useful method to address this problem. Adverse weather conditions are a reality that AVs must contend with. Conditions like rain, snow, and haze can drastically reduce visibility and thus affect computer vision models. Models for AV perception are currently designed for and tested on predominantly ideal weather conditions under good illumination. The most complete solution may be to train the segmentation networks on all possible adverse conditions, so a dataset for making a segmentation network robust to rain would need adequate data covering these conditions well. Moreover, labeling is an expensive task. It is particularly expensive for semantic segmentation, as each object in a scene needs to be identified and each pixel annotated with the right class. Adverse weather is therefore a challenging problem for perception models in AVs.

    This thesis explores the use of Generative Adversarial Networks (GANs) to improve semantic segmentation. We design a framework and a methodology to evaluate the proposed approach. The framework consists of an Adversity Detector and a series of denoising filters. The Adversity Detector is an image classifier that takes clear-weather or adverse-weather scenes as input and predicts whether the given image contains rain, puddles, or other conditions that can adversely affect semantic segmentation. The filters are denoising generative adversarial networks trained to remove the adverse conditions from images, translating them to the domain the segmentation network has been trained on, i.e. clear-weather images. We use the prediction from the Adversity Detector to choose which GAN filter to apply.

    The methodology we devise for evaluating our approach uses the trained filters to output sets of images on which we can then run segmentation tasks. This, we argue, is a better metric for evaluating the GANs than similarity measures such as SSIM. We also use synthetic data so we can perform a systematic evaluation of our technique. We train two kinds of GANs: one that uses paired data (Pix2Pix) and one that does not (CycleGAN). We conclude that GAN architectures that use unpaired data are not sufficiently good models for denoising, so we train the denoising filters with the paired architecture; we found them easy to train, and they show good results. While these filters do not outperform a segmentation network trained directly on adverse-weather data, training the segmentation network requires labelled data, which is expensive to collect and annotate, particularly for adverse weather and lighting conditions. We implement our proposed framework and report a 17% increase in segmentation performance over the baseline results obtained without our framework.
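    A minimal sketch of the switching idea described above, with the detector, GAN filters, and segmentation network replaced by stand-in callables (the condition names and the make_pipeline helper are illustrative assumptions, not the thesis code):

```python
# Sketch of the switching framework: an adversity detector routes each frame to the
# matching denoising filter (if any) before segmentation. All components are stand-ins.

def make_pipeline(detector, filters, segmenter):
    """detector: image -> condition label; filters: dict label -> GAN filter;
    segmenter: image -> per-pixel classes."""
    def run(image):
        condition = detector(image)
        if condition in filters:                 # adverse condition detected
            image = filters[condition](image)    # translate back to clear-weather domain
        return segmenter(image)                  # segment in the domain it was trained on
    return run

# toy usage with trivial stand-ins
pipeline = make_pipeline(
    detector=lambda img: "rain" if img.get("rainy") else "clear",
    filters={"rain": lambda img: {**img, "rainy": False}},
    segmenter=lambda img: ["road" if not img.get("rainy") else "unknown"],
)
print(pipeline({"rainy": True}))  # ['road']
```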

    SHIFT: A Synthetic Driving Dataset for Continuous Multi-Task Domain Adaptation

    Adapting to a continuously evolving environment is a safety-critical challenge inevitably faced by all autonomous driving systems. Existing image- and video-based driving datasets, however, fall short of capturing the mutable nature of the real world. In this paper, we introduce the largest multi-task synthetic dataset for autonomous driving, SHIFT. It presents discrete and continuous shifts in cloudiness, rain and fog intensity, time of day, and vehicle and pedestrian density. Featuring a comprehensive sensor suite and annotations for several mainstream perception tasks, SHIFT allows investigating how a perception system's performance degrades at increasing levels of domain shift, fostering the development of continuous adaptation strategies to mitigate this problem and enabling assessment of the robustness and generality of a model. Our dataset and benchmark toolkit are publicly available at www.vis.xyz/shift.
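    A small sketch of the kind of study such a dataset enables: sweep a fixed model over increasing levels of one shift dimension and record how accuracy degrades. The frame dictionaries, the fog_intensity attribute, and the stand-in predictor below are assumptions for illustration, not the SHIFT toolkit API.

```python
# Evaluate a stand-in predictor at increasing levels of one shift factor and
# collect an accuracy-vs-shift curve. Illustrative only; not the SHIFT toolkit.
from statistics import mean

def degradation_curve(predict, frames, levels):
    """frames: dicts with 'image', 'label', 'fog_intensity'; returns {level: accuracy}."""
    curve = {}
    for level in levels:
        subset = [f for f in frames if f["fog_intensity"] == level]
        if subset:
            curve[level] = mean(int(predict(f["image"]) == f["label"]) for f in subset)
    return curve

# toy usage: images encode their own fog level, and the stand-in predictor
# misclassifies more often as fog increases
frames = [{"image": (i, lvl), "label": 1, "fog_intensity": lvl}
          for lvl in (0.0, 0.5, 1.0) for i in range(4)]
predict = lambda img: 1 if img[0] >= img[1] * 4 - 1 else 0
print(degradation_curve(predict, frames, levels=[0.0, 0.5, 1.0]))  # accuracy drops with fog
```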

    REITS: Reflective Surface for Intelligent Transportation Systems

    Autonomous vehicles are predicted to dominate the transportation industry in the foreseeable future. Safety is one of the major challenges to the early deployment of self-driving systems. To ensure safety, self-driving vehicles must sense and detect humans, other vehicles, and road infrastructure accurately, robustly, and in a timely manner. However, existing sensing techniques used by self-driving vehicles may not be absolutely reliable. In this paper, we design REITS, a system to improve the reliability of RF-based sensing modules for autonomous vehicles. We conduct theoretical analysis of possible failures of existing RF-based sensing systems. Based on the analysis, REITS adopts a multi-antenna design, which enables constructive blind beamforming to return an enhanced radar signal in the incident direction. REITS can also let the existing radar system sense identification information by switching between a constructive beamforming state and a destructive beamforming state. Preliminary results show that REITS improves the detection distance of a self-driving car radar by a factor of 3.63.
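    The constructive/destructive switching can be illustrated with a toy narrowband model (our own simplification, not the REITS hardware design): re-radiating each antenna element with the conjugate of its incident phase returns a coherent, enhanced echo toward the radar, while flipping the phase of every other element cancels the echo, so toggling between the two states can convey identification bits.

```python
# Toy phase-conjugate (retrodirective) reflection model; a simplification for
# illustration only, not the REITS design.
import numpy as np

def reflected_amplitude(incident_phases, destructive=False):
    incident = np.exp(1j * incident_phases)      # unit-amplitude per-element incident field
    weights = np.conj(incident)                  # phase-conjugate reflection weights
    if destructive:
        # anti-phase every other element so the contributions cancel at the source
        weights = weights * (-1.0) ** np.arange(incident_phases.size)
    return abs(np.sum(incident * weights))       # field summed back at the radar

phases = np.random.default_rng(1).uniform(0, 2 * np.pi, size=8)
print(reflected_amplitude(phases))                    # ~8: coherent, enhanced echo
print(reflected_amplitude(phases, destructive=True))  # ~0: suppressed echo
```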

    Towards Synthetic Dataset Generation for Semantic Segmentation Networks

    Recent work in semantic segmentation research for autonomous vehicles has shifted towards multimodal techniques. The driving factor behind this is a lack of reliable and ample ground-truth annotation data for real-world adverse weather and lighting conditions. Human labeling of such adverse conditions is oftentimes erroneous and very expensive. However, it is a worthwhile endeavour to identify ways to make unimodal semantic segmentation networks more robust: it encourages cost reduction through reduced reliance on sensor fusion, and a more robust unimodal network can also be used within multimodal techniques for increased overall system performance. The objective of this thesis is to converge upon a synthetic dataset generation method and testing framework that is conducive to rapid validation of unimodal semantic segmentation network architectures. We explore multiple avenues of synthetic dataset generation. Insights gained through these explorations guide us towards designing the ProcSy method. ProcSy consists of a procedurally created virtual replica of a real-world operational design domain around the city of Waterloo, Ontario. Ground-truth annotations, depth, and occlusion data can be produced in real time. The ProcSy method generates repeatable scenes with quantifiable variations of adverse weather and lighting conditions. We demonstrate experiments using the ProcSy method on DeepLab v3+, a state-of-the-art network for unimodal semantic segmentation tasks. We gain insights about the behaviour of DeepLab on unseen adverse weather conditions, and based on empirical testing, we identify optimization techniques for data collection towards robustly training the network.
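    A brief sketch of the kind of controlled, repeatable sweep such a procedural pipeline enables: the same scene (fixed seed) rendered under quantified weather and lighting settings so a segmentation network can be probed one factor at a time. The parameter names and job structure below are illustrative assumptions, not the ProcSy interface.

```python
# Build a grid of render-job configurations with fixed scene seeds so each weather
# or lighting factor varies in isolation. Illustrative only; not the ProcSy interface.
from itertools import product

def build_render_jobs(scene_seeds, rain_levels, cloud_levels, sun_angles):
    jobs = []
    for seed, rain, cloud, sun in product(scene_seeds, rain_levels, cloud_levels, sun_angles):
        jobs.append({
            "scene_seed": seed,          # fixes geometry/traffic -> repeatable scene
            "rain_intensity": rain,      # quantified adverse-weather factors
            "cloud_cover": cloud,
            "sun_elevation_deg": sun,
            "outputs": ["rgb", "semantic_labels", "depth", "occlusion"],
        })
    return jobs

jobs = build_render_jobs(scene_seeds=[7, 42], rain_levels=[0.0, 0.5, 1.0],
                         cloud_levels=[0.2, 0.8], sun_angles=[15, 60])
print(len(jobs))  # 2 * 3 * 2 * 2 = 24 repeatable scene variants
```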

    Bayesian Deep Learning and Uncertainty in Computer Vision

    Visual data contains rich information about the operating environment of an intelligent robotic system. Extracting this information allows intelligent systems to reason and decide their future actions. Erroneous visual information, therefore, can lead to poor decisions, causing accidents and casualties, especially in a safety-critical application such as automated driving. One way to prevent this is by measuring the level of uncertainty in the visual information interpretation, so that the system knows the degree of reliability of the extracted information. Deep neural networks are now being used in many vision tasks due to their superior accuracy compared to traditional machine learning methods. However, their estimated uncertainties have been shown to be unreliable. To mitigate this issue, researchers have developed methods and tools to apply Bayesian modeling to deep neural networks, resulting in a class of models known as Bayesian neural networks, whose uncertainty estimates are more reliable and informative. In this thesis, we make the following contributions in the context of Bayesian neural networks applied to vision tasks:
    - We improve the understanding of visual uncertainty estimates from Bayesian deep models. Specifically, we study the behavior of Bayesian deep models applied to road-scene image segmentation under different factors, such as varying weather, depth, and occlusion levels.
    - We show the importance of model calibration techniques in the context of autonomous driving, which strengthen the reliability of the estimated uncertainty. We demonstrate their effectiveness in a simple object localization task.
    - We address the high run-time cost of current Bayesian deep learning techniques by developing a distillation technique based on the Dirichlet distribution, which allows us to estimate the uncertainties in real time.
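    As one concrete example of how such uncertainty estimates are commonly obtained, the sketch below uses Monte Carlo dropout-style sampling: several stochastic forward passes, followed by the mean prediction and its predictive entropy. The stochastic_forward stand-in replaces a real network with dropout kept active at test time; it is illustrative only and not the thesis implementation.

```python
# Monte Carlo sampling of a stochastic predictor: average the class probabilities over
# several passes and use predictive entropy as an uncertainty measure. Illustrative only.
import numpy as np

def mc_dropout_uncertainty(stochastic_forward, x, num_samples=20, eps=1e-12):
    """Returns (mean class probabilities, predictive entropy) over num_samples passes."""
    probs = np.stack([stochastic_forward(x) for _ in range(num_samples)])  # (T, C)
    mean_probs = probs.mean(axis=0)
    entropy = -np.sum(mean_probs * np.log(mean_probs + eps))
    return mean_probs, entropy

# toy stand-in: a noisy 3-class "network" whose output varies across passes
rng = np.random.default_rng(0)
stochastic_forward = lambda x: rng.dirichlet([8.0, 1.0, 1.0])
mean_probs, entropy = mc_dropout_uncertainty(stochastic_forward, x=None)
print(mean_probs.round(3), round(float(entropy), 3))  # low entropy -> confident prediction
```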