Pixel-wise Smoothing for Certified Robustness against Camera Motion Perturbations
In recent years, computer vision has made remarkable advancements in
autonomous driving and robotics. However, it has been observed that deep
learning-based visual perception models lack robustness when faced with camera
motion perturbations. The current certification process for assessing
robustness is costly and time-consuming due to the extensive number of image
projections required for Monte Carlo sampling in the 3D camera motion space. To
address these challenges, we present a novel, efficient, and practical
framework for certifying the robustness of 3D-2D projective transformations
against camera motion perturbations. Our approach leverages a smoothing
distribution over the 2D pixel space rather than the 3D physical space,
eliminating the need for costly camera motion sampling and significantly
improving the efficiency of robustness certification. With the pixel-wise
smoothed classifier, we can fully upper-bound the projection errors using
uniform partitioning of the camera motion space. Additionally,
we extend our certification framework to a more general scenario where only a
single-frame point cloud is required in the projection oracle. This is achieved
by deriving Lipschitz-based approximated partition intervals. Through extensive
experimentation, we validate the trade-off between effectiveness and efficiency
enabled by our proposed method. Remarkably, our approach achieves approximately
80% certified accuracy while utilizing only 30% of the projected image frames.
Comment: 32 pages, 5 figures, 13 tables
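To make the idea concrete, below is a minimal Python sketch of a pixel-wise smoothed classifier in the style of randomized smoothing. It is an illustration under simplifying assumptions (Gaussian pixel noise, a generic base_classifier callable, and a precomputed list of per-partition projection-error bounds), not the authors' full certification pipeline.

    import numpy as np
    from scipy.stats import norm

    def smoothed_predict(base_classifier, image, sigma=0.25, n_samples=1000, num_classes=10):
        """Monte Carlo estimate of a pixel-wise smoothed classifier.
        Noise is drawn directly in 2D pixel space, so no camera-motion sampling
        (and no extra image projections) is needed at certification time."""
        counts = np.zeros(num_classes, dtype=int)
        for _ in range(n_samples):
            noisy = image + sigma * np.random.randn(*image.shape)
            counts[base_classifier(noisy)] += 1
        top = int(np.argmax(counts))
        p_top = counts[top] / n_samples          # use a confidence lower bound in practice
        radius = sigma * norm.ppf(p_top) if p_top > 0.5 else 0.0  # certified L2 radius in pixel space
        return top, radius

    def certified_over_motion(radius, projection_error_bounds):
        """Simplified certification check: the camera-motion space is uniformly
        partitioned, and each sub-interval's worst-case projection error (measured
        in the same pixel-space norm) must fall within the certified radius."""
        return all(err <= radius for err in projection_error_bounds)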
Investigating the Impact of Multi-LiDAR Placement on Object Detection for Autonomous Driving
The past few years have witnessed an increasing interest in improving the
perception performance of LiDARs on autonomous vehicles. While most of the
existing works focus on developing new deep learning algorithms or model
architectures, we study the problem from the physical design perspective, i.e.,
how different placements of multiple LiDARs influence the learning-based
perception. To this end, we introduce an easy-to-compute information-theoretic
surrogate metric to quickly and quantitatively evaluate LiDAR placement for 3D
detection of different types of objects. We also present a new framework for
data collection, detection model training, and evaluation in the realistic CARLA
simulator to assess disparate multi-LiDAR configurations. Using several
prevalent placements inspired by the designs of self-driving companies, we show
the correlation between our surrogate metric and object detection performance
of different representative algorithms on KITTI through extensive experiments,
validating the effectiveness of our LiDAR placement evaluation approach. Our
results show that sensor placement is non-negligible in 3D point-cloud-based
object detection and can contribute up to a 10% discrepancy in average precision
in challenging 3D object detection settings. We
believe that this is one of the first studies to quantitatively investigate the
influence of LiDAR placement on perception performance. The code is available at
https://github.com/HanjiangHu/Multi-LiDAR-Placement-for-3D-Detection.
Comment: CVPR 2022 camera-ready version: 15 pages, 14 figures, 9 tables
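As a rough illustration of what an information-theoretic placement score can look like, here is a small Python sketch that scores a multi-LiDAR configuration by the expected information gained about a voxelized region of interest. The coverage map is a hypothetical precomputed input (e.g., from ray-casting the configuration's beams against a voxel grid in CARLA); this shows the general flavor, not the paper's exact surrogate metric.

    import numpy as np

    def placement_information_gain(coverage_prob, prior_occupancy=0.5, eps=1e-6):
        """Toy information-theoretic score for a candidate multi-LiDAR placement.
        coverage_prob[i] is the probability that the configuration's beams observe
        voxel i of the region of interest (hypothetical, precomputed by ray-casting).
        Observing a voxel resolves its binary occupancy, so the expected information
        gain is the prior occupancy entropy weighted by the coverage probabilities."""
        p = float(np.clip(prior_occupancy, eps, 1.0 - eps))
        prior_entropy = -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))  # bits per voxel
        return float(np.asarray(coverage_prob, dtype=float).sum() * prior_entropy)

    # Illustrative comparison of two candidate placements (random coverage maps);
    # the paper correlates such a surrogate with detection average precision.
    rng = np.random.default_rng(0)
    score_a = placement_information_gain(rng.uniform(0.0, 1.0, size=10_000))
    score_b = placement_information_gain(rng.uniform(0.0, 0.5, size=10_000))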
SeasonDepth: Cross-Season Monocular Depth Prediction Dataset and Benchmark under Multiple Environments
Different environments pose a great challenge to robust outdoor visual
perception for long-term autonomous driving, and the generalization of
learning-based algorithms to different environmental effects remains an open
problem. Although monocular depth prediction has been well studied recently,
little work focuses on robust learning-based depth prediction across different
environments, e.g., changing illumination and seasons, owing to the lack of such
a multi-environment real-world dataset and benchmark. To this end, we build
SeasonDepth, the first cross-season monocular depth prediction dataset and
benchmark, based on the CMU Visual Localization dataset. To benchmark depth
estimation performance under different environments, we investigate
representative and recent state-of-the-art open-source supervised,
self-supervised, and domain-adaptation depth prediction methods from the KITTI
benchmark using several newly formulated metrics. Through extensive
experimental evaluation on the proposed dataset, we analyze the influence of
multiple environments on performance and robustness qualitatively and
quantitatively, showing that long-term monocular depth prediction is still
challenging even with fine-tuning. We further identify promising avenues,
finding that self-supervised training and stereo geometry constraints help
enhance robustness to changing environments. The dataset is available at
https://seasondepth.github.io, and the benchmark toolkit is available at
https://github.com/SeasonDepth/SeasonDepth.
Comment: 19 pages, 13 figures
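The benchmarking protocol boils down to computing depth error metrics per environment and then summarizing both accuracy and stability across environments. The Python sketch below shows that pattern with a median-scaled absolute relative error; the metric choice, scaling protocol, and numbers are illustrative assumptions rather than the benchmark's exact definitions.

    import numpy as np

    def abs_rel(pred, gt, mask):
        """Absolute relative error after median scaling, a common protocol when
        predicted depth is only defined up to scale."""
        pred, gt = pred[mask], gt[mask]
        pred = pred * (np.median(gt) / np.median(pred))
        return float(np.mean(np.abs(pred - gt) / gt))

    def cross_environment_summary(per_env_scores):
        """Summarize robustness across environment slices (seasons, illumination):
        the mean reflects overall accuracy, the spread reflects how sensitive the
        model is to environmental change."""
        scores = np.array(list(per_env_scores.values()), dtype=float)
        return {"mean": float(scores.mean()), "std": float(scores.std())}

    # Hypothetical per-environment AbsRel values for one model:
    summary = cross_environment_summary({"sunny": 0.18, "overcast": 0.21, "snow": 0.27})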
Retrieval-based Localization Based on Domain-invariant Feature Learning under Changing Environments
Visual localization is a crucial problem in mobile robotics and autonomous
driving. One solution is to retrieve images with known pose from a database for
the localization of query images. However, in environments with drastically
varying conditions (e.g. illumination changes, seasons, occlusion, dynamic
objects), retrieval-based localization is severely hampered and becomes a
challenging problem. In this paper, a novel domain-invariant feature learning
method (DIFL) is proposed based on ComboGAN, a multi-domain image translation
network architecture. By introducing a feature consistency loss (FCL) between
the encoded features of the original image and translated image in another
domain, we are able to train the encoders to generate domain-invariant features
in a self-supervised manner. To retrieve a target image from the database, the
query image is first encoded using the encoder belonging to the query domain to
obtain a domain-invariant feature vector. We then perform retrieval by
selecting the database image with the most similar domain-invariant feature
vector. We validate the proposed approach on the CMU-Seasons dataset, where we
outperform state-of-the-art learning-based descriptors in retrieval-based
localization for high- and medium-precision scenarios.
Comment: Accepted by the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2019)
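A minimal PyTorch-style sketch of the two ingredients described above follows: a feature consistency loss between the encoding of an image and the encoding of its translation into another domain, and retrieval by nearest domain-invariant feature. The encoder/generator handles and the choice of L1 distance and cosine similarity are assumptions for illustration; the paper's ComboGAN-based architecture and exact losses may differ.

    import torch
    import torch.nn.functional as F

    def feature_consistency_loss(enc_src, enc_tgt, gen_src_to_tgt, image_src):
        """FCL sketch: encode the original image and its translation into another
        domain, then penalize the distance between the two latent features so the
        encoders learn domain-invariant representations (self-supervised)."""
        feat_original = enc_src(image_src)
        feat_translated = enc_tgt(gen_src_to_tgt(image_src))
        return F.l1_loss(feat_translated, feat_original.detach())

    def retrieve(query_feat, db_feats):
        """Return the index of the database image whose domain-invariant feature is
        most similar to the query feature; its known pose localizes the query."""
        sims = F.cosine_similarity(query_feat.flatten().unsqueeze(0),
                                   db_feats.flatten(1), dim=1)
        return int(torch.argmax(sims))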
Influence of Camera-LiDAR Configuration on 3D Object Detection for Autonomous Driving
Cameras and LiDARs are both important sensors for autonomous driving, playing
critical roles in 3D object detection. Camera-LiDAR fusion has been a prevalent
solution for robust and accurate driving perception. In contrast to the vast
majority of existing works, which focus on improving the performance of 3D
object detection through cross-modal schemes, deep learning algorithms, and
training tricks, we devote attention to the impact of sensor configurations on
the performance of learning-based methods. To achieve this, we propose a
unified information-theoretic surrogate metric for camera and LiDAR evaluation
based on the proposed sensor perception model. We also design an accelerated
high-quality framework for data acquisition, model training, and performance
evaluation that functions with the CARLA simulator. To show the correlation
between detection performance and our surrogate metrics, we conduct experiments
using several camera-LiDAR placements and parameters inspired by self-driving
companies and research institutions. Extensive experimental results of
representative algorithms on the nuScenes dataset validate the effectiveness of
our surrogate metric, demonstrating that sensor configurations significantly
impact point-cloud-image fusion-based detection models and can contribute up to
a 30% discrepancy in average precision.
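To illustrate what "sensor configuration" means in practice for such a framework, here is a small sketch using the CARLA Python API to attach one camera and one LiDAR to an ego vehicle with explicit mounting poses and parameters. The attribute values and mounting positions are illustrative placeholders, not the configurations evaluated in the paper.

    import carla

    client = carla.Client("localhost", 2000)
    world = client.get_world()
    bp_lib = world.get_blueprint_library()

    # Spawn an ego vehicle at the first available spawn point.
    vehicle_bp = bp_lib.filter("vehicle.*")[0]
    vehicle = world.spawn_actor(vehicle_bp, world.get_map().get_spawn_points()[0])

    # Camera configuration: field of view and mounting pose are placement choices.
    cam_bp = bp_lib.find("sensor.camera.rgb")
    cam_bp.set_attribute("fov", "90")
    cam_tf = carla.Transform(carla.Location(x=1.5, z=1.8))
    camera = world.spawn_actor(cam_bp, cam_tf, attach_to=vehicle)

    # LiDAR configuration: channels, range, and mounting height are placement choices.
    lidar_bp = bp_lib.find("sensor.lidar.ray_cast")
    lidar_bp.set_attribute("channels", "32")
    lidar_bp.set_attribute("range", "100")
    lidar_bp.set_attribute("rotation_frequency", "20")
    lidar_tf = carla.Transform(carla.Location(x=0.0, z=2.4))
    lidar = world.spawn_actor(lidar_bp, lidar_tf, attach_to=vehicle)

    # Listen for data; a collection framework would record synchronized frames for training.
    camera.listen(lambda image: image.save_to_disk("out/%06d.png" % image.frame))
    lidar.listen(lambda scan: scan.save_to_disk("out/%06d.ply" % scan.frame))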
RoboDepth: Robust Out-of-Distribution Depth Estimation under Corruptions
Depth estimation from monocular images is pivotal for real-world visual
perception systems. While current learning-based depth estimation models train
and test on meticulously curated data, they often overlook out-of-distribution
(OoD) situations. Yet, in practical settings -- especially safety-critical ones
like autonomous driving -- common corruptions can arise. Addressing this
oversight, we introduce a comprehensive robustness test suite, RoboDepth,
encompassing 18 corruptions spanning three categories: i) weather and lighting
conditions; ii) sensor failures and movement; and iii) data processing
anomalies. We subsequently benchmark 42 depth estimation models across indoor
and outdoor scenes to assess their resilience to these corruptions. Our
findings underscore that, in the absence of a dedicated robustness evaluation
framework, many leading depth estimation models may be susceptible to typical
corruptions. We delve into design considerations for crafting more robust depth
estimation models, touching upon pre-training, augmentation, modality, model
capacity, and learning paradigms. We anticipate our benchmark will establish a
foundational platform for advancing robust OoD depth estimation.
Comment: NeurIPS 2023; 45 pages, 25 figures, 13 tables; Code at https://github.com/ldkong1205/RoboDepth
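The evaluation pattern behind such a corruption benchmark is simple to sketch: perturb test images with a corruption at several severity levels, re-run the depth model, and report degradation relative to clean performance. The snippet below shows the pattern with one illustrative corruption in plain NumPy; the 18 RoboDepth corruptions and the benchmark's exact robustness scores are defined by its own code (see the repository above).

    import numpy as np

    def gaussian_noise(image, severity=3):
        """One illustrative corruption (the benchmark covers 18 across weather and
        lighting, sensor failure and movement, and data processing). The input
        image is assumed to be float in [0, 1]."""
        sigma = [0.04, 0.06, 0.08, 0.10, 0.12][severity - 1]
        return np.clip(image + np.random.normal(0.0, sigma, image.shape), 0.0, 1.0)

    def relative_degradation(err_clean, err_corrupted):
        """Relative increase of an error metric (e.g., AbsRel) under corruption:
        0 means no degradation; larger values mean a less robust model."""
        return (err_corrupted - err_clean) / max(err_clean, 1e-8)

    # Usage pattern with a hypothetical depth_model and abs_rel metric:
    #   err_clean = abs_rel(depth_model(img), gt)
    #   err_corr = np.mean([abs_rel(depth_model(gaussian_noise(img, s)), gt)
    #                       for s in range(1, 6)])
    #   score = relative_degradation(err_clean, err_corr)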
SafeBench: A Benchmarking Platform for Safety Evaluation of Autonomous Vehicles
As shown by recent studies, machine intelligence-enabled systems are
vulnerable to test cases resulting from either adversarial manipulation or
natural distribution shifts. This has raised great concerns about deploying
machine learning algorithms for real-world applications, especially in the
safety-critical domains such as autonomous driving (AD). On the other hand,
traditional AD testing on naturalistic scenarios requires hundreds of millions
of driving miles due to the high dimensionality and rarity of safety-critical
scenarios in the real world. As a result, several approaches for autonomous
driving evaluation have been explored, which, however, are usually based on
different simulation platforms, types of safety-critical scenarios, scenario
generation algorithms, and driving route variations. Thus,
despite a large amount of effort in autonomous driving testing, it is still
challenging to compare and understand the effectiveness and efficiency of
different testing scenario generation algorithms and testing mechanisms under
similar conditions. In this paper, we aim to provide SafeBench, the first
unified platform that integrates different types of safety-critical testing
scenarios, scenario generation algorithms, and other variations such as driving routes and
environments. Meanwhile, we implement 4 deep reinforcement learning-based AD
algorithms with 4 types of input (e.g., bird's-eye view, camera) to perform
fair comparisons on SafeBench. We find our generated testing scenarios are
indeed more challenging and observe the trade-off between the performance of AD
agents under benign and safety-critical testing scenarios. We believe our
unified platform SafeBench for large-scale and effective autonomous driving
testing will motivate the development of new testing scenario generation and
safe AD algorithms. SafeBench is available at https://safebench.github.io
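As a closing illustration of the evaluation pattern the platform enables, the sketch below contrasts an agent's average score on benign routes with its score on generated safety-critical scenarios. The runner, agent, and scenario objects are hypothetical stand-ins, not SafeBench's actual interfaces.

    def average_score(agent, scenarios, runner):
        """Run an AD agent over a set of scenarios and return its mean score
        (e.g., a weighted mix of collision rate, route completion, and rule
        violations). All interfaces here are hypothetical stand-ins."""
        scores = [runner.run(agent, scenario) for scenario in scenarios]
        return sum(scores) / max(len(scores), 1)

    def benign_vs_critical_gap(agent, benign_scenarios, critical_scenarios, runner):
        """Quantify the trade-off noted above: agents that perform well on benign
        routes can degrade sharply on generated safety-critical scenarios."""
        benign = average_score(agent, benign_scenarios, runner)
        critical = average_score(agent, critical_scenarios, runner)
        return benign, critical, benign - critical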