Benchmarking the Robustness of Panoptic Segmentation for Automated Driving
Precise situational awareness is required for the safe decision-making of
assisted and automated driving (AAD) functions. Panoptic segmentation is a
promising perception technique to identify and categorise objects, impending
hazards, and driveable space at a pixel level. While segmentation quality is
generally associated with the quality of the camera data, a comprehensive
understanding and modelling of this relationship are paramount for AAD system
designers. Motivated by such a need, this work proposes a unifying pipeline to
assess the robustness of panoptic segmentation models for AAD, correlating it
with traditional image quality. The first step of the proposed pipeline
involves generating degraded camera data that reflects real-world noise
factors. To this end, 19 noise factors have been identified and implemented
with 3 severity levels. Of these factors, this work proposes novel models for
unfavourable light and snow. After applying the degradation models, three
state-of-the-art CNN- and vision transformer (ViT)-based panoptic segmentation
networks are used to analyse their robustness. The variations of the
segmentation performance are then correlated to 8 selected image quality
metrics. This research reveals that: 1) two noise factors, droplets on the lens
and Gaussian noise, have the highest impact on panoptic segmentation; 2) the
ViT-based panoptic segmentation backbones show better robustness to the
considered noise factors; 3) some image quality metrics (i.e. LPIPS and
CW-SSIM) correlate strongly with panoptic segmentation performance and can
therefore be used as predictive metrics for network performance.
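
The degrade-then-correlate idea described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the three `sigmas` values are hypothetical severity levels, PSNR stands in for an image quality metric (it is not stated to be one of the 8 metrics used in the work), and `pearson` is a plain Pearson correlation rather than the authors' analysis. Segmentation scores are deliberately not included; they would have to come from actually running a network on the degraded images.

```python
import numpy as np


def add_gaussian_noise(image, sigma, rng):
    """Return a copy of a uint8 image with zero-mean Gaussian noise added."""
    noisy = image.astype(np.float64) + rng.normal(0.0, sigma, image.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)


def psnr(reference, degraded):
    """Peak signal-to-noise ratio in dB for 8-bit images (stand-in quality metric)."""
    diff = reference.astype(np.float64) - degraded.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(255.0 ** 2 / mse))


def pearson(x, y):
    """Pearson correlation between two equally long sequences of scores."""
    return float(np.corrcoef(x, y)[0, 1])


rng = np.random.default_rng(0)
# Synthetic stand-in for a camera frame; a real study would load driving images.
clean = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)

# Hypothetical noise standard deviations for three severity levels.
sigmas = [5.0, 15.0, 30.0]
psnr_per_level = [psnr(clean, add_gaussian_noise(clean, s, rng)) for s in sigmas]
```

Given one measured segmentation score per severity level, `pearson(psnr_per_level, scores)` would indicate how closely the quality metric tracks network performance for this noise factor.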
Dynamic Objects Segmentation for Visual Localization in Urban Environments
Visual localization and mapping is a crucial capability to address many
challenges in mobile robotics. It constitutes a robust, accurate and
cost-effective approach for local and global pose estimation within prior maps.
Yet, in highly dynamic environments, like crowded city streets, problems arise
as major parts of the image can be covered by dynamic objects. Consequently,
visual odometry pipelines often diverge and the localization systems
malfunction as detected features are not consistent with the precomputed 3D
model. In this work, we present an approach to automatically detect dynamic
object instances to improve the robustness of vision-based localization and
mapping in crowded environments. By training a convolutional neural network
model with a combination of synthetic and real-world data, dynamic object
instance masks are learned in a semi-supervised way. The real-world data can be
collected with a standard camera and requires minimal further post-processing.
Our experiments show that a wide range of dynamic objects can be reliably
detected using the presented method. Promising performance is demonstrated on
our own and on publicly available datasets, which also shows the
generalization capabilities of this approach.
Comment: 4 pages, submitted to the IROS 2018 Workshop "From Freezing to
Jostling Robots: Current Challenges and New Paradigms for Safe Robot
Navigation in Dense Crowds"
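
One way dynamic object masks could feed into a localization pipeline is by discarding image features that fall on masked pixels before matching against the prior map. The abstract does not state how the masks are consumed, so the sketch below is an assumption, and `filter_static_keypoints` is a hypothetical helper, not part of the described method.

```python
import numpy as np


def filter_static_keypoints(keypoints, dynamic_mask):
    """Keep only keypoints that do not lie on pixels marked as dynamic.

    keypoints: array-like of (x, y) pixel coordinates, shape (N, 2).
    dynamic_mask: 2D boolean array of shape (height, width); True marks a
        pixel that belongs to a dynamic object instance.
    """
    kp = np.asarray(keypoints, dtype=np.float64).reshape(-1, 2)
    height, width = dynamic_mask.shape
    # Round to the nearest pixel and clamp to the image bounds.
    cols = np.clip(np.round(kp[:, 0]).astype(int), 0, width - 1)
    rows = np.clip(np.round(kp[:, 1]).astype(int), 0, height - 1)
    keep = ~dynamic_mask[rows, cols].astype(bool)
    return kp[keep]


# Toy example: a 4x4 image in which the pixel at x=2, y=1 is dynamic.
mask = np.zeros((4, 4), dtype=bool)
mask[1, 2] = True
static_kp = filter_static_keypoints([(2, 1), (0, 0), (3, 3)], mask)
```

Here the keypoint at (2, 1) is dropped and the other two are kept; in practice the mask would be the union of the predicted instance masks for a frame.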