2,252 research outputs found
Did You Miss the Sign? A False Negative Alarm System for Traffic Sign Detectors
Object detection is an integral part of an autonomous vehicle for its
safety-critical and navigational purposes. Traffic signs as objects play a
vital role in guiding such systems. However, if the vehicle fails to locate a
critical sign, the consequences could be catastrophic. In this paper, we propose
an approach to identify traffic signs that have been mistakenly discarded by
the object detector. The proposed method raises an alarm when it discovers a
failure by the object detector to detect a traffic sign. This approach can be
useful to evaluate the performance of the detector during the deployment phase.
We trained a single shot multi-box object detector to detect traffic signs and
used its internal features to train a separate false negative detector (FND).
During deployment, the FND decides whether the traffic sign detector (TSD) has
missed a sign. We use precision and recall to measure the accuracy
of the FND on two different datasets. At 80% recall, the FND achieves 89.9%
precision on the Belgium Traffic Sign Detection dataset and 90.8% precision on
the German Traffic Sign Recognition Benchmark dataset. To the best of
our knowledge, our method is the first to tackle this critical aspect of false
negative detection in robotic vision. Such a fail-safe mechanism for object
detection can improve the engagement of robotic vision systems in our daily
life.
Comment: Submitted to the 2019 IEEE/RSJ International Conference on
Intelligent Robots and Systems (IROS 2019)
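
To illustrate the metric quoted in the abstract above (precision at a fixed recall), here is a minimal sketch. It is not the authors' evaluation code. The function name `precision_at_recall`, the score convention (higher score means "a sign was probably missed"), and the toy data are assumptions made for illustration, and scores are assumed to be distinct.

```python
from typing import List, Tuple


def precision_at_recall(scores: List[float], labels: List[int],
                        target_recall: float) -> Tuple[float, float]:
    """Return (precision, threshold) at the highest threshold reaching target_recall.

    labels: 1 = the detector really missed a sign, 0 = no miss (assumed convention).
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    # Sort by descending score and lower the threshold one sample at a time.
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    tp = fp = 0
    for score, label in ranked:
        if label == 1:
            tp += 1
        else:
            fp += 1
        if tp / total_pos >= target_recall:
            return tp / (tp + fp), score
    raise ValueError("target recall is not reachable")


if __name__ == "__main__":
    # Hypothetical alarm scores and ground-truth labels.
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
    toy_labels = [1, 1, 0, 1, 1, 0]
    print(precision_at_recall(toy_scores, toy_labels, 0.75))
```

Fixing recall and reporting precision, as the abstract does, makes two operating points comparable across datasets: both numbers describe how many alarms are correct when the same fraction of true misses is caught.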
VIENA2: A Driving Anticipation Dataset
Action anticipation is critical in scenarios where one needs to react before
the action is finalized. This is, for instance, the case in automated driving,
where a car needs to, e.g., avoid hitting pedestrians and respect traffic
lights. While solutions have been proposed to tackle subsets of the driving
anticipation tasks, by making use of diverse, task-specific sensors, there is
no single dataset or framework that addresses them all in a consistent manner.
In this paper, we therefore introduce a new, large-scale dataset, called
VIENA2, covering 5 generic driving scenarios, with a total of 25 distinct
action classes. It contains more than 15K full-HD, 5-second videos acquired under
various driving conditions, weather, times of day, and environments, complemented
with a common and realistic set of sensor measurements. This amounts to more
than 2.25M frames, each annotated with an action label, corresponding to 600
samples per action class. We discuss our data acquisition strategy and the
statistics of our dataset, and benchmark state-of-the-art action anticipation
techniques, including a new multi-modal LSTM architecture with an effective
loss function for action anticipation in driving scenarios.
Comment: Accepted in ACCV 201
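
The dataset figures quoted in the abstract above can be cross-checked with a short calculation. This is only a sketch: the frame rate of 30 fps is an assumption (the abstract does not state it), and reading "samples" as videos is an interpretation. The abstract gives the video and frame counts as lower bounds ("more than").

```python
NUM_VIDEOS = 15_000   # "more than 15K" videos, used here as a lower bound
CLIP_SECONDS = 5      # each video is 5 s long
NUM_CLASSES = 25      # distinct action classes
ASSUMED_FPS = 30      # assumption: not stated in the abstract


def total_frames(num_videos: int, clip_seconds: int, fps: int) -> int:
    """Number of frames across all clips at a constant frame rate."""
    return num_videos * clip_seconds * fps


def videos_per_class(num_videos: int, num_classes: int) -> int:
    """Videos per class if the classes are perfectly balanced."""
    return num_videos // num_classes


if __name__ == "__main__":
    print(total_frames(NUM_VIDEOS, CLIP_SECONDS, ASSUMED_FPS))  # 2250000
    print(videos_per_class(NUM_VIDEOS, NUM_CLASSES))            # 600
```

Under these assumptions the numbers agree with the abstract: 2.25M frames in total and 600 samples per action class.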
Robustness of multimodal 3D object detection using deep learning approach for autonomous vehicles
In this thesis, we study the robustness of a multimodal 3D object detection model in the context of autonomous vehicles. Self-driving cars need to accurately detect and localize pedestrians and other vehicles in their surrounding 3D environment to drive safely on the roads. Robustness is one of the most critical aspects of an algorithm in the 3D perception problem for self-driving cars. Therefore, in this work, we propose a method to evaluate the robustness of a 3D object detector. To this end, we trained a representative multimodal 3D object detector on three different datasets. We then evaluated the trained models on datasets that we constructed specifically to assess their robustness in diverse weather and lighting conditions. Our method uses two different approaches to build these evaluation datasets. In the first, we used artificially corrupted images; in the second, we used real images captured in diverse weather and lighting conditions. To detect objects such as cars and pedestrians in traffic scenes, the multimodal model relies on images and 3D point clouds. Multimodal approaches to 3D object detection exploit different sensors, such as cameras and range detectors, to detect the objects of interest in the surrounding environment. We leveraged three well-known datasets in the domain of autonomous driving, namely KITTI, nuScenes, and Waymo. We conducted extensive experiments to investigate the proposed method for evaluating the model's robustness and provide quantitative and qualitative results. We observed that our proposed method can measure the robustness of the model effectively.
UA-DETRAC: A New Benchmark and Protocol for Multi-Object Detection and Tracking
In recent years, numerous effective multi-object tracking (MOT) methods have been
developed because of the wide range of applications. Existing performance
evaluations of MOT methods usually separate the object tracking step from the
object detection step by using the same fixed object detection results for
comparisons. In this work, we perform a comprehensive quantitative study on the
effects of object detection accuracy on the overall MOT performance, using the
new large-scale University at Albany DETection and tRACking (UA-DETRAC)
benchmark dataset. The UA-DETRAC benchmark dataset consists of 100 challenging
video sequences captured from real-world traffic scenes (over 140,000 frames
with rich annotations, including occlusion, weather, vehicle category,
truncation, and vehicle bounding boxes) for object detection, object tracking
and MOT systems. We evaluate complete MOT systems constructed from combinations
of state-of-the-art object detection and object tracking methods. Our analysis
shows the complex effects of object detection accuracy on MOT system
performance. Based on these observations, we propose new evaluation tools and
metrics for MOT systems that consider both object detection and object tracking
for comprehensive analysis.
Comment: 18 pages, 11 figures, accepted by CVI
Benchmarking Robustness of AI-enabled Multi-sensor Fusion Systems: Challenges and Opportunities
Multi-Sensor Fusion (MSF) based perception systems have been the foundation
in supporting many industrial applications and domains, such as self-driving
cars, robotic arms, and unmanned aerial vehicles. Over the past few years,
rapid progress in data-driven artificial intelligence (AI) has led to a
growing trend of empowering MSF systems with deep learning techniques to
further improve performance, especially in intelligent systems and their
perception systems. Although quite a few AI-enabled MSF perception systems and
techniques have been proposed, few benchmarks that focus on MSF perception
are publicly available to date. Given that many intelligent systems
such as self-driving cars are operated in safety-critical contexts where
perception systems play an important role, there is an urgent need for a
more in-depth understanding of the performance and reliability of these MSF
systems. To bridge this gap, we take an early step in this direction and
construct a public benchmark of AI-enabled MSF-based perception systems
including three commonly adopted tasks (i.e., object detection, object
tracking, and depth completion). Based on this, to comprehensively understand
MSF systems' robustness and reliability, we design 14 common and realistic
corruption patterns to synthesize large-scale corrupted datasets. We further
perform a systematic, large-scale evaluation of these systems. Our results
reveal the vulnerability of the current AI-enabled MSF
perception systems, calling for researchers and practitioners to take
robustness and reliability into account when designing AI-enabled MSF.
Comment: Accepted by ESEC/FSE 202
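
As a rough illustration of what synthesizing a corrupted dataset can look like, the sketch below applies two generic corruptions to a small grayscale image. These are not the 14 corruption patterns designed in the paper, which the abstract does not list. The function names, the nested-list image representation, and the parameter values are all assumptions chosen for illustration.

```python
import random
from typing import List

Image = List[List[int]]  # grayscale image, pixel values in 0..255 (assumed format)


def _clip(value: float) -> int:
    """Round to the nearest integer and clamp to the valid pixel range."""
    return min(255, max(0, round(value)))


def add_gaussian_noise(image: Image, sigma: float, seed: int = 0) -> Image:
    """Return a copy with zero-mean Gaussian noise added to every pixel."""
    rng = random.Random(seed)  # seeded so the corruption is reproducible
    return [[_clip(p + rng.gauss(0.0, sigma)) for p in row] for row in image]


def scale_brightness(image: Image, factor: float) -> Image:
    """Return a copy with every pixel multiplied by factor (below 1 darkens)."""
    return [[_clip(p * factor) for p in row] for row in image]


if __name__ == "__main__":
    clean = [[100, 200], [0, 255]]
    print(add_gaussian_noise(clean, sigma=10.0))
    print(scale_brightness(clean, factor=0.5))
```

Applying each corruption at several severity levels to a clean test set, then comparing task metrics between clean and corrupted inputs, is one common way to quantify robustness.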