Procedural Modeling and Physically Based Rendering for Synthetic Data Generation in Automotive Applications
We present an overview and evaluation of a new, systematic approach for
generation of highly realistic, annotated synthetic data for training of deep
neural networks in computer vision tasks. The main contribution is a procedural
world modeling approach enabling high variability coupled with physically
accurate image synthesis, and is a departure from the hand-modeled virtual
worlds and approximate image synthesis methods used in real-time applications.
The benefits of our approach include flexible, physically accurate and scalable
image synthesis, implicit wide coverage of classes and features, and complete
data introspection for annotations, which all contribute to quality and cost
efficiency. To evaluate our approach and the efficacy of the resulting data, we
use semantic segmentation for autonomous vehicles and robotic navigation as the
main application, and we train multiple deep learning architectures using
synthetic data with and without fine-tuning on organic (i.e. real-world) data.
The evaluation shows that our approach improves the neural networks'
performance and that even modest implementation efforts produce
state-of-the-art results.
Comment: The project web page at http://vcl.itn.liu.se/publications/2017/TKWU17/ contains a version of the paper with high-resolution images as well as additional material.
The Cityscapes Dataset for Semantic Urban Scene Understanding
Visual understanding of complex urban street scenes is an enabling factor for
a wide range of applications. Object detection has benefited enormously from
large-scale datasets, especially in the context of deep learning. For semantic
urban scene understanding, however, no current dataset adequately captures the
complexity of real-world urban scenes.
To address this, we introduce Cityscapes, a benchmark suite and large-scale
dataset to train and test approaches for pixel-level and instance-level
semantic labeling. Cityscapes comprises a large, diverse set of stereo
video sequences recorded in streets from 50 different cities. 5000 of these
images have high quality pixel-level annotations; 20000 additional images have
coarse annotations to enable methods that leverage large volumes of
weakly-labeled data. Crucially, our effort exceeds previous attempts in terms
of dataset size, annotation richness, scene variability, and complexity. Our
accompanying empirical study provides an in-depth analysis of the dataset
characteristics, as well as a performance evaluation of several
state-of-the-art approaches based on our benchmark.
Comment: Includes supplemental material.
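For reference, the standard pixel-level semantic labeling metric reported by such benchmarks is per-class intersection-over-union (IoU), averaged over classes. A minimal numpy sketch (the function name is ours, not from the dataset's own tooling):

```python
import numpy as np

def class_iou(pred, gt, cls):
    """Intersection-over-union for one class between predicted and
    ground-truth label maps; benchmarks like Cityscapes report the
    mean of this quantity over all classes."""
    p, g = (pred == cls), (gt == cls)
    union = np.logical_or(p, g).sum()
    if union == 0:
        return float('nan')   # class absent from both maps
    return np.logical_and(p, g).sum() / union

# 2x2 label maps: class 1 predicted at three pixels, true at two.
pred = np.array([[0, 1], [1, 1]])
gt   = np.array([[0, 1], [1, 0]])
iou = class_iou(pred, gt, 1)   # intersection 2 / union 3
```

Averaging this over all annotated classes gives the mean IoU used to rank approaches.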
Fast and robust road sign detection in driver assistance systems
© 2018, Springer Science+Business Media, LLC, part of Springer Nature. Road sign detection plays a critical role in automatic driver assistance systems. Road signs possess a number of unique visual qualities in images due to their specific colors and symmetric shapes. In this paper, road signs are detected by a two-level hierarchical framework that considers both the color and shape of the signs. To address the problem of low image contrast, we propose a new color visual saliency segmentation algorithm, which uses the ratios of enhanced and normalized color values to capture color information. To improve computational efficiency and reduce the false-alarm rate, we modify the fast radial symmetry transform (RST) algorithm and propose an edge pairwise voting scheme to group feature points based on their underlying symmetry in the candidate regions. Experimental results on several benchmarking datasets demonstrate the superiority of our method over state-of-the-art methods in both efficiency and robustness.
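The color-ratio idea behind such saliency segmentation can be illustrated with a toy version: score each pixel by the share of one channel in the total intensity, then threshold. The paper's actual enhancement and normalization steps differ; this hypothetical `red_saliency` function and its threshold are ours, for illustration only.

```python
import numpy as np

def red_saliency(img, thresh=0.5):
    """Toy red-sign saliency: ratio of the red channel to the sum of
    all channels, thresholded to a binary candidate mask.
    img: H x W x 3 float array with values in [0, 1]."""
    eps = 1e-6
    total = img.sum(axis=2) + eps      # per-pixel channel sum
    ratio = img[..., 0] / total        # normalized red ratio
    return ratio > thresh              # candidate sign pixels

# Tiny 2x2 "image": one strongly red pixel, three grey ones.
img = np.array([[[0.9, 0.1, 0.1], [0.3, 0.3, 0.3]],
                [[0.3, 0.3, 0.3], [0.3, 0.3, 0.3]]])
mask = red_saliency(img)
```

Because the score is a ratio rather than an absolute value, a dim but still predominantly red pixel scores the same as a bright one, which is the point of ratio-based methods under low contrast.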
Detection and Recognition of Traffic Signs Inside the Attentional Visual Field of Drivers
Traffic sign detection and recognition systems are essential components of Advanced Driver Assistance Systems and self-driving vehicles. In this contribution we present a vision-based framework which detects and recognizes traffic signs inside the attentional visual field of drivers. This technique takes advantage of the driver's 3D absolute gaze point obtained through the combined use of a front-view stereo imaging system and a non-contact 3D gaze tracker. We used a linear Support Vector Machine as a classifier and a Histogram of Oriented Gradients as features for detection. Recognition is performed by using Scale Invariant Feature Transforms and color information. Our technique detects and recognizes signs which are in the field of view of the driver and also provides an indication when one or more signs have been missed by the driver.
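The HOG-plus-linear-SVM detector mentioned above rests on histograms of gradient orientations. A minimal single-cell version in numpy (real HOG adds a grid of cells and block normalization; the function name is illustrative, not from the paper):

```python
import numpy as np

def orientation_histogram(patch, bins=9):
    """Toy HOG-style descriptor for one cell: a histogram of unsigned
    gradient orientations, weighted by gradient magnitude and
    L2-normalized."""
    gy, gx = np.gradient(patch.astype(float))     # row- and column-wise gradients
    mag = np.hypot(gx, gy)                        # gradient magnitude
    ang = np.mod(np.arctan2(gy, gx), np.pi)       # unsigned orientation in [0, pi)
    idx = np.minimum((ang / np.pi * bins).astype(int), bins - 1)
    hist = np.bincount(idx.ravel(), weights=mag.ravel(), minlength=bins)
    n = np.linalg.norm(hist)
    return hist / n if n > 0 else hist

# Horizontal intensity ramp: all gradients point along x, so all the
# mass should land in the first orientation bin.
patch = np.tile(np.arange(8.0), (8, 1))
h = orientation_histogram(patch)
```

Descriptors like this, concatenated over a window, are what the linear SVM scores during detection.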
An optimization approach for localization refinement of candidate traffic signs
We propose a localization refinement approach for
candidate traffic signs. Previous traffic sign localization approaches,
which place a bounding rectangle around the sign, do
not always give a compact bounding box, making the subsequent
classification task more difficult. We formulate localization
as a segmentation problem, and incorporate prior knowledge
concerning color and shape of traffic signs. To evaluate the
effectiveness of our approach, we use it as an intermediate step
between a standard traffic sign localizer and a classifier. Our
experiments use the well-known GTSDB benchmark as well as
our new CTSDB (Chinese Traffic Sign Detection Benchmark).
This newly created benchmark is publicly available, and goes
beyond previous benchmark datasets: it has over 5,000 high-resolution
images containing more than 14,000 traffic signs
taken in realistic driving conditions. Experimental results show
that our localization approach significantly improves bounding
boxes when compared to a standard localizer, thereby allowing
a standard traffic sign classifier to generate more accurate
classification results.
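The effect of such refinement can be shown with a toy stand-in: given a segmentation mask produced from color and shape priors, shrink the loose candidate box to the tightest box around the segmented pixels. The paper's actual method solves an optimization problem; this sketch and its names are ours.

```python
import numpy as np

def tighten_box(mask):
    """Shrink a candidate region to the tightest axis-aligned box
    around segmented sign pixels.
    mask: boolean H x W array, True = sign pixel.
    Returns (row0, col0, row1, col1), end-exclusive, or None if the
    mask is empty."""
    rows = np.flatnonzero(mask.any(axis=1))   # rows containing sign pixels
    cols = np.flatnonzero(mask.any(axis=0))   # columns containing sign pixels
    if rows.size == 0:
        return None
    return (rows[0], cols[0], rows[-1] + 1, cols[-1] + 1)

# Loose 6x6 candidate region with a 2x2 "sign" in the middle.
mask = np.zeros((6, 6), dtype=bool)
mask[2:4, 3:5] = True
box = tighten_box(mask)
```

A classifier fed the tightened crop sees far less background clutter than one fed the original loose rectangle.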
Improving pan-European speed-limit signs recognition with a new “global number segmentation” before digit recognition
In this paper, we present an improved European speed-limit sign recognition system based on an original “global number segmentation” (inside detected circles) before digit segmentation and recognition. The global speed-limit sign detection and correct recognition rate, currently evaluated on videos recorded on a mix of French and German roads, is around 94%, with a misclassification rate below 1% and not a single validated false alarm in several hours of recorded videos. Our greyscale-based system is intrinsically insensitive to colour variability and quite robust to illumination variations, as shown by an on-road evaluation under bad weather conditions (cloudy and rainy), which yielded an 84% good detection and recognition rate, and by a first night-time on-road evaluation with a 75% correct detection rate. Because recognition occurs at the digit level, our system has the potential to be very easily extended to properly handle all variants of speed-limit signs from various European countries. Regarding computational load, videos with images of 640x480 pixels can be processed in real time at ~20 frames/s on a standard 2.13 GHz dual-core laptop.
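The segmentation step that precedes digit recognition can be sketched with a classic projection-profile approach: columns with no foreground pixels separate digit groups. This is a hypothetical simplification of the paper's "global number segmentation", which works inside detected circles and is more involved.

```python
import numpy as np

def split_digits(binary):
    """Split a binary number image into digit column spans using the
    vertical projection profile: an empty column ends a digit.
    Returns a list of (start, end) column spans, end-exclusive."""
    profile = binary.sum(axis=0) > 0       # which columns contain ink
    spans, start = [], None
    for c, on in enumerate(profile):
        if on and start is None:
            start = c                      # digit begins
        elif not on and start is not None:
            spans.append((start, c))       # digit ends at empty column
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans

# Two crude "digits" separated by one blank column.
img = np.array([[1, 1, 0, 1, 1],
                [1, 1, 0, 1, 1]])
spans = split_digits(img)
```

Each recovered span can then be cropped and passed to a per-digit recognizer, which is what makes the approach easy to extend to new national sign variants.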
Indian Traffic Signboard Recognition and Driver Alert System Using Machine Learning
Sign board recognition and driver alert systems have a number of important application areas, including advanced driver assistance systems, road surveying, and autonomous vehicles. This system uses image processing techniques to isolate relevant data captured from a real-time streaming video. The proposed method is broadly divided into five parts: data collection, data processing, data classification, training, and testing. The system uses a variety of image processing techniques to enhance image quality, remove non-informational pixels, and detect edges. Feature extractors are used to find the features of each image. The machine learning algorithm Support Vector Machine (SVM) is used to classify the images based on their features. If the features of a sign captured from the video match a trained traffic sign, the system generates a voice signal to alert the driver. In India, traffic sign boards are classified into three categories: regulatory signs, cautionary signs, and informational signs. These Indian signs have four different shapes and eight different colors. The proposed system is trained for ten different types of signs; in each category, more than a thousand sample images are used to train the network.
An intelligent modular real-time vision-based system for environment perception
A significant portion of driving hazards is caused by human error and
disregard for local driving regulations; consequently, an intelligent
assistance system can be beneficial. This paper proposes a novel vision-based
modular package to ensure drivers' safety by perceiving the environment. Each
module is designed based on accuracy and inference time to deliver real-time
performance. As a result, the proposed system can be implemented on a wide
range of vehicles with minimum hardware requirements. Our modular package
comprises four main sections: lane detection, object detection, segmentation,
and monocular depth estimation. Each section is accompanied by novel techniques
to improve the accuracy of others along with the entire system. Furthermore, a
GUI is developed to display perceived information to the driver. In addition to
using public datasets, like BDD100K, we have also collected and annotated a
local dataset that we utilize to fine-tune and evaluate our system. We show
that the accuracy of our system is above 80% in all the sections. Our code and
data are available at
https://github.com/Pandas-Team/Autonomous-Vehicle-Environment-Perception
Comment: Accepted in NeurIPS 2022 Workshop on Machine Learning for Autonomous Driving.