5,412 research outputs found

    Procedural Modeling and Physically Based Rendering for Synthetic Data Generation in Automotive Applications

    We present an overview and evaluation of a new, systematic approach for generating highly realistic, annotated synthetic data for training deep neural networks on computer vision tasks. The main contribution is a procedural world modeling approach enabling high variability, coupled with physically accurate image synthesis; this is a departure from the hand-modeled virtual worlds and approximate image synthesis methods used in real-time applications. The benefits of our approach include flexible, physically accurate, and scalable image synthesis, implicit wide coverage of classes and features, and complete data introspection for annotations, all of which contribute to quality and cost efficiency. To evaluate our approach and the efficacy of the resulting data, we use semantic segmentation for autonomous vehicles and robotic navigation as the main application, and we train multiple deep learning architectures using synthetic data, with and without fine-tuning on organic (i.e. real-world) data. The evaluation shows that our approach improves the neural networks' performance and that even modest implementation efforts produce state-of-the-art results. Comment: The project web page at http://vcl.itn.liu.se/publications/2017/TKWU17/ contains a version of the paper with high-resolution images as well as additional material.

    The Cityscapes Dataset for Semantic Urban Scene Understanding

    Visual understanding of complex urban street scenes is an enabling factor for a wide range of applications. Object detection has benefited enormously from large-scale datasets, especially in the context of deep learning. For semantic urban scene understanding, however, no current dataset adequately captures the complexity of real-world urban scenes. To address this, we introduce Cityscapes, a benchmark suite and large-scale dataset to train and test approaches for pixel-level and instance-level semantic labeling. Cityscapes comprises a large, diverse set of stereo video sequences recorded in the streets of 50 different cities. 5000 of these images have high-quality pixel-level annotations; 20000 additional images have coarse annotations to enable methods that leverage large volumes of weakly labeled data. Crucially, our effort exceeds previous attempts in terms of dataset size, annotation richness, scene variability, and complexity. Our accompanying empirical study provides an in-depth analysis of the dataset characteristics, as well as a performance evaluation of several state-of-the-art approaches based on our benchmark. Comment: Includes supplemental material.

    Fast and robust road sign detection in driver assistance systems

    © 2018, Springer Science+Business Media, LLC, part of Springer Nature. Road sign detection plays a critical role in automatic driver assistance systems. Road signs possess a number of unique visual qualities in images due to their specific colors and symmetric shapes. In this paper, road signs are detected by a two-level hierarchical framework that considers both the color and the shape of the signs. To address the problem of low image contrast, we propose a new color visual saliency segmentation algorithm, which uses the ratios of enhanced and normalized color values to capture color information. To improve computational efficiency and reduce the false-alarm rate, we modify the fast radial symmetry transform (RST) algorithm and propose an edge pairwise voting scheme to group feature points based on their underlying symmetry in the candidate regions. Experimental results on several benchmark datasets demonstrate the superiority of our method over state-of-the-art approaches in both efficiency and robustness.
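The abstract does not give the exact formula, but the ratio-of-normalized-color-values idea behind the saliency step can be sketched roughly as follows; the function name, the 1/3 neutral baseline, and the clamping are illustrative assumptions, not the paper's actual method:

```python
def red_saliency(pixel):
    """Red-ratio saliency for one RGB pixel with values in [0, 1].

    Hypothetical sketch of ratio-based colour saliency: score the fraction
    of intensity in the red channel, then rescale so that a neutral grey
    (ratio 1/3) maps to 0 and a pure red pixel maps to 1.
    """
    r, g, b = pixel
    ratio = r / (r + g + b + 1e-6)       # normalised red ratio; epsilon avoids /0
    enhanced = (ratio - 1 / 3) / (2 / 3) # shift/scale: grey -> 0, pure red -> 1
    return min(max(enhanced, 0.0), 1.0)  # clamp to [0, 1]
```

Thresholding such a per-pixel score would yield candidate regions for the subsequent shape-based (RST) stage.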

    Detection and Recognition of Traffic Signs Inside the Attentional Visual Field of Drivers

    Traffic sign detection and recognition systems are essential components of Advanced Driver Assistance Systems and self-driving vehicles. In this contribution we present a vision-based framework which detects and recognizes traffic signs inside the attentional visual field of drivers. This technique takes advantage of the driver's 3D absolute gaze point, obtained through the combined use of a front-view stereo imaging system and a non-contact 3D gaze tracker. We used a linear Support Vector Machine as a classifier and Histogram of Oriented Gradients (HOG) features for detection. Recognition is performed using Scale Invariant Feature Transform (SIFT) descriptors and color information. Our technique detects and recognizes signs which are in the field of view of the driver, and also provides an indication when one or more signs have been missed by the driver.
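The gaze-tracking setup above is hardware-specific, but the final gating step, keeping only detections that fall inside the driver's attentional visual field, can be approximated in 2D image coordinates. All names and the circular-field radius here are hypothetical, not taken from the paper:

```python
def inside_attentional_field(gaze, box, radius=50.0):
    """Check whether a detected sign lies inside the driver's attentional field.

    gaze: the driver's (x, y) gaze point projected into the image.
    box: a sign bounding box as (x, y, w, h).
    radius: assumed radius of the attentional visual field, in pixels.
    """
    gx, gy = gaze
    x, y, w, h = box
    # Closest point of the box to the gaze point (clamp gaze into the box)
    cx = min(max(gx, x), x + w)
    cy = min(max(gy, y), y + h)
    return (cx - gx) ** 2 + (cy - gy) ** 2 <= radius ** 2
```

A detection that fails this test could then trigger the "missed sign" indication mentioned in the abstract.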

    An optimization approach for localization refinement of candidate traffic signs

    We propose a localization refinement approach for candidate traffic signs. Previous traffic sign localization approaches, which place a bounding rectangle around the sign, do not always give a compact bounding box, making the subsequent classification task more difficult. We formulate localization as a segmentation problem and incorporate prior knowledge concerning the color and shape of traffic signs. To evaluate the effectiveness of our approach, we use it as an intermediate step between a standard traffic sign localizer and a classifier. Our experiments use the well-known GTSDB benchmark as well as our new CTSDB (Chinese Traffic Sign Detection Benchmark). This newly created benchmark is publicly available and goes beyond previous benchmark datasets: it has over 5,000 high-resolution images containing more than 14,000 traffic signs taken in realistic driving conditions. Experimental results show that our localization approach significantly improves bounding boxes when compared to a standard localizer, thereby allowing a standard traffic sign classifier to generate more accurate classification results.
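As a rough illustration of the refinement idea (not the authors' optimization-based segmentation formulation), one can tighten a loose box to the extent of the segmented sign pixels inside it; the binary mask and the helper name below are assumptions for the sketch:

```python
def refine_box(mask, box):
    """Tighten a loose bounding box around segmented sign pixels.

    mask: binary segmentation as a list of rows of 0/1.
    box: loose candidate box as (x, y, w, h).
    Returns the tight bounding box of foreground pixels inside the loose box,
    or the original box if no pixels were segmented.
    """
    x, y, w, h = box
    xs, ys = [], []
    for r in range(y, y + h):
        for c in range(x, x + w):
            if mask[r][c]:
                xs.append(c)
                ys.append(r)
    if not xs:                       # nothing segmented: keep the original box
        return box
    return (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)
```

Feeding the tightened box rather than the loose one to the classifier is what, per the abstract, improves classification accuracy.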

    Improving pan-European speed-limit signs recognition with a new “global number segmentation” before digit recognition

    In this paper, we present an improved European speed-limit sign recognition system based on an original “global number segmentation” (inside detected circles) before digit segmentation and recognition. The global speed-limit sign detection and correct recognition rate, currently evaluated on videos recorded on a mix of French and German roads, is around 94%, with a misclassification rate below 1% and not a single validated false alarm in several hours of recorded videos. Our greyscale-based system is intrinsically insensitive to colour variability and quite robust to illumination variations, as shown by an on-road evaluation under bad weather conditions (cloudy and rainy), which yielded an 84% correct detection and recognition rate, and by a first night-time on-road evaluation with a 75% correct detection rate. Because recognition occurs at the digit level, our system can very easily be extended to properly handle all variants of speed-limit signs from various European countries. Regarding computational load, videos with images of 640x480 pixels can be processed in real time at ~20 frames/s on a standard 2.13 GHz dual-core laptop.
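The paper's “global number segmentation” operates inside detected circles; a minimal stand-in for the subsequent digit-segmentation step is a vertical-projection split, sketched here under our own naming and assumptions (a clean binarised number region, background-separated digits):

```python
def split_digits(binary):
    """Split a binarised number region into per-digit column spans.

    Hypothetical sketch of digit segmentation by vertical projection:
    count foreground pixels per column, then cut at runs of empty columns.
    binary: list of rows of 0/1. Returns (start, end) column ranges,
    end exclusive, one per digit.
    """
    cols = len(binary[0])
    proj = [sum(row[c] for row in binary) for c in range(cols)]
    spans, start = [], None
    for c, v in enumerate(proj):
        if v and start is None:
            start = c                 # a digit run begins
        elif not v and start is not None:
            spans.append((start, c))  # the digit run ends
            start = None
    if start is not None:
        spans.append((start, cols))   # last digit touches the right edge
    return spans
```

Each span would then be passed to the per-digit recogniser, which is what makes the approach extensible to new national sign variants.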

    Indian Traffic Signboard Recognition and Driver Alert System Using Machine Learning

    Sign board recognition and driver alert systems have a number of important application areas, including advanced driver assistance systems, road surveying, and autonomous vehicles. This system uses image processing techniques to isolate relevant data captured from a real-time streaming video. The proposed method is broadly divided into five parts: data collection, data processing, data classification, training, and testing. The system uses a variety of image processing techniques to enhance image quality, remove non-informative pixels, and detect edges. Feature extractors are used to find the features of each image. The machine learning algorithm Support Vector Machine (SVM) is used to classify the images based on their features. If the features of a sign captured from the video match a trained traffic sign, the system generates a voice signal to alert the driver. In India, traffic sign boards are classified into three categories: regulatory, cautionary, and informational signs. These Indian signs have four different shapes and eight different colors. The proposed system is trained for ten different types of signs; for each category, more than a thousand sample images are used to train the network.

    An intelligent modular real-time vision-based system for environment perception

    A significant portion of driving hazards is caused by human error and disregard for local driving regulations; consequently, an intelligent assistance system can be beneficial. This paper proposes a novel vision-based modular package to ensure drivers' safety by perceiving the environment. Each module is designed with both accuracy and inference time in mind to deliver real-time performance. As a result, the proposed system can be implemented on a wide range of vehicles with minimum hardware requirements. Our modular package comprises four main sections: lane detection, object detection, segmentation, and monocular depth estimation. Each section is accompanied by novel techniques that improve the accuracy of the others as well as of the entire system. Furthermore, a GUI is developed to display the perceived information to the driver. In addition to using public datasets, like BDD100K, we have also collected and annotated a local dataset that we utilize to fine-tune and evaluate our system. We show that the accuracy of our system is above 80% in all sections. Our code and data are available at https://github.com/Pandas-Team/Autonomous-Vehicle-Environment-Perception. Comment: Accepted at the NeurIPS 2022 Workshop on Machine Learning for Autonomous Driving.