1,259 research outputs found

    An Empirical Evaluation of Deep Learning on Highway Driving

    Full text link
    Numerous groups have applied a variety of deep learning techniques to computer vision problems in highway perception scenarios. In this paper, we presented a number of empirical evaluations of recent deep learning advances. Computer vision, combined with deep learning, has the potential to bring about a relatively inexpensive, robust solution to autonomous driving. To prepare deep learning for industry uptake and practical applications, neural networks will require large data sets that represent all possible driving environments and scenarios. We collect a large data set of highway data and apply deep learning and computer vision algorithms to problems such as car and lane detection. We show how existing convolutional neural networks (CNNs) can be used to perform lane and vehicle detection while running at frame rates required for a real-time system. Our results lend credence to the hypothesis that deep learning holds promise for autonomous driving.Comment: Added a video for lane detectio

    FuSSI-Net: Fusion of Spatio-temporal Skeletons for Intention Prediction Network

    Full text link
    Pedestrian intention recognition is very important to develop robust and safe autonomous driving (AD) and advanced driver assistance systems (ADAS) functionalities for urban driving. In this work, we develop an end-to-end pedestrian intention framework that performs well on day- and night- time scenarios. Our framework relies on objection detection bounding boxes combined with skeletal features of human pose. We study early, late, and combined (early and late) fusion mechanisms to exploit the skeletal features and reduce false positives as well to improve the intention prediction performance. The early fusion mechanism results in AP of 0.89 and precision/recall of 0.79/0.89 for pedestrian intention classification. Furthermore, we propose three new metrics to properly evaluate the pedestrian intention systems. Under these new evaluation metrics for the intention prediction, the proposed end-to-end network offers accurate pedestrian intention up to half a second ahead of the actual risky maneuver.Comment: 5 pages, 6 figures, 5 tables, IEEE Asilomar SS

    Computer Vision System-On-Chip Designs for Intelligent Vehicles

    Get PDF
    Intelligent vehicle technologies are growing rapidly that can enhance road safety, improve transport efficiency, and aid driver operations through sensors and intelligence. Advanced driver assistance system (ADAS) is a common platform of intelligent vehicle technologies. Many sensors like LiDAR, radar, cameras have been deployed on intelligent vehicles. Among these sensors, optical cameras are most widely used due to their low costs and easy installation. However, most computer vision algorithms are complicated and computationally slow, making them difficult to be deployed on power constraint systems. This dissertation investigates several mainstream ADAS applications, and proposes corresponding efficient digital circuits implementations for these applications. This dissertation presents three ways of software / hardware algorithm division for three ADAS applications: lane detection, traffic sign classification, and traffic light detection. Using FPGA to offload critical parts of the algorithm, the entire computer vision system is able to run in real time while maintaining a low power consumption and a high detection rate. Catching up with the advent of deep learning in the field of computer vision, we also present two deep learning based hardware implementations on application specific integrated circuits (ASIC) to achieve even lower power consumption and higher accuracy. The real time lane detection system is implemented on Xilinx Zynq platform, which has a dual core ARM processor and FPGA fabric. The Xilinx Zynq platform integrates the software programmability of an ARM processor with the hardware programmability of an FPGA. For the lane detection task, the FPGA handles the majority of the task: region-of-interest extraction, edge detection, image binarization, and hough transform. After then, the ARM processor takes in hough transform results and highlights lanes using the hough peaks algorithm. The entire system is able to process 1080P video stream at a constant speed of 69.4 frames per second, realizing real time capability. An efficient system-on-chip (SOC) design which classifies up to 48 traffic signs in real time is presented in this dissertation. The traditional histogram of oriented gradients (HoG) and support vector machine (SVM) are proven to be very effective on traffic sign classification with an average accuracy rate of 93.77%. For traffic sign classification, the biggest challenge comes from the low execution efficiency of the HoG on embedded processors. By dividing the HoG algorithm into three fully pipelined stages, as well as leveraging extra on-chip memory to store intermediate results, we successfully achieved a throughput of 115.7 frames per second at 1080P resolution. The proposed generic HoG hardware implementation could also be used as an individual IP core by other computer vision systems. A real time traffic signal detection system is implemented to present an efficient hardware implementation of the traditional grass-fire blob detection. The traditional grass-fire blob detection method iterates the input image multiple times to calculate connected blobs. In digital circuits, five extra on-chip block memories are utilized to save intermediate results. By using additional memories, all connected blob information could be obtained through one-pass image traverse. The proposed hardware friendly blob detection can run at 72.4 frames per second with 1080P video input. Applying HoG + SVM as feature extractor and classifier, 92.11% recall rate and 99.29% precision rate are obtained on red lights, and 94.44% recall rate and 98.27% precision rate on green lights. Nowadays, convolutional neural network (CNN) is revolutionizing computer vision due to learnable layer by layer feature extraction. However, when coming into inference, CNNs are usually slow to train and slow to execute. In this dissertation, we studied the implementation of principal component analysis based network (PCANet), which strikes a balance between algorithm robustness and computational complexity. Compared to a regular CNN, the PCANet only needs one iteration training, and typically at most has a few tens convolutions on a single layer. Compared to hand-crafted features extraction methods, the PCANet algorithm well reflects the variance in the training dataset and can better adapt to difficult conditions. The PCANet algorithm achieves accuracy rates of 96.8% and 93.1% on road marking detection and traffic light detection, respectively. Implementing in Synopsys 32nm process technology, the proposed chip can classify 724,743 32-by-32 image candidates in one second, with only 0.5 watt power consumption. In this dissertation, binary neural network (BNN) is adopted as a potential detector for intelligent vehicles. The BNN constrains all activations and weights to be +1 or -1. Compared to a CNN with the same network configuration, the BNN achieves 50 times better resource usage with only 1% - 2% accuracy loss. Taking car detection and pedestrian detection as examples, the BNN achieves an average accuracy rate of over 95%. Furthermore, a BNN accelerator implemented in Synopsys 32nm process technology is presented in our work. The elastic architecture of the BNN accelerator makes it able to process any number of convolutional layers with high throughput. The BNN accelerator only consumes 0.6 watt and doesn\u27t rely on external memory for storage

    Computer Vision System-On-Chip Designs for Intelligent Vehicles

    Get PDF
    Intelligent vehicle technologies are growing rapidly that can enhance road safety, improve transport efficiency, and aid driver operations through sensors and intelligence. Advanced driver assistance system (ADAS) is a common platform of intelligent vehicle technologies. Many sensors like LiDAR, radar, cameras have been deployed on intelligent vehicles. Among these sensors, optical cameras are most widely used due to their low costs and easy installation. However, most computer vision algorithms are complicated and computationally slow, making them difficult to be deployed on power constraint systems. This dissertation investigates several mainstream ADAS applications, and proposes corresponding efficient digital circuits implementations for these applications. This dissertation presents three ways of software / hardware algorithm division for three ADAS applications: lane detection, traffic sign classification, and traffic light detection. Using FPGA to offload critical parts of the algorithm, the entire computer vision system is able to run in real time while maintaining a low power consumption and a high detection rate. Catching up with the advent of deep learning in the field of computer vision, we also present two deep learning based hardware implementations on application specific integrated circuits (ASIC) to achieve even lower power consumption and higher accuracy. The real time lane detection system is implemented on Xilinx Zynq platform, which has a dual core ARM processor and FPGA fabric. The Xilinx Zynq platform integrates the software programmability of an ARM processor with the hardware programmability of an FPGA. For the lane detection task, the FPGA handles the majority of the task: region-of-interest extraction, edge detection, image binarization, and hough transform. After then, the ARM processor takes in hough transform results and highlights lanes using the hough peaks algorithm. The entire system is able to process 1080P video stream at a constant speed of 69.4 frames per second, realizing real time capability. An efficient system-on-chip (SOC) design which classifies up to 48 traffic signs in real time is presented in this dissertation. The traditional histogram of oriented gradients (HoG) and support vector machine (SVM) are proven to be very effective on traffic sign classification with an average accuracy rate of 93.77%. For traffic sign classification, the biggest challenge comes from the low execution efficiency of the HoG on embedded processors. By dividing the HoG algorithm into three fully pipelined stages, as well as leveraging extra on-chip memory to store intermediate results, we successfully achieved a throughput of 115.7 frames per second at 1080P resolution. The proposed generic HoG hardware implementation could also be used as an individual IP core by other computer vision systems. A real time traffic signal detection system is implemented to present an efficient hardware implementation of the traditional grass-fire blob detection. The traditional grass-fire blob detection method iterates the input image multiple times to calculate connected blobs. In digital circuits, five extra on-chip block memories are utilized to save intermediate results. By using additional memories, all connected blob information could be obtained through one-pass image traverse. The proposed hardware friendly blob detection can run at 72.4 frames per second with 1080P video input. Applying HoG + SVM as feature extractor and classifier, 92.11% recall rate and 99.29% precision rate are obtained on red lights, and 94.44% recall rate and 98.27% precision rate on green lights. Nowadays, convolutional neural network (CNN) is revolutionizing computer vision due to learnable layer by layer feature extraction. However, when coming into inference, CNNs are usually slow to train and slow to execute. In this dissertation, we studied the implementation of principal component analysis based network (PCANet), which strikes a balance between algorithm robustness and computational complexity. Compared to a regular CNN, the PCANet only needs one iteration training, and typically at most has a few tens convolutions on a single layer. Compared to hand-crafted features extraction methods, the PCANet algorithm well reflects the variance in the training dataset and can better adapt to difficult conditions. The PCANet algorithm achieves accuracy rates of 96.8% and 93.1% on road marking detection and traffic light detection, respectively. Implementing in Synopsys 32nm process technology, the proposed chip can classify 724,743 32-by-32 image candidates in one second, with only 0.5 watt power consumption. In this dissertation, binary neural network (BNN) is adopted as a potential detector for intelligent vehicles. The BNN constrains all activations and weights to be +1 or -1. Compared to a CNN with the same network configuration, the BNN achieves 50 times better resource usage with only 1% - 2% accuracy loss. Taking car detection and pedestrian detection as examples, the BNN achieves an average accuracy rate of over 95%. Furthermore, a BNN accelerator implemented in Synopsys 32nm process technology is presented in our work. The elastic architecture of the BNN accelerator makes it able to process any number of convolutional layers with high throughput. The BNN accelerator only consumes 0.6 watt and doesn\u27t rely on external memory for storage

    Scene Detection Classification and Tracking for Self-Driven Vehicle

    Get PDF
    A number of traffic-related issues, including crashes, jams, and pollution, could be resolved by self-driving vehicles (SDVs). Several challenges still need to be overcome, particularly in the areas of precise environmental perception, observed detection, and its classification, to allow the safe navigation of autonomous vehicles (AVs) in crowded urban situations. This article offers a comprehensive examination of the application of deep learning techniques in self-driving cars for scene perception and observed detection. The theoretical foundations of self-driving cars are examined in depth in this research using a deep learning methodology. It explores the current applications of deep learning in this area and provides critical evaluations of their efficacy. This essay begins with an introduction to the ideas of computer vision, deep learning, and self-driving automobiles. It also gives a brief review of artificial general intelligence, highlighting its applicability to the subject at hand. The paper then concentrates on categorising current, robust deep learning libraries and considers their critical contribution to the development of deep learning techniques. The dataset used as label for scene detection for self-driven vehicle. The discussion of several strategies that explicitly handle picture perception issues faced in real-time driving scenarios takes up a sizeable amount of the work. These methods include methods for item detection, recognition, and scene comprehension. In this study, self-driving automobile implementations and tests are critically assessed

    Driving in the Rain: A Survey toward Visibility Estimation through Windshields

    Get PDF
    Rain can significantly impair the driver’s sight and affect his performance when driving in wet conditions. Evaluation of driver visibility in harsh weather, such as rain, has garnered considerable research since the advent of autonomous vehicles and the emergence of intelligent transportation systems. In recent years, advances in computer vision and machine learning led to a significant number of new approaches to address this challenge. However, the literature is fragmented and should be reorganised and analysed to progress in this field. There is still no comprehensive survey article that summarises driver visibility methodologies, including classic and recent data-driven/model-driven approaches on the windshield in rainy conditions, and compares their generalisation performance fairly. Most ADAS and AD systems are based on object detection. Thus, rain visibility plays a key role in the efficiency of ADAS/AD functions used in semi- or fully autonomous driving. This study fills this gap by reviewing current state-of-the-art solutions in rain visibility estimation used to reconstruct the driver’s view for object detection-based autonomous driving. These solutions are classified as rain visibility estimation systems that work on (1) the perception components of the ADAS/AD function, (2) the control and other hardware components of the ADAS/AD function, and (3) the visualisation and other software components of the ADAS/AD function. Limitations and unsolved challenges are also highlighted for further research
    • …
    corecore