FPGA-based multi-view stereo system with flexible measurement setup
In recent years, stereoscopic image processing algorithms have gained importance for a variety of applications. To capture larger measurement volumes, multiple stereo systems are combined into a multi-view stereo (MVS) system. To reduce the amount of data and the data rate, calculation steps close to the sensors are offloaded to Field Programmable Gate Arrays (FPGAs) as upstream computing units. These steps include lens distortion correction, rectification and stereo matching. In this paper, an FPGA-based MVS system with a flexible camera arrangement and partly overlapping fields of view is presented. The system consists of four FPGA-based passive stereoscopic systems (Xilinx Zynq-7000 7020 SoC, EV76C570 CMOS sensor) and a downstream processing unit (Zynq Ultrascale ZU9EG SoC). The downstream unit synchronizes the sensor-near processing modules, receives the disparity maps with the corresponding left camera images via HDMI, and calculates a coherent 3D point cloud. Our FPGA-based 3D measurement system captures a large measurement volume at 24 fps by combining multiple views from eight cameras (using Semi-Global Matching for an image size of 640 px × 460 px, a disparity range of up to 256 px, and costs aggregated over 4 directions). The capabilities and limitations of the system are shown by an application example with optical non-cooperative surface
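The cost aggregation step of Semi-Global Matching mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' FPGA design: it aggregates along a single direction (left to right on one scanline), whereas the abstract describes 4 directions whose results would be summed. The function names and the penalty values `p1` and `p2` are assumptions chosen for illustration.

```python
def aggregate_path(cost, p1=1, p2=4):
    """Aggregate matching costs along one scanline, left to right.

    cost[x][d] is the matching cost of pixel x at disparity d.
    p1 (small disparity change) and p2 (large disparity change) are
    illustrative smoothness penalties, not values from the paper.
    """
    n_disp = len(cost[0])
    agg = [list(cost[0])]  # the first pixel has no predecessor
    for x in range(1, len(cost)):
        prev = agg[-1]
        prev_min = min(prev)
        row = []
        for d in range(n_disp):
            candidates = [prev[d], prev_min + p2]
            if d > 0:
                candidates.append(prev[d - 1] + p1)
            if d + 1 < n_disp:
                candidates.append(prev[d + 1] + p1)
            # Subtracting prev_min keeps the aggregated values bounded.
            row.append(cost[x][d] + min(candidates) - prev_min)
        agg.append(row)
    return agg


def winner_takes_all(agg):
    """Pick the disparity with the lowest aggregated cost for each pixel."""
    return [min(range(len(row)), key=row.__getitem__) for row in agg]
```

In this sketch, a pixel whose raw costs are ambiguous inherits the disparity preferred by its neighbour, which is the smoothing effect the path aggregation is meant to provide.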
Computer Vision System-On-Chip Designs for Intelligent Vehicles
Intelligent vehicle technologies are growing rapidly; they can enhance road safety, improve transport efficiency, and aid driver operations through sensors and intelligence. The advanced driver assistance system (ADAS) is a common platform for intelligent vehicle technologies. Many sensors, such as LiDAR, radar, and cameras, have been deployed on intelligent vehicles. Among these sensors, optical cameras are the most widely used due to their low cost and easy installation. However, most computer vision algorithms are complicated and computationally slow, making them difficult to deploy on power-constrained systems. This dissertation investigates several mainstream ADAS applications and proposes efficient digital circuit implementations for them. It presents three ways of dividing algorithms between software and hardware for three ADAS applications: lane detection, traffic sign classification, and traffic light detection. By using an FPGA to offload critical parts of the algorithm, the entire computer vision system is able to run in real time while maintaining low power consumption and a high detection rate. Following the advent of deep learning in the field of computer vision, we also present two deep learning based hardware implementations on application specific integrated circuits (ASIC) to achieve even lower power consumption and higher accuracy.
The real time lane detection system is implemented on the Xilinx Zynq platform, which has a dual core ARM processor and FPGA fabric. The Xilinx Zynq platform integrates the software programmability of an ARM processor with the hardware programmability of an FPGA. For the lane detection task, the FPGA handles the majority of the work: region-of-interest extraction, edge detection, image binarization, and the Hough transform. The ARM processor then takes in the Hough transform results and highlights lanes using the Hough peaks algorithm. The entire system is able to process a 1080P video stream at a constant speed of 69.4 frames per second, realizing real time capability.
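The split described above (Hough voting on the FPGA, peak search on the ARM processor) can be sketched in software as two functions. This is only an illustration under assumed names and parameters (`hough_lines`, `strongest_line`, `n_theta`); it is not the dissertation's hardware design and makes no attempt to match its throughput.

```python
import math


def hough_lines(edge_points, width, height, n_theta=180):
    """Vote in (rho, theta) space for every edge pixel.

    edge_points is an iterable of (x, y) pixel coordinates taken from a
    binarized edge image. Returns the accumulator and the rho offset.
    """
    max_rho = math.ceil(math.hypot(width, height))
    # rho can be negative, so row indices are offset by max_rho.
    acc = [[0] * n_theta for _ in range(2 * max_rho + 1)]
    for x, y in edge_points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = int(round(x * math.cos(theta) + y * math.sin(theta)))
            acc[rho + max_rho][t] += 1
    return acc, max_rho


def strongest_line(acc, max_rho):
    """Return (rho, theta_index, votes) for the accumulator peak."""
    votes, rho, t = max(
        (v, r - max_rho, t)
        for r, row in enumerate(acc)
        for t, v in enumerate(row)
    )
    return rho, t, votes
```

A real lane detector would keep several peaks and filter them by angle. The single-peak search here is just the simplest stand-in for the peak-finding step.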
An efficient system-on-chip (SoC) design which classifies up to 48 traffic signs in real time is presented in this dissertation. The traditional histogram of oriented gradients (HoG) and support vector machine (SVM) are proven to be very effective for traffic sign classification, with an average accuracy rate of 93.77%. For traffic sign classification, the biggest challenge comes from the low execution efficiency of HoG on embedded processors. By dividing the HoG algorithm into three fully pipelined stages, and by leveraging extra on-chip memory to store intermediate results, we achieved a throughput of 115.7 frames per second at 1080P resolution. The proposed generic HoG hardware implementation could also be used as an individual IP core by other computer vision systems.
A real time traffic signal detection system is implemented to present an efficient hardware implementation of the traditional grass-fire blob detection. The traditional grass-fire blob detection method iterates over the input image multiple times to calculate connected blobs. In the digital circuit, five extra on-chip block memories are used to save intermediate results. With these additional memories, all connected blob information can be obtained in a single pass over the image. The proposed hardware friendly blob detection runs at 72.4 frames per second with 1080P video input. Applying HoG + SVM as feature extractor and classifier, a 92.11% recall rate and a 99.29% precision rate are obtained on red lights, and a 94.44% recall rate and a 98.27% precision rate on green lights.
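The idea of collecting all blob information in a single raster scan by keeping extra intermediate state can be sketched in software. The version below is an assumption-laden analogue, not the dissertation's circuit: it uses a union-find label table and per-label statistics in place of the five block memories, and it assumes 4-connectivity. All names are hypothetical.

```python
def find(parent, i):
    """Return the root label of i, shortening the chain on the way."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def one_pass_blobs(image):
    """Scan a binary image once and return one tuple per blob:
    (area, min_x, min_y, max_x, max_y), using 4-connectivity."""
    parent = []  # label equivalence table
    stats = []   # per-label [area, min_x, min_y, max_x, max_y]
    prev_row = [-1] * len(image[0])
    for y, row in enumerate(image):
        cur_row = [-1] * len(row)
        for x, pixel in enumerate(row):
            if not pixel:
                continue
            left = cur_row[x - 1] if x > 0 else -1
            up = prev_row[x]
            roots = [find(parent, l) for l in (left, up) if l != -1]
            if not roots:
                label = len(parent)  # start a new blob
                parent.append(label)
                stats.append([0, x, y, x, y])
            else:
                label = min(roots)
                for other in roots:
                    if other != label:  # two blobs touch: merge them
                        parent[other] = label
                        a, b = stats[label], stats[other]
                        a[0] += b[0]
                        a[1], a[2] = min(a[1], b[1]), min(a[2], b[2])
                        a[3], a[4] = max(a[3], b[3]), max(a[4], b[4])
            s = stats[label]
            s[0] += 1
            s[1], s[2] = min(s[1], x), min(s[2], y)
            s[3], s[4] = max(s[3], x), max(s[4], y)
            cur_row[x] = label
        prev_row = cur_row
    return [tuple(stats[i]) for i in range(len(parent)) if parent[i] == i]
```

Only the previous row of labels is kept, so the working state grows with image width and blob count rather than with the full frame.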
Nowadays, the convolutional neural network (CNN) is revolutionizing computer vision thanks to its learnable, layer-by-layer feature extraction. However, CNNs are usually slow to train and slow to execute at inference time. In this dissertation, we studied the implementation of the principal component analysis based network (PCANet), which strikes a balance between algorithm robustness and computational complexity. Compared to a regular CNN, the PCANet needs only one training iteration and typically has at most a few tens of convolutions in a single layer. Compared to hand-crafted feature extraction methods, the PCANet algorithm reflects the variance in the training dataset well and can better adapt to difficult conditions. The PCANet algorithm achieves accuracy rates of 96.8% and 93.1% on road marking detection and traffic light detection, respectively. Implemented in Synopsys 32nm process technology, the proposed chip can classify 724,743 32-by-32 image candidates in one second, with only 0.5 watt power consumption.
In this dissertation, the binary neural network (BNN) is adopted as a potential detector for intelligent vehicles. The BNN constrains all activations and weights to be +1 or -1. Compared to a CNN with the same network configuration, the BNN achieves 50 times better resource usage with only 1% - 2% accuracy loss. Taking car detection and pedestrian detection as examples, the BNN achieves an average accuracy rate of over 95%. Furthermore, a BNN accelerator implemented in Synopsys 32nm process technology is presented in our work. The elastic architecture of the BNN accelerator makes it able to process any number of convolutional layers with high throughput. The BNN accelerator consumes only 0.6 watt and doesn't rely on external memory for storage
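Because every activation and weight is +1 or -1, a dot product can be computed with bitwise operations instead of multipliers. The sketch below shows the commonly used XNOR-and-popcount formulation. The abstract does not state that the accelerator uses exactly this scheme, so treat it as an assumption, along with the function names and the bit encoding (bit 1 for +1, bit 0 for -1).

```python
def pack_bits(values):
    """Pack a list of +1/-1 values into an integer (bit i set means +1)."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, w_bits, n):
    """Dot product of two {-1, +1} vectors of length n, given packed bits."""
    mask = (1 << n) - 1
    agree = ~(a_bits ^ w_bits) & mask  # XNOR: 1 where the signs agree
    matches = bin(agree).count("1")    # popcount
    # Each agreement contributes +1 and each disagreement -1.
    return 2 * matches - n


def binary_neuron(a_bits, w_bits, n, threshold=0):
    """Binarized activation: +1 if the dot product reaches the threshold."""
    return 1 if binary_dot(a_bits, w_bits, n) >= threshold else -1
```

Replacing multiply-accumulate units with XNOR gates and a bit counter is one plausible source of the resource savings the abstract reports.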
An Investigation Into Time Gazed At Traffic Objects By Drivers
Several studies have considered driver attention for a multitude of distinct purposes, ranging from the analysis of a driver’s gaze and perception, to possible use in Advanced Driving Assistance Systems (ADAS). These works typically rely on simple definitions of what it means to “see,” considering an object seen once the driver has gazed upon it for a single frame. In this work, we bolster this definition by introducing the concept of time. We consider a definition of “seen” which requires an object to be gazed upon for a set length of time, or number of frames, before it can be considered as seen by the driver. This is done by examining consecutive frames to find those where the driver’s gaze remains uninterrupted within a constant bounding box of a given traffic object. A time-considering approach to defining traffic objects as seen or unseen provides a more thoughtful and accurate measure of the driver’s perception, as we avoid the naive assumption that gazing upon an object for a single frame is enough time for a driver to process it, which ultimately could prove vital to a wide array of ADAS and i-ADAS systems
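The time-based definition of “seen” can be expressed as a run-length check over consecutive frames. The sketch below is an illustration only: the data layout (one gaze point and a dictionary of object bounding boxes per frame), the function name, and the `min_frames` parameter are assumptions, and it uses per-frame boxes rather than the constant bounding box described in the abstract.

```python
def seen_objects(frames, min_frames):
    """Return the ids of objects gazed at for min_frames consecutive frames.

    frames is a list of (gaze, boxes) pairs, where gaze is an (x, y) point
    and boxes maps an object id to (x_min, y_min, x_max, y_max).
    """
    run = {}      # current uninterrupted gaze length per object
    seen = set()
    for (gx, gy), boxes in frames:
        for obj_id, (x0, y0, x1, y1) in boxes.items():
            if x0 <= gx <= x1 and y0 <= gy <= y1:
                run[obj_id] = run.get(obj_id, 0) + 1
                if run[obj_id] >= min_frames:
                    seen.add(obj_id)
            else:
                run[obj_id] = 0  # the gaze left the box: restart the count
        for obj_id in run:
            if obj_id not in boxes:
                run[obj_id] = 0  # the object vanished: restart the count
    return seen
```

Setting `min_frames` to 1 reproduces the single-frame definition that the abstract argues against, which makes the two definitions easy to compare on the same data.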
A 502-GOPS and 0.984-mW Dual-Mode Intelligent ADAS SoC With Real-Time Semiglobal Matching and Intention Prediction for Smart Automotive Black Box System
The advanced driver assistance system (ADAS) for adaptive cruise control and collision avoidance depends strongly on robust image recognition technology such as lane detection, vehicle/pedestrian detection, and traffic sign recognition. However, conventional ADAS cannot realize more advanced collision evasion in real environments due to the absence of intelligent vehicle/pedestrian behavior analysis. Moreover, accurate distance estimation is essential in ADAS applications, and semiglobal matching (SGM) is the most widely adopted method for high accuracy, but its system-on-chip (SoC) implementation is difficult due to the massive external memory bandwidth it requires. In this paper, an ADAS SoC with artificial-intelligence-based behavior analysis and a hardware implementation of SGM is proposed. The proposed SoC has two operating modes: high-performance operation for intelligent ADAS with real-time SGM in driving mode (d-mode), and ultralow-power operation for the black box system in parking mode. It features: 1) a task-level pipelined SGM processor that reduces external memory bandwidth by 85.8%; 2) a region-of-interest generation processor that reduces computation by 86.2%; 3) a mixed-mode intention prediction engine for dual-mode intelligence; and 4) dynamic voltage and frequency scaling control that saves 36.2% of power in d-mode. The proposed ADAS processor achieves 862 GOPS/W energy efficiency and 31.4 GOPS/mm² area efficiency, which are 1.53× and 1.75× improvements over the state of the art, with 30 frames/s throughput under 720p stereo inputs
Proceedings of the Scientific-Practical Conference "Research and Development - 2016"
talent management; sensor arrays; automatic speech recognition; dry separation technology; oil production; oil waste; laser technolog