763 research outputs found

    Connectivity-Enforcing Hough Transform for the Robust Extraction of Line Segments

    Full text link
    Global voting schemes based on the Hough transform (HT) have been widely used to robustly detect lines in images. However, since the votes do not take line connectivity into account, these methods do not deal well with cluttered images. In opposition, the so-called local methods enforce connectivity but lack robustness to deal with challenging situations that occur in many realistic scenarios, e.g., when line segments cross or when long segments are corrupted. In this paper, we address the critical limitations of the HT as a line segment extractor by incorporating connectivity in the voting process. This is done by only accounting for the contributions of edge points lying in increasingly larger neighborhoods and whose position and directional content agree with potential line segments. As a result, our method, which we call STRAIGHT (Segment exTRAction by connectivity-enforcInG HT), extracts the longest connected segments in each location of the image, thus also integrating into the HT voting process the usually separate step of individual segment extraction. The usage of the Hough space mapping and a corresponding hierarchical implementation make our approach computationally feasible. We present experiments that illustrate, with synthetic and real images, how STRAIGHT succeeds in extracting complete segments in several situations where current methods fail.Comment: Submitted for publicatio

    Computer Vision System-On-Chip Designs for Intelligent Vehicles

    Get PDF
    Intelligent vehicle technologies are growing rapidly that can enhance road safety, improve transport efficiency, and aid driver operations through sensors and intelligence. Advanced driver assistance system (ADAS) is a common platform of intelligent vehicle technologies. Many sensors like LiDAR, radar, cameras have been deployed on intelligent vehicles. Among these sensors, optical cameras are most widely used due to their low costs and easy installation. However, most computer vision algorithms are complicated and computationally slow, making them difficult to be deployed on power constraint systems. This dissertation investigates several mainstream ADAS applications, and proposes corresponding efficient digital circuits implementations for these applications. This dissertation presents three ways of software / hardware algorithm division for three ADAS applications: lane detection, traffic sign classification, and traffic light detection. Using FPGA to offload critical parts of the algorithm, the entire computer vision system is able to run in real time while maintaining a low power consumption and a high detection rate. Catching up with the advent of deep learning in the field of computer vision, we also present two deep learning based hardware implementations on application specific integrated circuits (ASIC) to achieve even lower power consumption and higher accuracy. The real time lane detection system is implemented on Xilinx Zynq platform, which has a dual core ARM processor and FPGA fabric. The Xilinx Zynq platform integrates the software programmability of an ARM processor with the hardware programmability of an FPGA. For the lane detection task, the FPGA handles the majority of the task: region-of-interest extraction, edge detection, image binarization, and hough transform. After then, the ARM processor takes in hough transform results and highlights lanes using the hough peaks algorithm. The entire system is able to process 1080P video stream at a constant speed of 69.4 frames per second, realizing real time capability. An efficient system-on-chip (SOC) design which classifies up to 48 traffic signs in real time is presented in this dissertation. The traditional histogram of oriented gradients (HoG) and support vector machine (SVM) are proven to be very effective on traffic sign classification with an average accuracy rate of 93.77%. For traffic sign classification, the biggest challenge comes from the low execution efficiency of the HoG on embedded processors. By dividing the HoG algorithm into three fully pipelined stages, as well as leveraging extra on-chip memory to store intermediate results, we successfully achieved a throughput of 115.7 frames per second at 1080P resolution. The proposed generic HoG hardware implementation could also be used as an individual IP core by other computer vision systems. A real time traffic signal detection system is implemented to present an efficient hardware implementation of the traditional grass-fire blob detection. The traditional grass-fire blob detection method iterates the input image multiple times to calculate connected blobs. In digital circuits, five extra on-chip block memories are utilized to save intermediate results. By using additional memories, all connected blob information could be obtained through one-pass image traverse. The proposed hardware friendly blob detection can run at 72.4 frames per second with 1080P video input. Applying HoG + SVM as feature extractor and classifier, 92.11% recall rate and 99.29% precision rate are obtained on red lights, and 94.44% recall rate and 98.27% precision rate on green lights. Nowadays, convolutional neural network (CNN) is revolutionizing computer vision due to learnable layer by layer feature extraction. However, when coming into inference, CNNs are usually slow to train and slow to execute. In this dissertation, we studied the implementation of principal component analysis based network (PCANet), which strikes a balance between algorithm robustness and computational complexity. Compared to a regular CNN, the PCANet only needs one iteration training, and typically at most has a few tens convolutions on a single layer. Compared to hand-crafted features extraction methods, the PCANet algorithm well reflects the variance in the training dataset and can better adapt to difficult conditions. The PCANet algorithm achieves accuracy rates of 96.8% and 93.1% on road marking detection and traffic light detection, respectively. Implementing in Synopsys 32nm process technology, the proposed chip can classify 724,743 32-by-32 image candidates in one second, with only 0.5 watt power consumption. In this dissertation, binary neural network (BNN) is adopted as a potential detector for intelligent vehicles. The BNN constrains all activations and weights to be +1 or -1. Compared to a CNN with the same network configuration, the BNN achieves 50 times better resource usage with only 1% - 2% accuracy loss. Taking car detection and pedestrian detection as examples, the BNN achieves an average accuracy rate of over 95%. Furthermore, a BNN accelerator implemented in Synopsys 32nm process technology is presented in our work. The elastic architecture of the BNN accelerator makes it able to process any number of convolutional layers with high throughput. The BNN accelerator only consumes 0.6 watt and doesn\u27t rely on external memory for storage

    Computer Vision System-On-Chip Designs for Intelligent Vehicles

    Get PDF
    Intelligent vehicle technologies are growing rapidly that can enhance road safety, improve transport efficiency, and aid driver operations through sensors and intelligence. Advanced driver assistance system (ADAS) is a common platform of intelligent vehicle technologies. Many sensors like LiDAR, radar, cameras have been deployed on intelligent vehicles. Among these sensors, optical cameras are most widely used due to their low costs and easy installation. However, most computer vision algorithms are complicated and computationally slow, making them difficult to be deployed on power constraint systems. This dissertation investigates several mainstream ADAS applications, and proposes corresponding efficient digital circuits implementations for these applications. This dissertation presents three ways of software / hardware algorithm division for three ADAS applications: lane detection, traffic sign classification, and traffic light detection. Using FPGA to offload critical parts of the algorithm, the entire computer vision system is able to run in real time while maintaining a low power consumption and a high detection rate. Catching up with the advent of deep learning in the field of computer vision, we also present two deep learning based hardware implementations on application specific integrated circuits (ASIC) to achieve even lower power consumption and higher accuracy. The real time lane detection system is implemented on Xilinx Zynq platform, which has a dual core ARM processor and FPGA fabric. The Xilinx Zynq platform integrates the software programmability of an ARM processor with the hardware programmability of an FPGA. For the lane detection task, the FPGA handles the majority of the task: region-of-interest extraction, edge detection, image binarization, and hough transform. After then, the ARM processor takes in hough transform results and highlights lanes using the hough peaks algorithm. The entire system is able to process 1080P video stream at a constant speed of 69.4 frames per second, realizing real time capability. An efficient system-on-chip (SOC) design which classifies up to 48 traffic signs in real time is presented in this dissertation. The traditional histogram of oriented gradients (HoG) and support vector machine (SVM) are proven to be very effective on traffic sign classification with an average accuracy rate of 93.77%. For traffic sign classification, the biggest challenge comes from the low execution efficiency of the HoG on embedded processors. By dividing the HoG algorithm into three fully pipelined stages, as well as leveraging extra on-chip memory to store intermediate results, we successfully achieved a throughput of 115.7 frames per second at 1080P resolution. The proposed generic HoG hardware implementation could also be used as an individual IP core by other computer vision systems. A real time traffic signal detection system is implemented to present an efficient hardware implementation of the traditional grass-fire blob detection. The traditional grass-fire blob detection method iterates the input image multiple times to calculate connected blobs. In digital circuits, five extra on-chip block memories are utilized to save intermediate results. By using additional memories, all connected blob information could be obtained through one-pass image traverse. The proposed hardware friendly blob detection can run at 72.4 frames per second with 1080P video input. Applying HoG + SVM as feature extractor and classifier, 92.11% recall rate and 99.29% precision rate are obtained on red lights, and 94.44% recall rate and 98.27% precision rate on green lights. Nowadays, convolutional neural network (CNN) is revolutionizing computer vision due to learnable layer by layer feature extraction. However, when coming into inference, CNNs are usually slow to train and slow to execute. In this dissertation, we studied the implementation of principal component analysis based network (PCANet), which strikes a balance between algorithm robustness and computational complexity. Compared to a regular CNN, the PCANet only needs one iteration training, and typically at most has a few tens convolutions on a single layer. Compared to hand-crafted features extraction methods, the PCANet algorithm well reflects the variance in the training dataset and can better adapt to difficult conditions. The PCANet algorithm achieves accuracy rates of 96.8% and 93.1% on road marking detection and traffic light detection, respectively. Implementing in Synopsys 32nm process technology, the proposed chip can classify 724,743 32-by-32 image candidates in one second, with only 0.5 watt power consumption. In this dissertation, binary neural network (BNN) is adopted as a potential detector for intelligent vehicles. The BNN constrains all activations and weights to be +1 or -1. Compared to a CNN with the same network configuration, the BNN achieves 50 times better resource usage with only 1% - 2% accuracy loss. Taking car detection and pedestrian detection as examples, the BNN achieves an average accuracy rate of over 95%. Furthermore, a BNN accelerator implemented in Synopsys 32nm process technology is presented in our work. The elastic architecture of the BNN accelerator makes it able to process any number of convolutional layers with high throughput. The BNN accelerator only consumes 0.6 watt and doesn\u27t rely on external memory for storage

    Detection, Quantification and Classification of Ripened Tomatoes: A Comparative Analysis of Image Processing and Machine Learning

    Get PDF
    In this paper, specifically for detection of ripe/unripe tomatoes with/without defects in the crop field, two distinct methods are described and compared. One is a machine learning approach, known as ‘Cascaded Object Detector’ and the other is a composition of traditional customized methods, individually known as ‘Colour Transformation’, ‘Colour Segmentation’ and ‘Circular Hough Transformation’. The (Viola Jones) Cascaded Object Detector generates ‘histogram of oriented gradient’ (HOG) features to detect tomatoes. For ripeness checking, the RGB mean is calculated with a set of rules. However, for traditional methods, color thresholding is applied to detect tomatoes either from a natural or solid background and RGB colour is adjusted to identify ripened tomatoes. In this work, Colour Segmentation is applied in the detection of tomatoes with defects, which has not previously been applied under machine learning techniques. The function modules of this algorithm are fed formatted images, captured by a camera mounted on a mobile robot. This robot was designed, built and operated in a tomato field to identify and quantify both green and ripened tomatoes as well as to detect damaged/blemished ones. This algorithm is shown to be optimally feasible for any micro-controller based miniature electronic devices in terms of its run time complexity of O(n3) for traditional method in best and average cases. Comparisons show that the accuracy of the machine learning method is 95%, better than that of the Colour Segmentation Method using MATLAB. This result is potentially significant for farmers in crop fields to identify the condition of tomatoes quickly

    Lane Detection System based on Hough Transform with Retinex Algorithm

    Get PDF
    Nowadays, automotive system becomes a great innovation in the world and lane detection system is important to control automobile vehicles. This paper has developed an efficient lane detection system to deal with different types of lighting conditions. Six types of edge detection techniques: canny, sobel, prewitt, Roberts, Laplacian of Gaussian (LOG) and zero-cross methods are analyzed. Line detection based on canny operator is developed. Moreover, Retinex algorithm is employed to normalize input images for all types of illumination. And Hough Transform with Retinex algorithm is developed to solve lighting problem. The proposed method is compared to Hough Transform with Otsu’s threshold method. The experimental results show that the proposed method can reduce computation time and improve accuracy for lane detection system

    Lane Departure and Front Collision Warning System Using Monocular and Stereo Vision

    Get PDF
    Driving Assistance Systems such as lane departure and front collision warning has caught great attention for its promising usage on road driving. This, this research focus on implementing lane departure and front collision warning at same time. In order to make the system really useful for real situation, it is critical that the whole process could be near real-time. Thus we chose Hough Transform as the main algorithm for detecting lane on the road. Hough Transform is used for that it is a very fast and robust algorithm, which makes it possible to execute as many frames as possible per frames. Hough Transform is used to get boundary information, so that we could decide if the car is doing lane departure based on the car\u27s position in lane. Later, we move on to use front car\u27s symmetry character to do front car detection, and combine it with Camshift tracking algorithm to fill the gap for failure of detection. Later we introduce camera calibration, stereo calibration, and how to calculate real distance from depth map

    Automatic detection of weld defects based on hough transform

    Get PDF
    Weld defect detection is an important application in the field of Non-Destructive Testing (NDT). These defects are mainly due to manufacturing errors or welding processes. In this context, image processing especially segmentation is proposed to detect and localize efficiently different types of defects. It is a challenging task since radiographic images have deficient contrast, poor quality and uneven illumination caused by the inspection techniques. The usual segmentation technique uses a region of interest ROI from the original image. In this article, a robust and automatic method is presented to detect linear defect from the original image without selection of ROI based on canny detector and a modified `Hough Transform' technique. This task can be subdivided into the following steps: firstly, preprocessing step with Gaussian filter and contrast stretching; secondly, segmentation technique is used to isolate weld region from background and non-weld using Adaptative Thresholding and to extract edges; thirdly, detection, location of linear defect and limiting the welding area by Hough Transform. The experimental results show that our proposed method gives good performance for industrial radiographic images

    Pedestrian lane detection in unstructured scenes for assistive navigation

    Get PDF
    Automatic detection of the pedestrian lane in a scene is an important task in assistive and autonomous navigation. This paper presents a vision-based algorithm for pedestrian lane detection in unstructured scenes, where lanes vary significantly in color, texture, and shape and are not indicated by any painted markers. In the proposed method, a lane appearance model is constructed adaptively from a sample image region, which is identified automatically from the image vanishing point. This paper also introduces a fast and robust vanishing point estimation method based on the color tensor and dominant orientations of color edge pixels. The proposed pedestrian lane detection method is evaluated on a new benchmark dataset that contains images from various indoor and outdoor scenes with different types of unmarked lanes. Experimental results are presented which demonstrate its efficiency and robustness in comparison with several existing methods

    Assessment of Driver\u27s Attention to Traffic Signs through Analysis of Gaze and Driving Sequences

    Get PDF
    A driver’s behavior is one of the most significant factors in Advance Driver Assistance Systems. One area that has received little study is just how observant drivers are in seeing and recognizing traffic signs. In this contribution, we present a system considering the location where a driver is looking (points of gaze) as a factor to determine that whether the driver has seen a sign. Our system detects and classifies traffic signs inside the driver’s attentional visual field to identify whether the driver has seen the traffic signs or not. Based on the results obtained from this stage which provides quantitative information, our system is able to determine how observant of traffic signs that drivers are. We take advantage of the combination of Maximally Stable Extremal Regions algorithm and Color information in addition to a binary linear Support Vector Machine classifier and Histogram of Oriented Gradients as features detector for detection. In classification stage, we use a multi class Support Vector Machine for classifier also Histogram of Oriented Gradients for features. In addition to the detection and recognition of traffic signs, our system is capable of determining if the sign is inside the attentional visual field of the drivers. It means the driver has kept his gaze on traffic signs and sees the sign, while if the sign is not inside this area, the driver did not look at the sign and sign has been missed
    • …
    corecore