200 research outputs found

    Vehicle detection using background subtraction and clustering algorithms

    Traffic congestion has risen worldwide as a result of growing motorization, urbanization, and population. Congestion reduces the efficiency of transportation infrastructure and increases travel time, air pollution, and fuel consumption. Intelligent Transportation Systems (ITS) address this problem by applying information technology and communications networks. One classical ITS option is video camera technology. In particular, video systems have been used to collect traffic data, including vehicle detection and analysis. However, this application is still limited when it has to deal with complex traffic and environmental conditions. This research therefore proposes and compares the OTSU, FCM, and K-means methods in video image processing. OTSU is a classical image segmentation algorithm that clusters pixels into foreground and background. However, only the FCM (Fuzzy C-Means) and K-means algorithms have been successfully applied to cluster pixels without supervision. These methods therefore appear to have more potential to generate MSE values that define a clearer threshold for background subtraction on a moving object under varying environmental conditions. The methods are compared on their MSE and PSNR values. K-means gives the best MSE result, and FCM gives a good PSNR. Thus, applying clustering algorithms to the detection of moving objects under various conditions is more promising.
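
The abstract above compares methods by MSE and PSNR but does not say how the two metrics were computed. The sketch below uses the conventional definitions for 8-bit grayscale images. The function names, the list-of-rows image layout, and the `max_value` default are assumptions made for illustration and are not taken from the paper.

```python
import math


def mse(reference, candidate):
    """Mean squared error between two equally sized grayscale images.

    Images are plain lists of rows of pixel intensities (an assumed layout).
    """
    total = 0.0
    count = 0
    for ref_row, cand_row in zip(reference, candidate):
        for ref_px, cand_px in zip(ref_row, cand_row):
            total += (ref_px - cand_px) ** 2
            count += 1
    return total / count


def psnr(reference, candidate, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    err = mse(reference, candidate)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)
```

Under these definitions a lower MSE and a higher PSNR both indicate a closer match to the reference image, which is presumably why the abstract reports a "best" MSE and a "good" PSNR separately.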

    Multi Cost Function Fuzzy Stereo Matching Algorithm for Object Detection and Robot Motion Control

    Stereo matching algorithms work with multiple images of a scene, taken from two viewpoints, to generate depth information. Authors usually use a single matching function to measure similarity between corresponding regions in the images. In the present research, the authors consider a combination of multiple data costs for disparity generation. Disparity maps generated from stereo images tend to have noisy sections. The presented work describes a methodology for refining such disparity maps so that they can be further processed to detect obstacle regions. A novel entropy-based selective refinement (ESR) technique is proposed to refine the initial disparity map. Information from both the left and right disparity maps is used for this refinement. For every disparity map, block-wise entropy is calculated. The average entropy values at corresponding positions in the disparity maps are compared. If the difference between these entropy values exceeds a threshold, the corresponding disparity value is replaced with the mean disparity of the block with lower entropy. The results of this refinement were compared with similar methods and observed to be better. Furthermore, the v-disparity values are used to highlight the road surface in the disparity map. The regions belonging to the sky are removed through HSV-based segmentation. The remaining regions, which are the ROIs, are refined through a u-disparity area-based technique. Based on this, the closest obstacles are detected using k-means segmentation. The segmented regions are further refined through a technique based on u-disparity image information and used as masks to highlight obstacle regions in the disparity maps.
This information is used in conjunction with a Kalman filter based path planning algorithm to guide a mobile robot from a source location to a destination location while avoiding any obstacle detected in its path. A stereo camera setup was built, and the performance of the algorithm was observed on local real-life images captured through the cameras. The proposed methodologies were evaluated using real-life outdoor images from the KITTI dataset and images with radiometric variations from the Middlebury stereo dataset.
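
The entropy-based selective refinement step described above can be sketched roughly as follows. This is one possible reading of the abstract, not the authors' implementation. The block size, the threshold value, the use of Shannon entropy over raw disparity values, and the choice to overwrite the whole higher-entropy block are all assumptions, and `block_entropy` and `refine_pair` are hypothetical names.

```python
import math
from collections import Counter


def block_entropy(block_values):
    """Shannon entropy in bits of the disparity values inside one block."""
    counts = Counter(block_values)
    n = len(block_values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def refine_pair(left, right, block=2, threshold=0.5):
    """Return refined copies of a left and a right disparity map.

    Maps are lists of rows. Where the entropies of corresponding blocks
    differ by more than `threshold`, the noisier (higher-entropy) block is
    overwritten with the mean disparity of the lower-entropy block.
    """
    out_left = [row[:] for row in left]
    out_right = [row[:] for row in right]
    height, width = len(left), len(left[0])
    for top in range(0, height, block):
        for col in range(0, width, block):
            cells = [(r, c)
                     for r in range(top, min(top + block, height))
                     for c in range(col, min(col + block, width))]
            l_vals = [left[r][c] for r, c in cells]
            r_vals = [right[r][c] for r, c in cells]
            h_left = block_entropy(l_vals)
            h_right = block_entropy(r_vals)
            if abs(h_left - h_right) <= threshold:
                continue  # blocks agree well enough; leave both untouched
            if h_left < h_right:
                fill, target = sum(l_vals) / len(l_vals), out_right
            else:
                fill, target = sum(r_vals) / len(r_vals), out_left
            for r, c in cells:
                target[r][c] = fill
    return out_left, out_right
```

For example, a uniform left block paired with a noisy right block leaves the left map unchanged and flattens the right block to the left block's mean disparity.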

    Deconvolutional networks for point-cloud vehicle detection and tracking in driving scenarios

    © 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. Vehicle detection and tracking is a core ingredient for developing autonomous driving applications in urban scenarios. Recent image-based Deep Learning (DL) techniques are obtaining breakthrough results in these perception tasks. However, DL research has not yet advanced much towards processing 3D point clouds from lidar range-finders. These sensors are very common in autonomous vehicles because, although they do not provide information as semantically rich as images, they are more robust than vision sensors under harsh weather conditions. In this paper we present a full vehicle detection and tracking system that works with 3D lidar information only. Our detection step uses a Convolutional Neural Network (CNN) that receives as input a featured representation of the 3D information provided by a Velodyne HDL-64 sensor and returns a per-point classification of whether each point belongs to a vehicle. The classified point cloud is then geometrically processed to generate observations for a multi-object tracking system implemented via a number of Multi-Hypothesis Extended Kalman Filters (MH-EKF) that estimate the position and velocity of the surrounding vehicles. The system is thoroughly evaluated on the KITTI tracking dataset, and we show the performance boost provided by our CNN-based vehicle detector over a standard geometric approach. Our lidar-based approach uses about 4% of the data needed for an image-based detector, with similarly competitive results. Peer Reviewed. Postprint (author's final draft).
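
The tracker in the abstract above uses Multi-Hypothesis Extended Kalman Filters, whose details the abstract does not give. As a much simpler stand-in, the sketch below shows a plain linear Kalman filter with a one-dimensional constant-velocity model that estimates position and velocity from position-only observations. The class name, the noise values, and the initial covariance are assumptions chosen for illustration.

```python
class ConstantVelocityKF:
    """Linear Kalman filter for state [position, velocity] in one dimension."""

    def __init__(self, dt=1.0, process_var=0.01, meas_var=1.0):
        self.dt = dt
        self.q = process_var   # assumed process noise added to each state variance
        self.r = meas_var      # assumed measurement noise variance
        self.p = 0.0           # position estimate
        self.v = 0.0           # velocity estimate
        # Large initial covariance: the starting state is essentially unknown.
        self.P = [[100.0, 0.0], [0.0, 100.0]]

    def predict(self):
        dt = self.dt
        (a, b), (c, d) = self.P
        self.p += self.v * dt
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]]
        self.P = [[a + dt * c + (b + dt * d) * dt + self.q, b + dt * d],
                  [c + d * dt, d + self.q]]

    def update(self, z):
        (a, b), (c, d) = self.P
        innovation = z - self.p          # measurement observes position only
        s = a + self.r                   # innovation variance
        k_pos, k_vel = a / s, c / s      # Kalman gain
        self.p += k_pos * innovation
        self.v += k_vel * innovation
        # P <- (I - K H) P with H = [1, 0]
        self.P = [[(1 - k_pos) * a, (1 - k_pos) * b],
                  [c - k_vel * a, d - k_vel * b]]


def track(measurements, dt=1.0):
    """Run predict/update over a sequence of position measurements."""
    kf = ConstantVelocityKF(dt=dt)
    for z in measurements:
        kf.predict()
        kf.update(z)
    return kf.p, kf.v
```

Feeding the filter positions from an object moving at a steady 2 units per step should yield a velocity estimate close to 2 after a few dozen observations.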

    Object Tracking in Distributed Video Networks Using Multi-Dimentional Signatures

    From being an expensive toy in the hands of governmental agencies, computers have come a long way from the huge vacuum tube-based machines to today's small personal computers, which are more than a thousand times more powerful. Computers have long been investigated as the foundation for an artificial vision system. The computer vision discipline has developed rapidly over the past few decades, from rudimentary motion detection systems to complex model-based algorithms for analyzing object motion. Our work is one such improvement over previous algorithms developed for object motion analysis in video feeds. It is based on the principle of multi-dimensional object signatures. Object signatures are constructed from individual attributes extracted through video processing. While past work has proceeded along similar lines, the lack of a comprehensive object definition model severely restricts the application of such algorithms to controlled situations. In conditions with varying external factors, such algorithms perform less efficiently because they inherently assume that attribute values stay constant. Our approach assumes a variable environment in which the recorded attribute values of an object are considered prone to variability. Variations in the accuracy of object attribute values have been addressed by incorporating a weight for each attribute that varies according to local conditions at a sensor location. This ensures that attribute values with higher accuracy are given more credibility in the object matching process. Variations in attribute values (such as the surface color of the object) were also addressed by applying error corrections such as eliminating shadows from the detected object profile. Experiments were conducted to verify our hypothesis. The results established the validity of our approach, as the multi-dimensional approach achieved higher matching accuracy than a comparison based on a single attribute.

    Técnicas de inteligencia artificial aplicadas a sistemas de detección y clasificación de señales de tráfico [Artificial intelligence techniques applied to traffic sign detection and classification systems]

    This thesis, presented as a collection of research articles, studies and analyzes solutions for traffic sign detection and classification systems, which pose a challenge in current applications such as driver safety and roadside assistance, autonomous cars, maintenance of vertical signage, and traffic scene analysis. Traffic signs are a fundamental asset of the road network because they are meant to be easily perceived by pedestrians and drivers, warning and guiding them both by day and by night. Because signs are designed to be unique and to have distinguishable features, such as simple shapes and uniform colors, their detection and recognition is a bounded problem. However, developing a real-time sign recognition system still presents challenges because of response times, which are crucial for making decisions in the environment, and because of the variability of traffic scene images, which can include different scales, difficult viewpoints, occlusions, and varying lighting conditions. Any traffic sign detection and classification system must cope with these challenges. This work presents a traffic sign classification system based on deep learning. Specifically, the main components of the proposed deep neural network are convolutional layers and spatial transformer networks. The network is fed with RGB images of traffic signs from different countries, such as Germany, Belgium, and Spain.
For the German signs, which belong to the German Traffic Sign Recognition Benchmark (GTSRB) dataset, the proposed network architecture and optimization parameters achieve 99.71% accuracy, outperforming both the human visual system and all previous state-of-the-art results, while also being more efficient in terms of memory requirements. At the time of writing this thesis, our method holds first place in the worldwide ranking. Regarding the traffic sign detection problem, several state-of-the-art object detection systems are analyzed, each specifically modified and adapted to the problem domain in order to apply transfer learning in neural networks. Multiple performance parameters are also studied for each detection model, to show the reader which sign detector would be best given the constraints of the environment where the solution will be deployed, such as precision, memory consumption, or execution speed. Our study shows that the Faster R-CNN Inception Resnet V2 model achieves the best precision (95.77% mAP), while R-FCN Resnet 101 strikes the best balance between execution time (85.45 ms per image) and precision (95.15% mAP).

    Pedestrian detection and tracking using stereo vision techniques

    Automated pedestrian detection, counting and tracking has received significant attention from the computer vision community of late. Many of the person detection techniques described so far in the literature work well in controlled environments, such as laboratory settings with a small number of people. This allows various assumptions to be made that simplify this complex problem. The performance of these techniques, however, tends to deteriorate in unconstrained environments where pedestrian appearances, numbers, orientations, movements, occlusions and lighting conditions violate these convenient assumptions. Recently, 3D stereo information has been proposed as a way to overcome some of these issues and to guide pedestrian detection. This thesis presents such an approach, whereby after obtaining robust 3D information via a novel disparity estimation technique, pedestrian detection is performed via a 3D point clustering process within a region-growing framework. This clustering process avoids hard thresholds by using biometrically inspired constraints and a number of plan view statistics. The pedestrian detection technique requires no external training and robustly handles challenging real-world unconstrained environments from various camera positions and orientations. In addition, this thesis presents a continuous detect-and-track approach, with additional kinematic constraints and explicit occlusion analysis, to obtain robust tracking of pedestrians over time. These approaches are experimentally validated using challenging datasets consisting of both synthetic data and real-world sequences gathered from a number of environments. In each case, the techniques are evaluated using both 2D and 3D ground-truth methodologies.

    Real-time vehicle detection using low-cost sensors

    Improving road safety and reducing the number of accidents is one of the top priorities for the automotive industry. As human driving behaviour is one of the top causation factors of road accidents, research is working towards removing control from the human driver by automating functions and finally introducing a fully Autonomous Vehicle (AV). A Collision Avoidance System (CAS) is one of the key safety systems for an AV, as it ensures that all potential threats ahead of the vehicle are identified and appropriate action is taken. This research focuses on the task of vehicle detection, which is the base of a CAS, and attempts to produce an effective vehicle detector based on data from a low-cost monocular camera. Developing a robust CAS based on low-cost sensors is crucial to bringing down the cost of safety systems and thereby increasing their adoption rate among end users. In this work, detectors are developed based on the two main approaches to vehicle detection using a monocular camera. The first is the traditional image processing approach, where visual cues are used to generate potential vehicle locations and, in a second stage, to verify the existence of vehicles in an image. The second approach is based on a Convolutional Neural Network (CNN), a computationally expensive method that unifies the detection process in a single pipeline. The goal is to determine which method is more appropriate for real-time applications. Following the first approach, a vehicle detector based on the combination of HOG features and SVM classification is developed. The detector attempts to optimise detection and run-time performance by modifying the detection pipeline. For the CNN-based approach, six different network models are developed and trained end to end using collected data, each with a different network structure and parameters, in an attempt to determine which combination produces the best results.
The evaluation of the different vehicle detectors produced some interesting findings: the first approach did not manage to produce a working detector, while the CNN-based approach produced a high-performing vehicle detector with an 85.87% average precision and a very low miss rate. The detector performed well under different operational environments (motorway, urban and rural roads), and the results were validated using an external dataset. Additional testing of the vehicle detector indicated that it is suitable as a base for safety applications such as CAS, with a run-time performance of 12 FPS and potential for further improvements.

    Vehicle Lane Departure Prediction Based On Support Vector Machines

    Advanced driver assistance systems, such as unintentional lane departure warning systems, have recently drawn much attention and R&D effort. Such a system assists the driver by monitoring driver or vehicle behavior to predict or detect driving situations (e.g., lane departure) and alert the driver to take corrective action. In this dissertation, we explored using the nonlinear binary support vector machine (SVM) technique and the time series of vehicle variables to predict unintentional lane departure, which is innovative because no machine learning technique has previously been attempted for this purpose in the literature. Furthermore, we developed a two-stage training scheme to improve the SVM's prediction performance. Our SVMs were trained and tested using experiment data generated by VIRTTEX, a hydraulically powered 6-degrees-of-freedom moving-base driving simulator at Ford Motor Company. The data represented 16 drowsy drivers (about three hours of driving per subject) and six control drivers (approximately 20 minutes of driving per subject), all of whom drove a simulated 2000 Volvo S80. More than 100 vehicle variables were sampled at 50 Hz. There were a total of 3,508 unintentional lane departure occurrences for the 16 drowsy drivers and 23 for four of the six control drivers (two had none). We optimized the performance of the SVMs by experimentally finding their best kernel functions and parameter values, as well as the most appropriate vehicle variables to use as inputs.
Our experimental results, involving the 22 drivers and a total of over 6.84 million prediction decisions, demonstrate that: (1) the two-stage training scheme significantly outperformed the commonly used (one-stage) training scheme, (2) excellent SVM performance, as measured by the numbers of false positives and false negatives, was achieved when the prediction horizon was set at 0.6 s or shorter, (3) lateral position and lateral velocity served as the best input variables among the nine variable sets that we explored, and (4) the radial basis function was the best kernel function (the other two kernel functions that we tested were the linear function and the second-order polynomial). We conclude that the two-stage-training SVM approach deserves further exploration because, to the best of our knowledge, it has demonstrated the best unintentional lane departure prediction performance relative to the literature.
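
To make the prediction horizon concrete: at the 50 Hz sampling rate stated above, a 0.6 s horizon corresponds to 30 samples. The sketch below shows one hypothetical way to pair the two best input variables with a label taken one horizon ahead. The abstract does not describe how the dissertation built its training examples, so the labelling rule, the function names, and the per-sample `departed` flags are assumptions.

```python
def horizon_in_samples(horizon_s, sample_rate_hz=50):
    """Convert a prediction horizon in seconds to a whole number of samples."""
    return round(horizon_s * sample_rate_hz)


def make_examples(lateral_position, lateral_velocity, departed,
                  horizon_s, sample_rate_hz=50):
    """Pair features at time t with the departure flag one horizon later.

    `departed` is an assumed per-sample boolean series marking whether the
    vehicle is outside its lane at that sample.
    """
    h = horizon_in_samples(horizon_s, sample_rate_hz)
    examples = []
    for t in range(len(departed) - h):
        features = (lateral_position[t], lateral_velocity[t])
        label = 1 if departed[t + h] else 0
        examples.append((features, label))
    return examples
```

Examples built this way could then be passed to any binary classifier, such as an SVM with a radial basis function kernel as in the abstract.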

    Auralisation of Traffic Flow using Procedural Audio Methods

    This thesis investigates approaches for the auralisation of traffic noise in an outdoor environment. A novel auralisation framework for multiple vehicle pass-bys using procedural audio methods is proposed. This includes sound source modelling of single vehicle pass-bys and traffic flow, sound propagation modelling, and HRTF processing for spatial audio reproduction. Compared to prior auralisation studies in which sound source recordings have been used, no pre-recorded sounds are used with a procedural audio approach. Instead, synthetic sounds created by programmatic rules form the basis of the auralisation framework proposed in this thesis. Such an auralisation based on procedural audio gives greater freedom and range in the implementation and integration of vehicle pass-by sounds, with the advantage of high flexibility and variable computational cost for the algorithms defining the properties of any given audio objects. However, such synthetic sounds might not be perceived as being plausible when compared to their recorded counterparts, especially for the case of traffic noise where it is difficult to imitate the intrinsic rich and varied sound source content by artificial means. Therefore, two subjective listening tests are implemented to evaluate the plausibility of the proposed auralisation framework by comparing procedurally generated vehicle sounds to their counterparts created using a recording-based granular synthesis method. Engine sounds, engine plus tyre sounds, and single vehicle pass-by sounds, all generated using a procedural audio approach, are compared with their counterparts created using a granular synthesis method, and evaluated in an ABX listening test. It is found that a similar level of plausibility is achieved by using either method for the auralisation of single vehicle pass-bys. 
Based on this validation, the plausibility of multiple vehicle pass-by sounds with engines synthesised using a procedural, a mix of procedural and granular, and granular approaches is evaluated in a MUSHRA test under various traffic flow conditions regarding different vehicle types, speeds, driving directions, and flow rates. It is found that a similar level of plausibility is achieved by using either method under most traffic flow conditions. These results verify that the auralisation of traffic flow using procedural audio methods is comparable to recording-based approaches when considering the plausibility of the results obtained. Such an approach provides a solution for implementing the auralisation of environmental sounds that is both flexible and plausible, which is useful for communicating and demonstrating the important changes in our soundscape to the wider population, leading to a more holistic understanding of environmental sound