
    Real-time Detection of Vehicles Using the Haar-like Features and Artificial Neuron Networks

    In this paper, a vehicle detection system is presented. The system is based on two algorithms: a Haar-like image descriptor and an artificial neural network classifier. To speed up the feature extraction performed by the descriptor, the image is represented as an integral image. The system is trained on a set of positive images (vehicles) and negative images (non-vehicles), and tested on a separate set of scenes (positive or negative). The performance of the proposed system is assessed by varying one of the determining parameters, the number of neurons in the hidden layer; the results obtained show that the proposed system is a fast and robust vehicle detector.
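    Since the speed claim rests on the integral-image trick for evaluating Haar-like features, a minimal sketch may help; the function names, the two-rectangle feature, and the 24x24 window size are illustrative assumptions, not details from the paper.

    ```python
    import numpy as np

    def integral_image(img):
        """Summed-area table with a zero border: ii[y, x] = sum of img[:y, :x]."""
        ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
        ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
        return ii

    def box_sum(ii, y0, x0, y1, x1):
        """Sum over img[y0:y1, x0:x1] in four lookups, independent of box size."""
        return ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]

    def haar_two_rect_vertical(ii, y, x, h, w):
        """A two-rectangle Haar-like feature: left half minus right half."""
        half = w // 2
        return (box_sum(ii, y, x, y + h, x + half)
                - box_sum(ii, y, x + half, y + h, x + w))

    img = np.random.randint(0, 256, (24, 24))   # stand-in for a detection window
    ii = integral_image(img)
    print(haar_two_rect_vertical(ii, 0, 0, 24, 24))
    ```

    Because every rectangle sum costs four lookups regardless of its size, thousands of such features can be evaluated per window in real time, which is what the abstract relies on.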

    Vehicle Detection for Nighttime Using Monocular IR Camera with Discriminately Trained Mixture of Deformable Part Models

    Vehicle detection at nighttime is a challenging problem due to low visibility and light distortion caused by motion and illumination in urban environments. This paper presents a method based on a deformable object model for detecting and classifying vehicles using a monocular infrared camera. In the proposed method, features of vehicles are learned as a deformable object model through the combination of a latent support vector machine (LSVM) and histograms of oriented gradients (HOG). The proposed detection algorithm is flexible enough to detect various types and orientations of vehicles, as it effectively integrates both global and local information about vehicle textures and shapes. Experimental results prove the effectiveness of the algorithm for detecting close- and medium-range vehicles in urban scenes at nighttime.
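    In this family of models (Felzenszwalb-style DPM trained with an LSVM), the score of a hypothesis combines a root filter response with part responses, each penalized by a quadratic deformation cost around its anchor. A minimal sketch of that part term follows; the displacement radius and deformation weights are illustrative assumptions, not values from the paper.

    ```python
    import numpy as np

    def part_term(score_map, anchor, defo_w, radius=4):
        """Max over displacements (dy, dx) of part score minus quadratic deformation cost."""
        ay, ax = anchor
        H, W = score_map.shape
        best = -np.inf
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                y, x = ay + dy, ax + dx
                if 0 <= y < H and 0 <= x < W:
                    cost = (defo_w[0] * dx * dx + defo_w[1] * dx
                            + defo_w[2] * dy * dy + defo_w[3] * dy)
                    best = max(best, score_map[y, x] - cost)
        return best

    # Hypothesis score = root filter response + sum of part terms (plus a bias).
    scores = np.random.randn(32, 32)   # stand-in HOG filter response map
    print(part_term(scores, anchor=(16, 16), defo_w=(0.05, 0.0, 0.05, 0.0)))
    ```

    In practice this max over displacements is computed for all anchors at once with a generalized distance transform rather than the brute-force loop above.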

    Parking lot monitoring system using an autonomous quadrotor UAV

    The main goal of this thesis is to develop a drone-based parking lot monitoring system using low-cost hardware and open-source software. Similar to wall-mounted surveillance cameras, a drone-based system can monitor parking lots without affecting the flow of traffic, while also offering the mobility of patrol vehicles. The Parrot AR Drone 2.0 is the quadrotor used in this work due to its modularity and cost efficiency. Video and navigation data (including GPS) are communicated to a host computer over a Wi-Fi connection. The host computer analyzes the navigation data using a custom flight control loop to determine the control commands to be sent to the drone. A new license plate recognition pipeline identifies the license plates of vehicles in the video received from the drone.
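    As a rough illustration of what such a host-side flight control loop can look like, here is a minimal proportional-control sketch; the gains, waypoint format, tick rate, and simulated loop are assumptions for illustration and do not reflect the thesis code or the AR.Drone API.

    ```python
    KP = 0.4        # proportional gain (illustrative tuning)
    MAX_CMD = 0.5   # clamp on normalized velocity commands

    def clamp(v, lo, hi):
        return max(lo, min(hi, v))

    def control_step(position, waypoint):
        """One P-control step: turn the position error into velocity commands."""
        err_north = waypoint[0] - position[0]
        err_east = waypoint[1] - position[1]
        return (clamp(KP * err_north, -MAX_CMD, MAX_CMD),
                clamp(KP * err_east, -MAX_CMD, MAX_CMD))

    # Simulated loop; the real system would read navdata over Wi-Fi each tick
    # and send the resulting commands back to the drone.
    pos, target = [0.0, 0.0], (5.0, 3.0)
    for _ in range(200):
        vn, ve = control_step(pos, target)
        pos[0] += vn * 0.1   # integrate over a 100 ms tick
        pos[1] += ve * 0.1
    print([round(p, 2) for p in pos])
    ```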

    Real-time vehicle detection using low-cost sensors

    Improving road safety and reducing the number of accidents is one of the top priorities for the automotive industry. As human driving behaviour is one of the main causation factors in road accidents, research is working towards removing control from the human driver by automating functions and, ultimately, introducing a fully Autonomous Vehicle (AV). A Collision Avoidance System (CAS) is one of the key safety systems of an AV, as it ensures all potential threats ahead of the vehicle are identified and appropriate action is taken. This research focuses on the task of vehicle detection, which is the basis of a CAS, and attempts to produce an effective vehicle detector from the data coming from a low-cost monocular camera. Developing a robust CAS based on low-cost sensors is crucial to bringing down the cost of safety systems and thereby increasing their adoption rate by end users.

    In this work, detectors are developed based on the two main approaches to vehicle detection with a monocular camera. The first is the traditional image processing approach, where visual cues are used to generate potential vehicle locations and, in a second stage, the existence of vehicles in an image is verified. The second approach is based on a Convolutional Neural Network (CNN), a computationally expensive method that unifies the detection process in a single pipeline. The goal is to determine which method is more appropriate for real-time applications. Following the first approach, a vehicle detector based on the combination of HOG features and SVM classification is developed; the detector attempts to optimise detection and run-time performance by modifying the detection pipeline. For the CNN-based approach, six different network models, each with a different structure and parameters, are developed and trained end to end on collected data, in an attempt to determine which combination produces the best results.

    The evaluation of the different vehicle detectors produced some interesting findings: the first approach did not manage to produce a working detector, while the CNN-based approach produced a high-performing vehicle detector with 85.87% average precision and a very low miss rate. The detector performed well under different operational environments (motorway, urban and rural roads), and the results were validated on an external dataset. Additional testing indicated the detector is suitable as a base for safety applications such as CAS, with a run-time performance of 12 FPS and potential for further improvements.
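    For the first (traditional) approach, a minimal sketch of a HOG + linear-SVM sliding-window stage may clarify the pipeline; the window size, stride, HOG parameters, and the toy training data are placeholder assumptions, not the thesis's settings.

    ```python
    import numpy as np
    from skimage.feature import hog
    from sklearn.svm import LinearSVC

    WIN, STEP = (64, 64), 16   # detection window and stride (illustrative)

    def window_features(patch):
        return hog(patch, orientations=9, pixels_per_cell=(8, 8),
                   cells_per_block=(2, 2))

    def detect(image, clf):
        """Score every window with the SVM; keep those on the positive side of the margin."""
        boxes = []
        for y in range(0, image.shape[0] - WIN[0] + 1, STEP):
            for x in range(0, image.shape[1] - WIN[1] + 1, STEP):
                feat = window_features(image[y:y + WIN[0], x:x + WIN[1]])
                if clf.decision_function([feat])[0] > 0:
                    boxes.append((x, y, WIN[1], WIN[0]))
        return boxes

    # Toy training data just to make the sketch self-contained and runnable.
    rng = np.random.default_rng(0)
    X = [window_features(rng.random(WIN)) for _ in range(20)]
    clf = LinearSVC(dual=True).fit(X, [0, 1] * 10)
    print(len(detect(rng.random((128, 128)), clf)))
    ```

    A real detector would add an image pyramid for scale invariance and non-maximum suppression over the returned boxes; the per-window HOG computation above is the main run-time cost the thesis tries to optimise.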

    Thai Finger-Spelling Recognition Using a Cascaded Classifier Based on Histogram of Orientation Gradient Features

    Hand posture recognition is an essential module in applications such as human-computer interaction (HCI), games, and sign language systems, in which performance and robustness are the primary requirements. In this paper, we propose an automatic classification method to recognize 21 hand postures that represent letters in Thai finger-spelling, based on the Histogram of Oriented Gradients (HOG) feature, which focuses on the information within certain regions of the image rather than on each single pixel, and the Adaptive Boosting (AdaBoost) learning technique, which selects the best weak classifiers and combines them into a strong classifier; several strong classifiers are then cascaded in the detection architecture. We collected images of the 21 static hand postures from 10 subjects for training and testing on Thai finger-spelling letters. The parameters of the training process were adjusted in three experiments, covering the false positive rate (FPR), the true positive rate (TPR), and the number of training stages (N), to obtain the most suitable training model for each hand posture. All cascaded classifiers are loaded into the system simultaneously to classify different hand postures, and a correlation coefficient is computed to distinguish hand postures that are similar (see the sketch below). The system achieves approximately 78% accuracy on average across all classifier experiments.
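    A Pearson correlation between feature vectors of the competing postures is one straightforward reading of the disambiguation step described above; a minimal sketch follows, where the stand-in HOG templates, their length, and the arbitration rule are illustrative assumptions.

    ```python
    import numpy as np

    def posture_correlation(feat_a, feat_b):
        """Pearson correlation between two feature vectors (e.g., HOG descriptors)."""
        return float(np.corrcoef(feat_a, feat_b)[0, 1])

    rng = np.random.default_rng(1)
    template_k = rng.random(324)                   # stand-in HOG template, posture "K"
    template_m = rng.random(324)                   # stand-in HOG template, posture "M"
    observed = template_k + 0.05 * rng.random(324) # noisy observation of posture "K"

    # When several cascades fire on the same region, assign the posture whose
    # template correlates best with the observed descriptor.
    scores = {"K": posture_correlation(observed, template_k),
              "M": posture_correlation(observed, template_m)}
    print(max(scores, key=scores.get), scores)
    ```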

    Pedestrian detection in far infrared images

    This paper presents an experimental study on pedestrian classification and detection in far infrared (FIR) images. The study includes an in-depth evaluation of several combinations of features and classifiers, including features previously used in daylight scenarios, a new descriptor specifically targeted at infrared images (HOPE, Histograms of Oriented Phase Energy), and a new adaptation of a latent-variable SVM approach to FIR images. The presented results are validated on a new classification and detection dataset of FIR images collected in outdoor environments from a moving vehicle. The classification set contains 16152 pedestrian and 65440 background samples, evenly selected from several sequences acquired at different temperatures and under different illumination conditions. The detection dataset consists of 15224 images with ground-truth information. The authors are making this dataset public for benchmarking new detectors in the area of intelligent vehicles and field robotics applications. This work was supported by the Spanish Government through the Cicyt projects FEDORA (grant TRA2010-20225-C03-01) and Driver Distraction Detector System (grant TRA2011-29454-C03-02), and by the Comunidad de Madrid through the project SEGVAUTO (S2009/DPI-1509).

    Compound Models for Vision-Based Pedestrian Recognition

    This thesis addresses the problem of recognizing pedestrians in video images acquired from a moving camera in real-world cluttered environments. Instead of focusing on the development of novel feature primitives or pattern classifiers, we follow an orthogonal direction and develop feature- and classifier-independent compound techniques which integrate complementary information from multiple image-based sources with the objective of improved pedestrian classification performance.

    After establishing a performance baseline in terms of a thorough experimental study on monocular pedestrian recognition, we investigate the use of multiple cues on module-level. A motion-based focus of attention stage is proposed based on a learned probabilistic pedestrian-specific model of motion features. The model is used to generate pedestrian localization hypotheses for subsequent shape- and texture-based classification modules.

    In the remainder of this work, we focus on the integration of complementary information directly into the pattern classification step. We present a combination of shape and texture information by means of pose-specific generative shape and texture models. The generative models are integrated with discriminative classification models by utilizing synthesized virtual pedestrian training samples from the former to enhance the classification performance of the latter. Both models are linked using Active Learning to guide the training process towards informative samples.

    A multi-level mixture-of-experts classification framework is proposed which involves local pose-specific expert classifiers operating on multiple image modalities and features. In terms of image modalities, we consider gray-level intensity, depth cues derived from dense stereo vision and motion cues arising from dense optical flow. We furthermore employ shape-based, gradient-based and texture-based features. The mixture-of-experts formulation compares favorably to joint space approaches, in view of performance and practical feasibility.

    Finally, we extend this mixture-of-experts framework in terms of multi-cue partial occlusion handling and the estimation of pedestrian body orientation. Our occlusion model involves examining occlusion boundaries which manifest in discontinuities in depth and motion space. Occlusion-dependent weights which relate to the visibility of certain body parts focus the decision on unoccluded body components. We further apply the pose-specific nature of our mixture-of-experts framework towards estimating the density of pedestrian body orientation from single images, again integrating shape and texture information.

    Throughout this work, particular emphasis is laid on thorough performance evaluation, both regarding methodology and competitive real-world datasets. Several datasets used in this thesis are made publicly available for benchmarking purposes. Our results indicate significant performance boosts over the state-of-the-art for all aspects considered in this thesis, i.e., pedestrian recognition, partial occlusion handling and body orientation estimation. The pedestrian recognition performance in particular is considerably advanced; false detections at constant detection rates are reduced by significantly more than an order of magnitude.
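    The occlusion-aware mixture-of-experts decision can be pictured as a visibility-weighted fusion of component expert scores. The sketch below is a minimal illustration under assumed component names and hand-set weights; in the thesis the visibility terms are derived from depth and motion discontinuities rather than given directly.

    ```python
    import numpy as np

    def moe_decision(expert_scores, visibility):
        """Fuse body-part expert scores, down-weighting occluded components."""
        w = np.asarray(visibility, dtype=float)
        w /= w.sum()
        return float(w @ np.asarray(expert_scores))

    # Head/torso/legs experts; the legs are largely occluded in this example,
    # so their (unreliable) score barely influences the final decision.
    scores = [0.9, 0.7, -0.2]      # per-component classifier outputs
    visibility = [1.0, 0.9, 0.1]   # e.g., estimated from depth/flow discontinuities
    print(moe_decision(scores, visibility))
    ```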

    Object Detection in Omnidirectional Images

    Nowadays, computer vision (CV) is widely used to solve real-world problems, which pose increasingly higher challenges. In this context, the use of omnidirectional video in a growing number of applications, along with the fast development of Deep Learning (DL) algorithms for object detection, drives the need for further research to improve existing methods originally developed for conventional 2D planar images. However, the geometric distortion that common sphere-to-plane projections produce, mostly visible in objects near the poles, in addition to the lack of open-source labeled omnidirectional image datasets, has made an accurate spherical-image-based object detection algorithm a hard goal to achieve. This work is a contribution to developing datasets and machine learning models particularly suited to omnidirectional images, represented in planar format through the well-known Equirectangular Projection (ERP). To this end, DL methods are explored to improve the detection of visual objects in omnidirectional images by considering the inherent distortions of ERP. An experimental study was first carried out to find out whether the error rate and the type of detection errors were related to the characteristics of ERP images. This study revealed that the error rate of object detection using existing DL models on ERP images does in fact depend on the spherical location of the object in the image. Based on these findings, a new object detection framework is proposed to obtain a uniform error rate across all spherical image regions. The results show that the pre- and post-processing stages of the implemented framework effectively reduce the dependency of detection performance on the image region.
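    The pole distortion referred to above follows directly from the ERP mapping, which is linear in longitude and latitude; a short sketch showing the resulting horizontal stretch factor of 1/cos(latitude) follows (the image size is an arbitrary assumption).

    ```python
    import numpy as np

    W, H = 3840, 1920   # illustrative ERP image size (2:1 aspect ratio)

    def sphere_to_erp(lat, lon):
        """Map spherical coords (radians) to ERP pixel coords: linear in both angles."""
        x = (lon + np.pi) / (2 * np.pi) * W
        y = (np.pi / 2 - lat) / np.pi * H
        return x, y

    # A fixed angular width occupies 1/cos(lat) times more pixels as it nears a
    # pole, which is why detectors trained on planar images degrade in those regions.
    for lat_deg in (0, 45, 80):
        stretch = 1 / np.cos(np.radians(lat_deg))
        print(f"latitude {lat_deg:2d} deg -> horizontal stretch x{stretch:.2f}")
    ```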