18 research outputs found

    A comprehensive review of vehicle detection using computer vision

    A crucial step in designing intelligent transport systems (ITS) is vehicle detection. Vehicle detection on urban roads is challenging because of camera position, background variations, occlusion, multiple foreground objects, and vehicle pose. The current study provides a synopsis of state-of-the-art vehicle detection techniques, categorized as motion-based or appearance-based, ranging from frame differencing and background subtraction to feature extraction, a comparatively more complex approach. The advantages and disadvantages of the techniques are also highlighted, with a conclusion as to the most accurate one for vehicle detection.
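
Frame differencing, the simplest motion-based technique named in this abstract, can be sketched as follows. This is a minimal illustration and not the review's own code: frames are assumed to be grayscale images stored as nested lists, and the function name and the threshold value of 25 are hypothetical choices.

```python
def frame_difference_mask(prev_frame, curr_frame, threshold=25):
    """Return a binary motion mask: 1 where |curr - prev| exceeds threshold.

    Both frames are assumed to be same-sized grayscale images given as
    lists of rows of integer intensities (an assumption for this sketch).
    """
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


# Toy example: a bright object appears in the middle column.
prev_frame = [[10, 10, 10], [10, 10, 10]]
curr_frame = [[10, 200, 10], [10, 210, 12]]
mask = frame_difference_mask(prev_frame, curr_frame)
print(mask)  # [[0, 1, 0], [0, 1, 0]]
```

Background subtraction follows the same pattern, with the previous frame replaced by a background model that is maintained over time.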

    Histograms of Oriented 3D Gradients for Fully Automated Fetal Brain Localization and Robust Motion Correction in 3 T Magnetic Resonance Images

    Fetal brain magnetic resonance imaging (MRI) is a rapidly emerging diagnostic imaging tool. However, automated fetal brain localization is one of the biggest obstacles in expediting and fully automating large-scale fetal MRI processing. We propose a method for automatic localization of the fetal brain in 3 T MRI when the images are acquired as a stack of 2D slices that are misaligned due to fetal motion. First, the Histogram of Oriented Gradients (HOG) feature descriptor is extended from 2D to 3D images. Then, a sliding window is used to assign a score to all possible windows in an image, depending on the likelihood of each window containing a brain, and the window with the highest score is selected. In our evaluation experiments using a leave-one-out cross-validation strategy, we achieved 96% complete brain localization using a database of 104 MRI scans at gestational ages between 34 and 38 weeks. We carried out comparisons against template matching and random-forest-based regression methods, and the proposed method showed superior performance. We also showed the application of the proposed method in the optimization of fetal motion correction and how it is essential for the reconstruction process. The method is robust and does not rely on any prior knowledge of fetal brain development.
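
The sliding-window step described in this abstract (score every candidate window, keep the highest-scoring one) can be sketched as below. This is only an illustration under stated assumptions: the volume is a nested list indexed as [z][y][x], and `mean_intensity` is a hypothetical stand-in for the paper's classifier score on 3D HOG features, which is not reproduced here.

```python
from itertools import product


def best_window(volume, window, score_fn):
    """Exhaustively score every window position and return the best one.

    volume   : 3D nested list indexed as volume[z][y][x]
    window   : (depth, height, width) of the sliding window
    score_fn : callable mapping a 3D patch to a number (hypothetical scorer)
    """
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    wd, wh, ww = window
    best_score, best_origin = float("-inf"), None
    for z, y, x in product(
        range(depth - wd + 1), range(height - wh + 1), range(width - ww + 1)
    ):
        patch = [
            [row[x:x + ww] for row in plane[y:y + wh]]
            for plane in volume[z:z + wd]
        ]
        score = score_fn(patch)
        if score > best_score:
            best_score, best_origin = score, (z, y, x)
    return best_origin, best_score


def mean_intensity(patch):
    """Placeholder score: mean voxel value (NOT a 3D HOG classifier)."""
    values = [v for plane in patch for row in plane for v in row]
    return sum(values) / len(values)


# Toy volume with a bright 1x2x2 block at z=1, y=1, x=1.
volume = [
    [[0, 0, 0], [0, 0, 0], [0, 0, 0]],
    [[0, 0, 0], [0, 9, 9], [0, 9, 9]],
]
origin, score = best_window(volume, (1, 2, 2), mean_intensity)
print(origin, score)  # (1, 1, 1) 9.0
```

In a real pipeline the scorer would be a trained classifier applied to the descriptor of each window; the exhaustive search loop stays the same.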

    VIFECO: An Open-Source Software for Counting Features on a Video

    The aim of this article is to describe an open-source application (Vifeco) that makes it possible to manually identify features on a video. Vifeco also allows users to manage the number of users, create a category (feature) and a collection of categories, read a video and identify the features on it, and analyze the counting concordance between two users. Written in Java 11 with the JavaFX UI toolkit, Vifeco is a stand-alone, multiplatform (Windows, Mac and Linux) and multi-language (3 languages supported) application. The software is available under the Apache Licence on GitHub (https://github.com/LAEQ/vifeco).

    Supervised learning and inference of semantic information from road scene images

    Extraordinary Doctorate Award of the UAH, academic year 2013-2014. Nowadays, vision sensors are employed in the automotive industry to integrate advanced functionalities that assist humans while driving. However, autonomous vehicles are a hot field of research in both the academic and industrial sectors and entail a step beyond ADAS. In particular, several challenges arise from autonomous navigation in urban scenarios due to their naturalistic complexity in terms of structure and dynamic participants (e.g. pedestrians, vehicles, vegetation, etc.). Hence, providing image understanding capabilities to autonomous robotics platforms is an essential target because cameras can capture the 3D scene as perceived by a human. In fact, given this need for 3D scene understanding, there is an increasing interest in joint object and scene labeling in the form of geometry and semantic inference of the relevant entities contained in urban environments. In this regard, this Thesis tackles two challenges: 1) the prediction of road intersection geometry, and 2) the detection and orientation estimation of cars, pedestrians and cyclists. Different features extracted from stereo images of the KITTI public urban dataset are employed. This Thesis proposes the supervised learning of discriminative models that rely on strong machine learning techniques for data mining visual features. For the first task, we use 2D occupancy grid maps that are built from the stereo sequences captured by a moving vehicle in a mid-sized city. Based on these bird's-eye view images, we propose a smart parameterization of the layout of straight roads and 4 intersecting roads. The dependencies between the proposed discrete random variables that define the layouts are represented with Probabilistic Graphical Models. Then, the problem is formulated as a structured prediction, in which we employ Conditional Random Fields (CRF) for learning and convex Belief Propagation (dcBP) and Branch and Bound (BB) for inference. For the validation of the proposed methodology, a set of tests is carried out, based on real images and synthetic images with varying levels of random noise. In relation to the object detection and orientation estimation challenge in road scenes, the goal of this Thesis is to compete in the international challenge known as the KITTI evaluation benchmark, which encourages researchers to push forward the current state of the art in visual recognition methods, particularized for 3D urban scene understanding. This Thesis proposes to modify the successful part-based object detector known as DPM in order to learn richer models from 2.5D data (color and disparity). Therefore, we revisit the DPM framework, which is based on HOG features and mixture models trained with a latent SVM formulation. Next, this Thesis performs a set of modifications on top of DPM: I) An extension to the DPM training pipeline that accounts for 3D-aware features. II) A detailed analysis of the supervised parameter learning. III) Two additional approaches: "feature whitening" and "stereo consistency check". Additionally, a) we analyze the KITTI dataset and several subtleties regarding the evaluation protocol; b) a large set of cross-validated experiments shows the performance of our contributions; and c) finally, our best performing approach is publicly ranked on the KITTI website, being the first one that reports results with stereo data, yielding an increased object detection precision (3%-6%) for the class 'car' and ranking first for the class 'cyclist'.

    Risk analysis for smart homes and domestic robots using robust shape and physics descriptors, and complex boosting techniques

    In this paper, the notion of risk analysis within 3D scenes using vision-based techniques is introduced. In particular, the problem of risk estimation of indoor environments at the scene and object level is considered, with applications in domestic robots and smart homes. To this end, the proposed Risk Estimation Framework is described, which provides a quantified risk score for a given scene. This methodology is extended with the introduction of a novel robust kernel for 3D shape descriptors such as 3D HOG and SIFT3D, which aims to reduce the effects of outliers in the proposed risk recognition methodology. The Physics Behaviour Feature (PBF) is presented, which uses an object's angular velocity obtained using Newtonian physics simulation as a descriptor. Furthermore, an extension of boosting techniques for learning is suggested in the form of the novel Complex and Hyper-Complex Adaboost, which greatly increase the computational efficiency of the original technique. In order to evaluate the proposed robust descriptors, an enriched version of the 3D Risk Scenes (3DRS) dataset with extra objects, scenes and meta-data was utilised. A comparative study was conducted, demonstrating that the suggested approach outperforms current state-of-the-art descriptors.

    Analysis of infrared polarisation signatures for vehicle detection

    Thermal radiation emitted from objects within a scene tends to be partially polarised in a direction parallel to the surface normal, to an extent governed by properties of the surface material. This thesis investigates whether vehicle detection algorithms can be improved by the additional measurement of polarisation state as well as intensity in the long wave infrared. Knowledge about the polarimetric properties of scenes guides the development of histogram-based and cluster-based descriptors, which are used in a traditional classification framework. The best performing histogram-based method, the Polarimetric Histogram, which forms a descriptor based on the polarimetric vehicle signature, is shown to outperform the standard Histogram of Oriented Gradients descriptor, which uses intensity imagery alone. These descriptors then lead to a novel clustering algorithm which, at a false positive rate of 10⁻², is shown to improve upon the Polarimetric Histogram descriptor, increasing the true positive rate from 0.19 to 0.63. In addition, a multi-modal detection framework which combines thermal intensity hotspot and polarimetric hotspot detections with a local motion detector is presented. Through the combination of these detectors, the false positive rate is shown to be reduced when compared to the result of individual detectors in isolation.
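
The abstract reports a true positive rate at a fixed false positive rate. As a reminder of how such an operating point is computed, here is a minimal sketch; the scores, labels and threshold are made-up toy values, and the function name is hypothetical rather than taken from the thesis.

```python
def rates_at_threshold(scores, labels, threshold):
    """Return (true positive rate, false positive rate) at a score threshold.

    scores : detector confidence per candidate
    labels : 1 for a true vehicle, 0 for background
    A candidate counts as a detection when its score >= threshold.
    """
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    positives = sum(labels)
    negatives = len(labels) - positives
    return tp / positives, fp / negatives


# Toy data (illustrative only, not results from the thesis).
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 0]
tpr, fpr = rates_at_threshold(scores, labels, threshold=0.5)
print(tpr, fpr)
```

Comparing detectors "at a false positive rate of 10⁻²" means choosing, for each detector, the threshold at which the second value is 0.01 and then comparing the first.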

    Counting and Classification of Highway Vehicles by Regression Analysis

    In this paper, we describe a novel algorithm that counts and classifies highway vehicles based on regression analysis. This algorithm requires no explicit segmentation or tracking of individual vehicles, which is usually an important part of many existing algorithms. Therefore, this algorithm is particularly useful when there are severe occlusions or vehicle resolution is low, cases in which extracted features are highly unreliable. There are two main contributions in our proposed algorithm. First, a warping method is developed to detect the foreground segments that contain unclassified vehicles. The commonly used modeling and tracking (e.g., Kalman filtering) of individual vehicles are not required. In order to reduce vehicle distortion caused by the foreshortening effect, a nonuniform mesh grid and a projective transformation are estimated and applied during the warping process. Second, we extract a set of low-level features for each foreground segment and develop a cascaded regression approach to count and classify vehicles directly, which has not been used in the area of intelligent transportation systems. Three different regressors are designed and evaluated. Experiments show that our regression-based algorithm is accurate and robust for poor-quality videos, from which many existing algorithms could fail to extract reliable features.
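
The projective transformation mentioned in the warping step maps image points through a 3x3 homography followed by a perspective divide. The sketch below shows only that mapping for a single point; the matrix values are arbitrary illustrative numbers, and how the paper estimates its transformation or builds its nonuniform mesh grid is not reproduced here.

```python
def apply_homography(H, point):
    """Map a 2D point through a 3x3 projective transformation H.

    H is a row-major 3x3 nested list; the result is divided by the
    homogeneous coordinate w, which is what produces (or undoes)
    perspective foreshortening.
    """
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    return xh / w, yh / w


# Hypothetical matrix: the bottom row makes scale depend on the y coordinate.
H = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.5, 1.0],
]
warped = apply_homography(H, (8.0, 2.0))
print(warped)  # (4.0, 1.0)
```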