
    Vehicle detection and tracking using homography-based plane rectification and particle filtering

    This paper presents a full system for vehicle detection and tracking in non-stationary settings based on computer vision. The method proposed for vehicle detection exploits the geometrical relations between the elements in the scene, so that moving objects (i.e., vehicles) can be detected by analyzing motion parallax. Specifically, the homography of the road plane between successive images is computed. Most remarkably, a novel probabilistic framework based on Kalman filtering is presented for reliable and accurate homography estimation. The estimated homography is used for image alignment, which in turn allows the moving vehicles in the image to be detected. Tracking of vehicles is performed with a multidimensional particle filter, which also manages the entry and exit of objects. The filter involves a mixture likelihood model that allows a better adaptation of the particles to the observed measurements. The system is specially designed for highway environments, where it has been proven to yield excellent results
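The road-plane alignment step described above can be illustrated with the standard direct linear transform (DLT) for homography estimation; this is a minimal numpy sketch, not the paper's Kalman-filtering estimator, and the function names are ours:

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate a 3x3 homography H with dst ~ H @ src via the DLT.

    src, dst: (N, 2) arrays of corresponding road-plane points, N >= 4.
    """
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The homography is the null vector of A (smallest singular value).
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]                  # normalise so H[2, 2] == 1

def warp_points(H, pts):
    """Apply H to (N, 2) points via homogeneous coordinates."""
    p = np.hstack([pts, np.ones((len(pts), 1))])
    q = p @ np.asarray(H, dtype=float).T
    return q[:, :2] / q[:, 2:3]
```

Once H is estimated from road-plane correspondences, the previous frame can be warped onto the current one; pixels that still disagree after alignment exhibit motion parallax and are candidate moving vehicles.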

    A comprehensive review of vehicle detection using computer vision

    A crucial step in designing intelligent transport systems (ITS) is vehicle detection. The challenges of vehicle detection on urban roads arise from camera position, background variations, occlusion, multiple foreground objects, and vehicle pose. The current study provides a synopsis of state-of-the-art vehicle detection techniques, which are categorized into motion-based and appearance-based techniques, ranging from frame differencing and background subtraction to more complex feature-extraction models. The advantages and disadvantages of these techniques are also highlighted, with a conclusion as to the most accurate one for vehicle detection
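The motion-based end of the spectrum surveyed here, frame differencing and running-average background subtraction, can be sketched in a few lines of numpy; the threshold and adaptation rate are illustrative values, not taken from any surveyed paper:

```python
import numpy as np

def frame_difference(prev, curr, thresh=25):
    """Binary motion mask from the absolute difference of two grayscale frames."""
    return np.abs(curr.astype(int) - prev.astype(int)) > thresh

class RunningAverageBackground:
    """Background subtraction with an exponentially weighted running mean."""
    def __init__(self, first_frame, alpha=0.05, thresh=25):
        self.bg = first_frame.astype(float)
        self.alpha = alpha      # adaptation rate: higher = faster background update
        self.thresh = thresh

    def apply(self, frame):
        fg = np.abs(frame.astype(float) - self.bg) > self.thresh
        # Update the model only where the scene looks static, so vehicles
        # that stop briefly are not absorbed into the background at once.
        self.bg[~fg] = (1 - self.alpha) * self.bg[~fg] + self.alpha * frame[~fg]
        return fg
```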

    Automatic vehicle detection and tracking in aerial video

    This thesis is concerned with the challenging tasks of automatic and real-time vehicle detection and tracking from aerial video. The aim of this thesis is to build an automatic system that can accurately localise any vehicles that appear in aerial video frames and track the target vehicles with trackers. Vehicle detection and tracking have many applications and this has been an active area of research during recent years; however, it is still a challenge to deal with certain realistic environments. This thesis develops vehicle detection and tracking algorithms which enhance the robustness of detection and tracking beyond the existing approaches. The vehicle detection system proposed in this thesis is based on different object categorisation approaches, with colour and texture features in both point and area template forms. The thesis also proposes a novel Self-Learning Tracking and Detection approach, which is an extension to the existing Tracking Learning Detection (TLD) algorithm. There are a number of challenges in vehicle detection and tracking. The most difficult challenge of detection is distinguishing and clustering the target vehicle from the background objects and noise. Under certain conditions, the images captured from Unmanned Aerial Vehicles (UAVs) are also blurred; for example, turbulence may make the vehicle shake during flight. This thesis tackles these challenges by applying integrated multiple feature descriptors for real-time processing. In this thesis, three vehicle detection approaches are proposed: the HSV-GLCM feature approach, the ISM-SIFT feature approach and the FAST-HoG approach. The general vehicle detection approaches used have highly flexible implicit shape representations. They are based on training samples in both positive and negative sets and use updated classifiers to distinguish the targets.
It has been found that the detection results attained by using HSV-GLCM texture features can be affected by blurring problems; the proposed detection algorithms can further segment the edges of the vehicles from the background. Using the point descriptor feature can solve the blurring problem; however, the large amount of information contained in point descriptors can lead to processing times that are too long for real-time applications. The FAST-HoG approach, which combines the point feature and the shape feature, is therefore proposed. This new approach speeds up processing enough to attain real-time performance. Finally, a detection approach using HoG with the FAST feature is also proposed. The HoG approach is widely used in object recognition, as it has a strong ability to represent the shape vector of the object. However, the original HoG feature is sensitive to the orientation of the target; this method improves the algorithm by inserting the direction vectors of the targets. For the tracking process, a novel tracking approach is proposed as an extension of the TLD algorithm, in order to track multiple targets. The extended approach upgrades the original system, which can only track a single target that must be selected before the detection and tracking process. The greatest challenge in vehicle tracking is long-term tracking: the target object can change its appearance during the process, and illumination and scale changes can also occur. The original TLD framework assumes that the tracker can make errors during the tracking process and that the accumulation of these errors can cause tracking failure, so it introduces a learning step between tracking and detection, adding a pair of inspectors (positive and negative) to constantly estimate errors. This thesis extends the TLD approach with a new detection method in order to achieve multiple-target tracking.
A Forward and Backward Tracking approach has been proposed to eliminate tracking errors and other problems such as occlusion. The main purpose of the proposed tracking system is to learn the features of the targets during tracking and re-train the detection classifier for further processing. This thesis puts particular emphasis on vehicle detection and tracking in extreme scenarios such as crowded highway vehicle detection, blurred images and changes in the appearance of the targets. Compared with existing detection and tracking approaches, the proposed approaches demonstrate a robust increase in accuracy in each scenario
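As background to the HoG-based detectors discussed above, the orientation histogram of a single HOG cell can be sketched as follows; this is the textbook Dalal-Triggs formulation with the conventional nine unsigned bins, not the thesis's FAST-HoG implementation:

```python
import numpy as np

def hog_cell_histogram(patch, n_bins=9):
    """Orientation histogram of one HOG cell (unsigned gradients, 0-180 deg).

    patch: 2D grayscale array. Votes are weighted by gradient magnitude
    and the histogram is L2-normalised.
    """
    gy, gx = np.gradient(patch.astype(float))      # row and column gradients
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0   # unsigned orientation
    bins = np.minimum((ang / (180.0 / n_bins)).astype(int), n_bins - 1)
    hist = np.zeros(n_bins)
    np.add.at(hist, bins.ravel(), mag.ravel())     # magnitude-weighted voting
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist
```

A full HOG descriptor concatenates many such cell histograms with block normalisation; a detector then scores a sliding window of these descriptors.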

    LiDAR-based robust position and attitude estimation for autonomous vehicles on urban roads

    Master's thesis -- Graduate School of Seoul National University, Department of Mechanical Engineering, College of Engineering, February 2023 (advisor: Kyongsu Yi). This paper presents a method for tackling erroneous odometry estimation results from LiDAR-based simultaneous localization and mapping (SLAM) techniques on complex urban roads. Most SLAM techniques estimate sensor odometry through a comparison between measurements from the current and the previous step. As such, a static environment is generally more advantageous for SLAM systems. However, urban environments contain a significant number of dynamic objects, the point clouds of which can noticeably hinder the performance of SLAM systems. As a countermeasure, this paper proposes a 3D LiDAR SLAM system based on static LiDAR point clouds for use in dynamic outdoor urban environments. The proposed method is primarily composed of two parts, moving object detection and pose estimation through 3D LiDAR SLAM. First, moving objects in the vicinity of the ego-vehicle are detected with a previously reported algorithm based on a geometric model-free approach (GMFA) and a static obstacle map (STOM). GMFA works in conjunction with STOM to estimate the state of moving objects in real-time. The bounding boxes occupied by these moving objects are utilized to remove points corresponding to dynamic objects in the raw LiDAR point clouds. The remaining static points are applied to LiDAR SLAM. The second part of the proposed method describes odometry estimation through the adopted LiDAR SLAM framework, LeGO-LOAM. LeGO-LOAM, a feature-based LiDAR SLAM framework, converts LiDAR point clouds into range images, from which edge and planar points are extracted as features. The range images are further utilized in a preprocessing stage to improve the computation efficiency of the overall algorithm. Additionally, a 6-DOF transformation is utilized, the model equation of which can be obtained by setting a residual to be the distance between an extracted feature of the current step and the corresponding feature geometry of the previous step.
The equation is optimized through the Levenberg-Marquardt method. Furthermore, GMFA and LeGO-LOAM operate in parallel to resolve computational delays associated with GMFA. Actual vehicle tests were conducted on urban roads with a test vehicle equipped with a 32-channel 3D LiDAR and a real-time kinematics GPS (RTK GPS). Validation results have shown the proposed method to significantly decrease estimation errors related to moving feature points while securing the target output frequency.
Contents: Chapter 1. Introduction (1.1 Research Motivation; 1.2 Previous Research: 1.2.1 Moving Object Detection, 1.2.2 SLAM; 1.3 Thesis Objective and Outline). Chapter 2. Methodology (2.1 Moving Object Detection & Rejection: 2.1.1 Static Obstacle Map, 2.1.2 Geometric Model-Free Approach; 2.2 LiDAR SLAM: 2.2.1 Segmentation, 2.2.2 Feature Extraction, 2.2.3 LiDAR Odometry and Mapping, 2.2.4 LiDAR SLAM with Static Point Cloud). Chapter 3. Experiments (3.1 Experimental Setup; 3.2 Error Metrics; 3.3 LiDAR SLAM using Static Point Cloud). Chapter 4. Conclusion. Bibliography
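The first stage of the proposed pipeline, removing dynamic-object points before feeding the cloud to LiDAR SLAM, can be sketched as a simple box filter; in the thesis the boxes come from GMFA and are oriented, whereas this sketch assumes axis-aligned bounding boxes for brevity:

```python
import numpy as np

def remove_dynamic_points(cloud, boxes):
    """Keep only points that fall outside every moving-object bounding box.

    cloud: (N, 3) array of LiDAR points.
    boxes: list of (min_xyz, max_xyz) pairs, each a length-3 sequence,
           i.e. axis-aligned boxes around detected moving objects.
    Returns the static subset of the cloud, ready for LiDAR SLAM.
    """
    keep = np.ones(len(cloud), dtype=bool)
    for lo, hi in boxes:
        inside = np.all((cloud >= np.asarray(lo, dtype=float)) &
                        (cloud <= np.asarray(hi, dtype=float)), axis=1)
        keep &= ~inside                 # drop points inside any dynamic box
    return cloud[keep]
```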

    Towards vision based navigation in large indoor environments

    The main contribution of this paper is a novel stereo-based algorithm which serves as a tool to examine the viability of stereo vision solutions to simultaneous localisation and mapping (SLAM) for large indoor environments. Using features extracted from the scale invariant feature transform (SIFT) and depth maps from a small vision system (SVS) stereo head, an extended Kalman filter (EKF) based SLAM algorithm, which allows the independent use of information relating to depth and bearing, is developed. By means of a map pruning strategy for managing the computational cost, it is demonstrated that statistically consistent location estimates can be generated for a small (6 m × 6 m) structured office environment, and in a robotics search and rescue arena of similar size. It is shown that in a larger office environment, the proposed algorithm generates location estimates which are topologically correct, but statistically inconsistent. A discussion of the possible reasons for the inconsistency is presented. The paper highlights that, despite recent advances, building accurate geometric maps of large environments with vision-only sensing is still a challenging task. ©2006 IEEE
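The idea of using depth and bearing information independently can be illustrated with two scalar EKF measurement updates on a single 2D landmark; this is a generic textbook EKF with hypothetical noise variances, not the paper's full SLAM filter:

```python
import numpy as np

def ekf_update(mu, P, z, h, H, R):
    """One scalar EKF measurement update: prior mean mu and covariance P,
    measurement z with predicted value h, Jacobian row H, noise variance R."""
    H = np.atleast_2d(np.asarray(H, dtype=float))
    R = np.atleast_2d(np.asarray(R, dtype=float))
    S = H @ P @ H.T + R                  # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
    mu = mu + (K @ np.atleast_1d(z - h)).ravel()
    P = (np.eye(len(mu)) - K @ H) @ P
    return mu, P

def update_landmark(mu, P, r_meas, b_meas, var_r=0.01, var_b=0.001):
    """Fuse one range and one bearing observation of a 2D landmark as two
    independent scalar updates (sensor at the origin with known pose)."""
    x, y = mu
    r = np.hypot(x, y)                   # range model and its Jacobian
    mu, P = ekf_update(mu, P, r_meas, r, [[x / r, y / r]], var_r)
    x, y = mu
    r2 = x * x + y * y                   # bearing model and its Jacobian
    mu, P = ekf_update(mu, P, b_meas, np.arctan2(y, x),
                       [[-y / r2, x / r2]], var_b)
    return mu, P
```

Processing depth and bearing as separate scalar updates lets either measurement be skipped or down-weighted when its sensor is unreliable, which is the flexibility the abstract alludes to.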

    Video foreground extraction for mobile camera platforms

    Foreground object detection is a fundamental task in computer vision with many applications in areas such as object tracking, event identification, and behavior analysis. Most conventional foreground object detection methods work only in stable illumination environments using fixed cameras. In real-world applications, however, it is often the case that the algorithm needs to operate under the following challenging conditions: drastic lighting changes, object shape complexity, moving cameras, low frame capture rates, and low resolution images. This thesis presents four novel approaches for foreground object detection on real-world datasets using cameras deployed on moving vehicles. The first problem addresses passenger detection and tracking for public transport buses, investigating the problems of changing illumination conditions and low frame capture rates. Our approach integrates a stable SIFT (Scale Invariant Feature Transform) background seat modelling method with a human shape model into a weighted Bayesian framework to detect passengers. To deal with the problem of tracking multiple targets, we employ the Reversible Jump Markov Chain Monte Carlo tracking algorithm. Appearance transformation models, built with an SVM classifier, capture changes in the appearance of the foreground objects across two consecutive frames under low frame rate conditions. In the second problem, we present a system for pedestrian detection involving scenes captured by a mobile bus surveillance system. It integrates scene localization, foreground-background separation, and pedestrian detection modules into a unified detection framework. The scene localization module performs a two-stage clustering of the video data. In the first stage, SIFT homography is applied to cluster frames in terms of their structural similarity, and the second stage further clusters these aligned frames according to consistency in illumination.
This produces clusters of images that are consistent in viewpoint and lighting. A kernel density estimation (KDE) technique for colour and gradient is then used to construct background models for each image cluster, which are further used to detect candidate foreground pixels. Finally, using a hierarchical template matching approach, pedestrians can be detected. In addition to the second problem, we present three direct pedestrian detection methods that extend the HOG (Histogram of Oriented Gradient) techniques (Dalal and Triggs, 2005) and provide a comparative evaluation of these approaches. The three approaches include: (a) a new histogram feature, formed by the weighted sum of both the gradient magnitude and the filter responses from a set of elongated Gaussian filters (Leung and Malik, 2001) corresponding to the quantised orientation, which we refer to as the Histogram of Oriented Gradient Banks (HOGB) approach; (b) the codebook-based HOG feature with the branch-and-bound (efficient subwindow search) algorithm (Lampert et al., 2008); and (c) the codebook-based HOGB approach. In the third problem, a unified framework that combines 3D and 2D background modelling is proposed to detect scene changes using a camera mounted on a moving vehicle. The 3D scene is first reconstructed from a set of videos taken at different times. The 3D background modelling identifies inconsistent scene structures as foreground objects. For the 2D approach, foreground objects are detected using the spatio-temporal MRF algorithm. Finally, the 3D and 2D results are combined using morphological operations. The significance of this research is that it provides basic frameworks for automatic large-scale mobile surveillance applications and facilitates many higher-level applications such as object tracking and behaviour analysis
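The per-pixel KDE background model mentioned above can be sketched with a Gaussian kernel over the pixel's recent colour history; the bandwidth and density threshold here are arbitrary illustrative values, not those used in the thesis:

```python
import numpy as np

def kde_foreground(history, pixel, bandwidth=10.0, thresh=1e-6):
    """Classify one pixel as foreground under a Gaussian KDE background model.

    history: (N, C) recent colour samples of this pixel (one row per frame).
    pixel:   length-C current colour. Returns True when the estimated
             background density at the current colour falls below `thresh`.
    """
    history = np.asarray(history, dtype=float)
    diff = history - np.asarray(pixel, dtype=float)
    # Product of per-channel Gaussian kernels, averaged over the history.
    k = np.exp(-0.5 * np.sum((diff / bandwidth) ** 2, axis=1))
    density = k.mean() / (bandwidth * np.sqrt(2.0 * np.pi)) ** history.shape[1]
    return bool(density < thresh)
```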

    Object Tracking

    Object tracking consists in estimating the trajectories of moving objects in a sequence of images. Automating computer-based object tracking is a difficult task: changes in the multiple parameters representing the features and motion of the objects, as well as temporary partial or full occlusion of the tracked objects, have to be considered. This monograph presents the development of object tracking algorithms, methods and systems. Both the state of the art of object tracking methods and new trends in research are described in this book. Fourteen chapters are split into two sections. Section 1 presents new theoretical ideas whereas Section 2 presents real-life applications. Despite the variety of topics contained in this monograph, it constitutes a consistent body of knowledge in the field of computer object tracking. The editor's intention was to keep pace with the rapid progress in the development of methods as well as the extension of their applications

    Counting and Classification of Highway Vehicles by Regression Analysis

    In this paper, we describe a novel algorithm that counts and classifies highway vehicles based on regression analysis. This algorithm requires no explicit segmentation or tracking of individual vehicles, which is usually an important part of many existing algorithms. Therefore, this algorithm is particularly useful when there are severe occlusions or vehicle resolution is low, in which case extracted features are highly unreliable. There are two main contributions in our proposed algorithm. First, a warping method is developed to detect the foreground segments that contain unclassified vehicles. The commonly used modeling and tracking (e.g., Kalman filtering) of individual vehicles is not required. In order to reduce vehicle distortion caused by the foreshortening effect, a nonuniform mesh grid and a projective transformation are estimated and applied during the warping process. Second, we extract a set of low-level features for each foreground segment and develop a cascaded regression approach to count and classify vehicles directly, which has not been used in the area of intelligent transportation systems. Three different regressors are designed and evaluated. Experiments show that our regression-based algorithm is accurate and robust for poor-quality videos, for which many existing algorithms fail to extract reliable features
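The core regression idea, mapping low-level segment features directly to a vehicle count with no per-vehicle tracking, can be sketched with a single ridge regressor; the paper's cascade of three regressors is not reproduced here, and the feature set (segment area, perimeter) is hypothetical:

```python
import numpy as np

def fit_ridge(X, y, lam=1e-3):
    """Ridge regression from segment features X (N, D) to counts y (N,)."""
    X = np.hstack([X, np.ones((len(X), 1))])        # append a bias column
    D = X.shape[1]
    # Closed-form regularised normal equations.
    return np.linalg.solve(X.T @ X + lam * np.eye(D), X.T @ y)

def predict_count(w, x):
    """Predicted (rounded, non-negative) vehicle count for one segment."""
    x = np.append(np.asarray(x, dtype=float), 1.0)  # same bias convention
    return max(0, int(round(float(x @ w))))
```

Counting by regression sidesteps the per-vehicle segmentation step entirely, which is why it degrades gracefully under the occlusion and low-resolution conditions the abstract describes.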

    Real-time object detection using monocular vision for low-cost automotive sensing systems

    This work addresses the problem of real-time object detection in automotive environments using monocular vision. The focus is on real-time feature detection, tracking, depth estimation using monocular vision and finally, object detection by fusing visual saliency and depth information. Firstly, a novel feature detection approach is proposed for extracting stable and dense features even in images with very low signal-to-noise ratio. This methodology is based on image gradients, which are redefined to take account of noise as part of their mathematical model. Each gradient is based on a vector connecting a negative to a positive intensity centroid, where both centroids are symmetric about the centre of the area for which the gradient is calculated. Multiple gradient vectors define a feature with its strength being proportional to the underlying gradient vector magnitude. The evaluation of the Dense Gradient Features (DeGraF) shows superior performance over other contemporary detectors in terms of keypoint density, tracking accuracy, illumination invariance, rotation invariance, noise resistance and detection time. The DeGraF features form the basis for two new approaches that perform dense 3D reconstruction from a single vehicle-mounted camera. The first approach tracks DeGraF features in real-time while performing image stabilisation with minimal computational cost. This means that despite camera vibration the algorithm can accurately predict the real-world coordinates of each image pixel in real-time by comparing each motion-vector to the ego-motion vector of the vehicle. The performance of this approach has been compared to different 3D reconstruction methods in order to determine their accuracy, depth-map density, noise-resistance and computational complexity. The second approach proposes the use of local frequency analysis of gradient features for estimating relative depth.
This novel method is based on the fact that DeGraF gradients can accurately measure local image variance with subpixel accuracy. It is shown that the local frequency by which the centroid oscillates around the gradient window centre is proportional to the depth of each gradient centroid in the real world. The lower computational complexity of this methodology comes at the expense of depth map accuracy as the camera velocity increases, but it is at least five times faster than the other evaluated approaches. This work also proposes a novel technique for deriving visual saliency maps by using Division of Gaussians (DIVoG). In this context, saliency maps express how different each image pixel is from its surrounding pixels across multiple pyramid levels. This approach is shown to be both fast and accurate when evaluated against other state-of-the-art approaches. Subsequently, the saliency information is combined with depth information to identify salient regions close to the host vehicle. The fused map allows faster detection of high-risk areas where obstacles are likely to exist. As a result, existing object detection algorithms, such as the Histogram of Oriented Gradients (HOG), can execute at least five times faster. In conclusion, through a step-wise approach, computationally-expensive algorithms have been optimised or replaced by novel methodologies to produce a fast object detection system that is aligned to the requirements of the automotive domain
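The centroid-based gradient idea can be sketched as follows; this is our simplified reading of the description above (an intensity-weighted versus an inverted-intensity-weighted centroid), not the published DeGraF formulation, and the weighting scheme is an assumption:

```python
import numpy as np

def centroid_gradient(window):
    """Centroid-based gradient of one square window.

    The positive centroid weights pixel coordinates by intensity above the
    window minimum; the negative centroid weights them by intensity below
    the window maximum. The gradient is the vector from the negative to
    the positive centroid, so for a linear ramp it points 'uphill'.
    """
    w = np.asarray(window, dtype=float)
    ys, xs = np.mgrid[0:w.shape[0], 0:w.shape[1]]
    centre = np.array([(w.shape[0] - 1) / 2.0, (w.shape[1] - 1) / 2.0])

    def centroid(weights):
        s = weights.sum()
        if s == 0:                       # flat window: centroid at the centre
            return centre
        return np.array([(ys * weights).sum() / s, (xs * weights).sum() / s])

    g = centroid(w - w.min()) - centroid(w.max() - w)
    return g                             # (dy, dx); magnitude = gradient strength
```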

    A robust and efficient video representation for action recognition

    This paper introduces a state-of-the-art video representation and applies it to efficient action recognition and detection. We first propose to improve the popular dense trajectory features by explicit camera motion estimation. More specifically, we extract feature point matches between frames using SURF descriptors and dense optical flow. The matches are used to estimate a homography with RANSAC. To improve the robustness of homography estimation, a human detector is employed to remove outlier matches from the human body as human motion is not constrained by the camera. Trajectories consistent with the homography are considered as due to camera motion, and thus removed. We also use the homography to cancel out camera motion from the optical flow. This results in significant improvement on motion-based HOF and MBH descriptors. We further explore the recent Fisher vector as an alternative feature encoding approach to the standard bag-of-words histogram, and consider different ways to include spatial layout information in these encodings. We present a large and varied set of evaluations, considering (i) classification of short basic actions on six datasets, (ii) localization of such actions in feature-length movies, and (iii) large-scale recognition of complex events. We find that our improved trajectory features significantly outperform previous dense trajectories, and that Fisher vectors are superior to bag-of-words encodings for video recognition tasks. In all three tasks, we show substantial improvements over the state-of-the-art results
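The camera-motion cancellation step, subtracting the homography-induced displacement from the observed optical flow, can be sketched as follows; here `H` is assumed to come from a RANSAC fit to background matches as described above:

```python
import numpy as np

def residual_flow(points, flow, H):
    """Subtract homography-induced (camera) motion from observed optical flow.

    points: (N, 2) pixel positions in the previous frame.
    flow:   (N, 2) observed displacement vectors for those points.
    H:      3x3 homography mapping the previous frame to the current one.
    Trajectories whose residual flow is near zero are treated as camera
    motion (background) and removed, as in the improved trajectories.
    """
    p = np.hstack([points, np.ones((len(points), 1))])
    q = p @ np.asarray(H, dtype=float).T
    cam_flow = q[:, :2] / q[:, 2:3] - points        # flow explained by H
    return flow - cam_flow                          # genuine object motion
```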