
    Advanced framework for microscopic and lane‐level macroscopic traffic parameters estimation from UAV video

    Peer Reviewed
    http://deepblue.lib.umich.edu/bitstream/2027.42/166282/1/itr2bf00873.pd

    Crowd detection and counting using a static and dynamic platform: state of the art

    Automated object detection and crowd density estimation are popular and important areas of visual surveillance research. The last decades have witnessed much significant research in this field; however, it remains a challenging problem for automatic visual surveillance. The ever-increasing research on crowd dynamics and crowd motion necessitates a detailed and updated survey of the techniques and trends in this field. This paper presents a survey of crowd detection and crowd density estimation from moving platforms and reviews the methods employed for this purpose. The review categorizes and delineates several detection and counting estimation methods that have been applied to the examination of scenes from static and moving platforms.

    AERIAL SURVEILLANCE FOR VEHICLE DETECTION USING DBN AND CANNY EDGE DETECTOR

    We present an automatic vehicle detection system for aerial surveillance in this paper. In this system, we escape from the stereotype and existing frameworks of vehicle detection in aerial surveillance, which are either region based or sliding window based. We design a pixel wise classification method for vehicle detection. The novelty lies in the fact that, in spite of performing pixel wise classification, relations among neighboring pixels in a region are preserved in the feature extraction process. We consider features including vehicle colors and local features. For vehicle color extraction, we utilize a color transform to separate vehicle colors and non-vehicle colors effectively. For edge detection, we apply moment preserving to adjust the thresholds of the Canny edge detector automatically, which increases the adaptability and the accuracy for detection in various aerial images. Afterward, a dynamic Bayesian network (DBN) is constructed for the classification purpose. We convert regional local features into quantitative observations that can be referenced when applying pixel wise classification via DBN. Experiments were conducted on a wide variety of aerial videos. The results demonstrate flexibility and good generalization abilities of the proposed method on a challenging data set with aerial surveillance images taken at different heights and under different camera angles

    Application of Multi-Sensor Fusion Technology in Target Detection and Recognition

    The application of multi-sensor fusion technology has drawn a lot of industrial and academic interest in recent years. Multi-sensor fusion methods are widely used in many applications, such as autonomous systems, remote sensing, video surveillance, and the military. These methods can obtain the complementary properties of targets by considering multiple sensors. In addition, they can achieve a detailed environment description and accurate detection of targets of interest based on the information from different sensors. This book collects novel developments in the field of multi-sensor, multi-source, and multi-process information fusion. Articles are expected to emphasize one or more of the three facets: architectures, algorithms, and applications. The published papers include those dealing with fundamental theoretical analyses as well as those demonstrating their application to real-world problems.

    Object Tracking and Mensuration in Surveillance Videos

    This thesis focuses on tracking and mensuration in surveillance videos. The first part of the thesis discusses several object tracking approaches based on the different properties of tracking targets. For airborne videos, where the targets are usually small and of low resolution, an approach that builds motion models for foreground and background is proposed, in which the foreground target is simplified as a rigid object. For relatively high-resolution targets, non-rigid models are applied. An active contour-based algorithm is introduced, which decomposes the tracking into three parts: estimating the affine transform parameters between successive frames using particle filters; detecting the contour deformation using a probabilistic deformation map; and regulating the deformation by projecting the updated model onto a trained shape subspace. The active appearance Markov chain (AAMC) is also presented; it integrates a statistical model of shape, appearance and motion. In the AAMC model, a Markov chain represents the switching of motion phases (poses), and several pairwise active appearance model (P-AAM) components characterize the shape, appearance and motion information for different motion phases. The second part of the thesis covers video mensuration, in which we propose a height-measuring algorithm with less human supervision, more flexibility and improved robustness. From videos acquired by an uncalibrated stationary camera, we first recover the vanishing line and the vertical point of the scene. We then apply a single-view mensuration algorithm to each of the frames to obtain height measurements. Finally, we use the LMedS as the cost function and the Robbins-Monro stochastic approximation (RMSA) technique to obtain the optimal estimate.
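As a rough illustration of the last step, the sketch below scores candidate heights with a least-median-of-squares cost (reading LMedS in its usual sense). It is not the thesis's method: the RMSA optimizer is swapped for a brute-force search over the measurements themselves, and the function names and sample heights are hypothetical.

```python
from statistics import median

def lmeds_cost(estimate, measurements):
    """Median of squared residuals between one estimate and all measurements."""
    return median((m - estimate) ** 2 for m in measurements)

def lmeds_estimate(measurements):
    """Brute-force stand-in for RMSA: try each measurement as the estimate
    and keep the one with the smallest LMedS cost."""
    return min(measurements, key=lambda c: lmeds_cost(c, measurements))

# Hypothetical per-frame heights in cm; the last two are gross outliers.
heights = [171.0, 172.0, 170.5, 171.5, 172.5, 120.0, 240.0]
robust_height = lmeds_estimate(heights)
mean_height = sum(heights) / len(heights)
```

Because the cost depends only on the median residual, the two outlying frames do not pull the estimate away from the main cluster, whereas the plain mean is shifted by them.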

    DroTrack: High-speed Drone-based Object Tracking Under Uncertainty

    We present DroTrack, a high-speed visual single-object tracking framework for drone-captured video sequences. Most of the existing object tracking methods are designed to tackle well-known challenges, such as occlusion and cluttered backgrounds. The complex motion of drones, i.e., multiple degrees of freedom in three-dimensional space, causes high uncertainty. The uncertainty problem leads to inaccurate location predictions and fuzziness in scale estimations. DroTrack solves such issues by discovering the dependency between object representation and motion geometry. We implement an effective object segmentation based on Fuzzy C Means (FCM). We incorporate the spatial information into the membership function to cluster the most discriminative segments. We then enhance the object segmentation by using a pre-trained Convolutional Neural Network (CNN) model. DroTrack also leverages the geometrical angular motion to estimate a reliable object scale. We discuss the experimental results and performance evaluation using two datasets of 51,462 drone-captured frames. The combination of the FCM segmentation and the angular scaling increased DroTrack precision by up to 9% and decreased the centre location error by 162 pixels on average. DroTrack outperforms all the high-speed trackers and achieves comparable results in comparison to deep learning trackers. DroTrack offers high frame rates up to 1000 frames per second (fps) with the best location precision, exceeding a set of state-of-the-art real-time trackers. Comment: 10 pages, 12 figures, FUZZ-IEEE 202
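For readers unfamiliar with FCM, the sketch below shows the standard membership and center updates on scalar intensities. It is plain textbook FCM only: the spatial term that DroTrack adds to the membership function is not reproduced, and the function name, starting centers and pixel values are hypothetical.

```python
def fcm_1d(values, centers, m=2.0, iterations=50, eps=1e-9):
    """Plain fuzzy C-means on scalar values; returns (centers, memberships)."""
    centers = list(centers)
    memberships = []
    for _ in range(iterations):
        # Membership update: row[i] is how strongly one value belongs to cluster i.
        memberships = []
        for x in values:
            dists = [abs(x - c) + eps for c in centers]  # eps avoids division by zero
            row = [
                1.0 / sum((d_i / d_k) ** (2.0 / (m - 1.0)) for d_k in dists)
                for d_i in dists
            ]
            memberships.append(row)
        # Center update: mean of the values weighted by membership ** m.
        for i in range(len(centers)):
            weights = [row[i] ** m for row in memberships]
            centers[i] = sum(w * x for w, x in zip(weights, values)) / sum(weights)
    return centers, memberships

# Hypothetical pixel intensities: a dark background and a bright object.
pixels = [10, 12, 11, 13, 200, 205, 198, 202]
centers, memberships = fcm_1d(pixels, centers=[0.0, 255.0])
```

Each membership row sums to one, so a pixel is shared between clusters rather than assigned to exactly one, which is what makes the segmentation "fuzzy".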

    Application of 2D Homography for High Resolution Traffic Data Collection using CCTV Cameras

    Traffic cameras remain the primary source of data for surveillance activities such as congestion and incident monitoring. To date, State agencies continue to rely on manual effort to extract data from networked cameras due to limitations of current automatic vision systems, including requirements for complex camera calibration and the inability to generate high-resolution data. This study implements a three-stage video analytics framework for extracting high-resolution traffic data such as vehicle counts, speed, and acceleration from infrastructure-mounted CCTV cameras. The key components of the framework include object recognition, perspective transformation, and vehicle trajectory reconstruction for traffic data collection. First, a state-of-the-art vehicle recognition model is implemented to detect and classify vehicles. Next, to correct for camera distortion and reduce partial occlusion, an algorithm inspired by two-point linear perspective is used to extract the region of interest (ROI) automatically, while a 2D homography technique transforms the CCTV view to a bird's-eye view (BEV). Cameras are calibrated with a two-layer matrix system to enable the extraction of speed and acceleration by converting image coordinates to real-world measurements. Individual vehicle trajectories are constructed and compared in BEV using two time-space-feature-based object trackers, namely Motpy and BYTETrack. The results of the current study showed an error rate of about +/- 4.5% for directional traffic counts and less than 10% MSE for the speed bias of camera estimates relative to estimates from probe data sources.
Extracting high-resolution data from traffic cameras has several implications, ranging from improvements in traffic management to the identification of dangerous driving behavior, high-risk areas for accidents, and other safety concerns, enabling proactive measures to reduce accidents and fatalities. Comment: 25 pages, 9 figures, this paper was submitted for consideration for presentation at the 102nd Annual Meeting of the Transportation Research Board, January 202
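The perspective-transformation step can be pictured with a minimal sketch: a 3x3 homography maps an image point to bird's-eye-view coordinates, and two consecutive BEV positions give a speed. The matrix values, scale factor, frame rate and function names below are hypothetical and are not taken from the paper, whose two-layer calibration is not reproduced here.

```python
import math

def apply_homography(H, x, y):
    """Map an image point (x, y) through a 3x3 homography given as nested lists."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w  # divide by w to leave homogeneous coordinates

def speed_kmh(p_prev, p_curr, metres_per_unit, fps):
    """Speed from two BEV positions observed one frame apart."""
    dist_m = math.hypot(p_curr[0] - p_prev[0], p_curr[1] - p_prev[1]) * metres_per_unit
    return dist_m * fps * 3.6  # metres per frame -> metres per second -> km/h

# Hypothetical homography and calibration values, for illustration only.
H = [[1.0, 0.0, 0.0],
     [0.0, 2.0, 0.0],
     [0.0, 0.001, 1.0]]
a = apply_homography(H, 100.0, 200.0)
b = apply_homography(H, 100.0, 210.0)
v = speed_kmh(a, b, metres_per_unit=0.05, fps=30)
```

In practice the homography would be estimated from point correspondences between the camera view and the ground plane; here it is simply assumed to be known.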

    Deep Learning Methods for 3D Aerial and Satellite Data

    Recent advances in digital electronics have led to an overabundance of observations from electro-optical (EO) imaging sensors spanning high spatial, spectral and temporal resolution. This unprecedented volume, variety, and velocity is overwhelming our capacity to manage and translate that data into actionable information. Although decades of image processing research have taken the human out of the loop for many important tasks, the human analyst is still an irreplaceable link in the image exploitation chain, especially for more complex tasks requiring contextual understanding, memory, discernment, and learning. If knowledge discovery is to keep pace with the growing availability of data, new processing paradigms are needed in order to automate the analysis of earth observation imagery and ease the burden of manual interpretation. To address this gap, this dissertation advances fundamental and applied research in deep learning for aerial and satellite imagery. We show how deep learning, a computational model inspired by the human brain, can be used for (1) tracking, (2) classifying, and (3) modeling from a variety of data sources including full-motion video (FMV), Light Detection and Ranging (LiDAR), and stereo photogrammetry. First, we assess the ability of a bio-inspired tracking method to track small targets using aerial videos. The tracker uses three kinds of saliency maps: appearance, location, and motion. Our approach achieves the best overall performance, including being the only method capable of handling long-term occlusions. Second, we evaluate the classification accuracy of a multi-scale fully convolutional network to label individual points in LiDAR data. Our method uses only the 3D coordinates and corresponding low-dimensional spectral features for each point. Evaluated using the ISPRS 3D Semantic Labeling Contest, our method scored second place with an overall accuracy of 81.6%.
Finally, we validate the prediction capability of our neighborhood-aware network to model the bare-earth surface of LiDAR and stereo photogrammetry point clouds. The network bypasses traditionally used ground classifications and seamlessly integrates neighborhood features with point-wise and global features to predict a per-point Digital Terrain Model (DTM). We compare our results with two widely used software packages for DTM extraction, ENVI and LAStools. Together, these efforts have the potential to alleviate the manual burden associated with some of the most challenging and time-consuming geospatial processing tasks, with implications for improving our response to issues of global security, emergency management, and disaster response.