9 research outputs found
A Novel Method on Video Segmentation based Object Detection using Background Subtraction Technique
Video segmentation is the process of dividing a video into meaningful segments. It supports the detection of moving objects within a scene, which plays a vital role in many applications such as surveillance, safety, traffic monitoring, and object detection. In particular, background subtraction methods are widely used for moving object detection in videos. In this paper, a new method is proposed for object detection using background subtraction and thresholding-based segmentation algorithms. Experimental results showed that the proposed method achieved a higher accuracy rate than other existing techniques.
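The abstract does not give the exact algorithm, but the core idea of background subtraction with fixed thresholding can be sketched in NumPy (function name and threshold value are illustrative, not from the paper):

```python
import numpy as np

def detect_moving_objects(frame, background, threshold=30):
    """Background subtraction with fixed thresholding.

    frame, background: 2-D uint8 grayscale arrays of equal shape.
    Returns a binary mask where 1 marks pixels that differ from the
    background by more than `threshold` intensity levels.
    """
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return (diff > threshold).astype(np.uint8)

# Toy example: a static background with one bright moving "object".
background = np.zeros((6, 6), dtype=np.uint8)
frame = background.copy()
frame[2:4, 2:4] = 200          # object enters the scene
mask = detect_moving_objects(frame, background)
```

In practice the background model is usually updated over time rather than held fixed, but the differencing-and-thresholding step is the same.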
Fast and Accurate, Convolutional Neural Network Based Approach for Object Detection from UAV
Unmanned Aerial Vehicles (UAVs) have intrigued people from all walks of life because of their pervasive computing capabilities. A UAV equipped with vision techniques can be leveraged to establish autonomous navigation control for itself. Object detection from UAVs can also broaden the utilization of drones to provide ubiquitous surveillance and monitoring services for military operations, urban administration, and agriculture management. As data-driven technologies have evolved, machine learning algorithms, especially deep learning approaches, have been intensively utilized to solve traditional computer vision research problems. Modern Convolutional Neural Network (CNN) based object detectors can be divided into two major categories: one-stage and two-stage object detectors. In this study, we utilize several representative CNN-based object detectors to execute the computer vision task over the Stanford Drone Dataset (SDD). State-of-the-art performance has been achieved by using the focal-loss dense detector RetinaNet for fast and accurate object detection from UAVs.
Comment: arXiv admin note: substantial text overlap with arXiv:1803.0111
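The focal loss used by RetinaNet down-weights well-classified examples so that training focuses on hard ones. A minimal NumPy version of the binary form, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), with the paper's default alpha and gamma:

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss (RetinaNet).

    p: predicted foreground probabilities in (0, 1); y: labels in {0, 1}.
    gamma down-weights easy examples; alpha balances the two classes.
    """
    p_t = np.where(y == 1, p, 1.0 - p)
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

# An easy positive contributes almost nothing; a hard one dominates.
easy = focal_loss(np.array([0.95]), np.array([1]))
hard = focal_loss(np.array([0.3]), np.array([1]))
```

With gamma = 0 and alpha = 1 the expression reduces to standard cross-entropy, which is a quick sanity check on any implementation.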
Gaussian mixture model classifiers for detection and tracking in UAV video streams.
Master's Degree. University of KwaZulu-Natal, Durban. Manual visual surveillance systems are subject to a high degree of human error and operator fatigue. The automation of such systems often employs detectors, trackers, and classifiers as fundamental building blocks. Detection, tracking, and classification are especially useful and challenging in Unmanned Aerial Vehicle (UAV) based surveillance systems. Previous solutions have addressed these challenges via complex classification methods. This dissertation proposes less complex Gaussian Mixture Model (GMM) based classifiers that simplify the process: data is represented as a reduced set of model parameters, and classification is performed in the low-dimensionality parameter space. The specification and adoption of GMM-based classifiers on the UAV visual tracking feature space form the principal contribution of the work. The methodology can be generalised to other feature spaces.
This dissertation presents two main contributions in the form of submissions to ISI-accredited journals. In the first paper, the objectives are demonstrated with a vehicle detector incorporating a two-stage GMM classifier applied to a single feature space, namely Histogram of Oriented Gradients (HOG). The second paper demonstrates the objectives with a vehicle tracker using colour histograms (in RGB and HSV), GMM classifiers, and a Kalman filter.
The proposed works are comparable to related works, with testing performed on benchmark datasets. In the tracking domain for such platforms, tracking alone is insufficient: adaptive detection and classification can assist in search-space reduction, the building of knowledge priors, and improved target representations. Results show that the proposed approach improves performance and robustness. Findings also indicate potential further enhancements, such as a multi-mode tracker with global and local tracking based on a combination of both papers.
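The dissertation's classifiers score samples by their likelihood under per-class Gaussian mixtures. A minimal 1-D sketch of that idea, with hand-picked mixture parameters standing in for the fitted models (all numbers here are illustrative, not from the dissertation):

```python
import numpy as np

def gmm_pdf(x, weights, means, variances):
    """Density of a 1-D Gaussian mixture evaluated at points x."""
    x = np.asarray(x, dtype=float)[:, None]
    comp = np.exp(-0.5 * (x - means) ** 2 / variances) / np.sqrt(2 * np.pi * variances)
    return comp @ weights

def classify(x, class_models):
    """Assign each sample to the class whose GMM gives it the highest likelihood."""
    scores = np.stack([gmm_pdf(x, *m) for m in class_models])
    return np.argmax(scores, axis=0)

# Two illustrative classes, e.g. "vehicle" vs "background" feature responses.
# Each model is (weights, means, variances); the vehicle class is bimodal.
vehicle = (np.array([0.5, 0.5]), np.array([0.0, 4.0]), np.array([1.0, 1.0]))
backgr = (np.array([1.0]), np.array([10.0]), np.array([2.0]))
labels = classify(np.array([0.2, 10.5]), [vehicle, backgr])
```

In the real system the mixture parameters would be fitted (e.g. by EM) on the HOG or colour-histogram feature space, and classification performed in that reduced parameter space as described above.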
MOTION DETECTION IN MOVING BACKGROUND USING ORB FEATURE MATCHING AND AFFINE TRANSFORM
Visual surveillance systems have gained a lot of interest in recent years due to their importance in military and security applications. Surveillance cameras are installed in security-sensitive areas such as banks, train stations, highways, and borders. In computer vision, moving object detection and tracking are the most important preliminary steps for higher-level video analysis applications, and moving objects against a moving background are an important research area of image and video processing. Feature matching is at the base of many computer vision problems, such as object recognition or structure from motion. Here, ORB is used for feature detection and tracking, with the objective of tracking moving objects in a moving video. Oriented FAST and Rotated BRIEF (ORB) is a combination of two major techniques: Features from Accelerated Segment Test (FAST) and Binary Robust Independent Elementary Features (BRIEF). Mismatched features between two frames are rejected by the proposed method for good compensation accuracy, and residues are removed using a logical AND operation. The proposed method is validated through experiments that compare it to Scale Invariant Feature Transform (SIFT) and Speeded-Up Robust Features (SURF) based methods in both detection accuracy and efficiency.
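The affine transform that compensates camera motion can be estimated from matched feature points by least squares, and matches with a large reprojection error rejected as mismatches. A sketch under those assumptions (function names and the pixel tolerance are illustrative):

```python
import numpy as np

def estimate_affine(src, dst):
    """Least-squares 2-D affine transform mapping src points to dst points.

    src, dst: (N, 2) arrays of matched feature coordinates (N >= 3).
    Returns a 2x3 matrix A such that dst ~= A @ [x, y, 1]^T.
    """
    n = src.shape[0]
    X = np.hstack([src, np.ones((n, 1))])        # homogeneous coordinates
    A, *_ = np.linalg.lstsq(X, dst, rcond=None)  # solves X @ A = dst
    return A.T

def reject_mismatches(src, dst, A, tol=2.0):
    """Keep only matches whose reprojection error under A is below tol pixels."""
    pred = np.hstack([src, np.ones((len(src), 1))]) @ A.T
    err = np.linalg.norm(pred - dst, axis=1)
    return err < tol

# Pure translation by (5, 2): recover it from four matched points.
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 3.0]])
A_true = np.array([[1.0, 0.0, 5.0], [0.0, 1.0, 2.0]])
dst = np.hstack([src, np.ones((4, 1))]) @ A_true.T
A = estimate_affine(src, dst)
```

A robust estimator such as RANSAC would normally wrap this least-squares step so that mismatches do not corrupt the estimate in the first place.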
Hybrid Video Stabilization for Mobile Vehicle Detection on SURF in Aerial Surveillance
Detection of moving vehicles in aerial video sequences is of great importance, with many promising applications in surveillance, intelligent transportation, and public services such as emergency evacuation and police security. However, vehicle detection is a challenging task due to global camera motion, the low resolution of vehicles, and the low contrast between vehicles and background. In this paper, we present a hybrid method to efficiently detect moving vehicles in aerial videos. First, local feature extraction and matching are performed to estimate the global motion; the Speeded Up Robust Features (SURF) key points were demonstrated to be well suited to the stabilization task. Then, a list of dynamic pixels is obtained and grouped into different moving vehicles by comparing optical flow normals. To enhance the precision of detection, preprocessing steps such as road extraction are applied to the surveillance system. A quantitative evaluation on real video sequences indicates that the proposed method improves the detection performance significantly.
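Once global motion is estimated, dynamic pixels are those whose flow deviates from it. A simplified NumPy sketch that approximates the global camera motion by the median flow vector (the paper's actual grouping by optical flow normals is more involved; threshold and shapes here are illustrative):

```python
import numpy as np

def moving_pixel_mask(flow, thresh=1.0):
    """Flag pixels whose optical flow deviates from the dominant (global) motion.

    flow: (H, W, 2) per-pixel displacement field.
    The global camera motion is approximated by the median flow; pixels whose
    residual flow magnitude exceeds `thresh` are marked as moving.
    """
    global_motion = np.median(flow.reshape(-1, 2), axis=0)
    residual = flow - global_motion
    return np.linalg.norm(residual, axis=2) > thresh

# Camera pans right by 3 px; one 2x2 block moves differently (a vehicle).
flow = np.tile(np.array([3.0, 0.0]), (8, 8, 1))
flow[3:5, 3:5] = [3.0, 4.0]   # extra vertical motion
mask = moving_pixel_mask(flow)
```

The median is robust here because moving vehicles occupy only a small fraction of the frame, so they do not bias the global motion estimate.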
Advanced framework for microscopic and lane‐level macroscopic traffic parameters estimation from UAV video
Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/166282/1/itr2bf00873.pd
Detection, segmentation, and tracking of moving objects in UAV videos
Automatic processing of videos coming from small UAVs offers high potential for advanced surveillance applications but is also very challenging. The challenges include camera motion, high object distance, varying object background, multiple objects near to each other, weak signal-to-noise ratio (SNR), and compression artifacts. In this paper, a video processing chain for the detection, segmentation, and tracking of multiple moving objects is presented that deals with these challenges. The foundation is the detection of local image features that are not stationary. By clustering these features and performing subsequent object segmentation, regions are generated that represent object hypotheses. Multi-object tracking is introduced using a Kalman filter that takes the camera motion into account. Split or merged object regions are handled by fusing the regions and the local features. Finally, a quantitative evaluation of object segmentation and tracking is provided.
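The Kalman filter at the heart of such trackers alternates a motion-model prediction with a measurement update. A minimal 1-D constant-velocity sketch (the paper's filter additionally compensates camera motion; matrices and noise values here are illustrative):

```python
import numpy as np

def kalman_step(x, P, z, F, H, Q, R):
    """One predict/update cycle of a linear Kalman filter."""
    # Predict: propagate state and covariance through the motion model.
    x = F @ x
    P = F @ P @ F.T + Q
    # Update: correct the prediction with measurement z.
    S = H @ P @ H.T + R                    # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)         # Kalman gain
    x = x + K @ (z - H @ x)
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P

# Constant-velocity model: state [position, velocity], measure position only.
dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])      # motion model
H = np.array([[1.0, 0.0]])                 # measurement model
Q = 0.01 * np.eye(2)                       # process noise
R = np.array([[0.5]])                      # measurement noise
x, P = np.array([0.0, 0.0]), np.eye(2)
for t in range(1, 11):                     # object moves at 2 units per frame
    x, P = kalman_step(x, P, np.array([2.0 * t]), F, H, Q, R)
```

After a few frames the filter has inferred the unobserved velocity from position measurements alone, which is what makes it useful for predicting object locations across frames.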
Efficient multi-level scene understanding in videos
Automatic video parsing is a key step towards human-level dynamic scene understanding, and a fundamental problem in computer vision. A core issue in video understanding is to infer multiple scene properties of a video in an efficient and consistent manner. This thesis addresses the problem of holistic scene understanding from monocular videos, which jointly reasons about semantic and geometric scene properties at multiple levels, including pixelwise annotation of video frames, object instance segmentation in the spatio-temporal domain, and scene-level description in terms of scene categories and layouts.
We focus on four main issues in holistic video understanding: 1) what is the representation for consistent semantic and geometric parsing of videos? 2) how do we integrate high-level reasoning (e.g., objects) with pixel-wise video parsing? 3) how can we do efficient inference for multi-level video understanding? and 4) what is the representation learning strategy for efficient/cost-aware scene parsing?
We discuss three multi-level video scene segmentation scenarios based on different aspects of scene properties and efficiency requirements. The first case addresses the problem of consistent geometric and semantic video segmentation for outdoor scenes. We propose a geometric scene layout representation, or stage scene model, to efficiently capture the dependency between the semantic and geometric labels. We build a unified conditional random field for joint modeling of the semantic class, geometric label, and stage representation, and design an alternating inference algorithm to minimize the resulting energy function.
The second case focuses on the problem of simultaneous pixel-level and object-level segmentation in videos. We propose to incorporate foreground object information into pixel labeling by jointly reasoning about the semantic labels of supervoxels, object instance tracks, and geometric relations between objects. In order to model objects, we take an exemplar approach based on a small set of object annotations to generate a set of object proposals. We then design a conditional random field framework that jointly models the supervoxel labels and object instance segments. To scale up our method, we develop an active inference strategy to improve the efficiency of multi-level video parsing, which adaptively selects an informative subset of object proposals and performs inference on the resulting compact model.
The last case explores the problem of learning a flexible representation for efficient scene labeling. We propose a dynamic hierarchical model that allows flexible trade-offs between efficiency and accuracy. Our approach incorporates the cost of feature computation and model inference, and optimizes model performance for any given test-time budget. We evaluate all our methods on several publicly available video and image semantic segmentation datasets, and demonstrate superior performance in efficiency and accuracy.
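The thesis's unified CRF with alternating inference is far beyond a short sketch, but the underlying pattern of minimizing a unary-plus-pairwise labeling energy can be illustrated with iterated conditional modes (ICM) on a 1-D chain of sites (the Potts smoothness term and all costs below are illustrative, not the thesis's model):

```python
import numpy as np

def icm(unary, smooth=1.0, iters=10):
    """Iterated conditional modes for a chain-structured labeling energy.

    unary: (N, L) costs of assigning each of L labels to each of N sites.
    Pairwise term: Potts penalty `smooth` whenever neighbours disagree.
    Greedily moves each site to its locally optimal label until convergence.
    """
    n, L = unary.shape
    labels = np.argmin(unary, axis=1)          # unary-only initialization
    for _ in range(iters):
        changed = False
        for i in range(n):
            costs = unary[i].copy()
            if i > 0:
                costs += smooth * (np.arange(L) != labels[i - 1])
            if i < n - 1:
                costs += smooth * (np.arange(L) != labels[i + 1])
            best = int(np.argmin(costs))
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels

# A noisy site (index 2) prefers label 1 on its own, but the smoothness
# term makes its neighbours pull it back to label 0.
unary = np.array([[0.0, 2.0], [0.0, 2.0], [1.0, 0.4], [0.0, 2.0], [0.0, 2.0]])
labels = icm(unary, smooth=1.0)
```

Real video-parsing CRFs use richer graphs (supervoxels, object proposals) and stronger inference (graph cuts, message passing), but the energy they minimize has this same unary-plus-pairwise shape.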
Keywords: Semantic video segmentation, Multi-level scene understanding, Efficient inference, Cost-aware scene parsing