Visual Object Tracking: The Initialisation Problem
Model initialisation is an important component of object tracking. Tracking
algorithms are generally provided with the first frame of a sequence and a
bounding box (BB) indicating the location of the object. This BB may contain a
large number of background pixels in addition to the object and can lead to
parts-based tracking algorithms initialising their object models in background
regions of the BB. In this paper, we tackle this as a missing labels problem,
marking pixels sufficiently far from the BB as belonging to the background and
learning the labels of the unknown pixels. Three techniques, One-Class SVM
(OC-SVM), Sampled-Based Background Model (SBBM) (a novel background model based
on pixel samples), and Learning Based Digital Matting (LBDM), are adapted to
the problem. These are evaluated with leave-one-video-out cross-validation on
the VOT2016 tracking benchmark. Our evaluation shows both OC-SVMs and SBBM are
capable of providing a good level of segmentation accuracy but are too
parameter-dependent to be used in real-world scenarios. We show that LBDM
achieves significantly increased performance with parameters selected by
cross-validation, and that it is robust to parameter variation.
Comment: 15th Conference on Computer and Robot Vision (CRV 2018). Source code
available at https://github.com/georgedeath/initialisation-proble
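The labelling step described in the abstract (pixels far enough from the BB are background, everything else is unknown) can be illustrated with a minimal sketch. This is not the authors' code: the function name `initial_labels`, the `margin` parameter, and the label values are assumptions made purely for illustration, and "sufficiently far" is modelled here as a fixed pixel margin around the BB.

```python
from typing import List, Tuple

# Hypothetical label values chosen for this sketch only.
BACKGROUND = 0
UNKNOWN = -1


def initial_labels(height: int, width: int,
                   bb: Tuple[int, int, int, int],
                   margin: int) -> List[List[int]]:
    """Return a per-pixel label map for the first frame.

    bb is (x, y, w, h) in pixels. Pixels outside the BB expanded by
    `margin` pixels on every side are marked BACKGROUND; all pixels
    inside the expanded box are left UNKNOWN for a later learning step.
    """
    x, y, w, h = bb
    x0, y0 = x - margin, y - margin          # inclusive top-left corner
    x1, y1 = x + w + margin, y + h + margin  # exclusive bottom-right corner
    labels = []
    for r in range(height):
        row = []
        for c in range(width):
            inside = (x0 <= c < x1) and (y0 <= r < y1)
            row.append(UNKNOWN if inside else BACKGROUND)
        labels.append(row)
    return labels


labels = initial_labels(6, 8, bb=(3, 2, 2, 2), margin=1)
unknown_count = sum(row.count(UNKNOWN) for row in labels)
```

Any of the three techniques named in the abstract would then be responsible for resolving the `UNKNOWN` pixels; that step is not shown here.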
LiveCap: Real-time Human Performance Capture from Monocular Video
We present the first real-time human performance capture approach that
reconstructs dense, space-time coherent deforming geometry of entire humans in
general everyday clothing from just a single RGB video. We propose a novel
two-stage analysis-by-synthesis optimization whose formulation and
implementation are designed for high performance. In the first stage, a skinned
template model is jointly fitted to background-subtracted input video, 2D and
3D skeleton joint positions found using a deep neural network, and a set of
sparse facial landmark detections. In the second stage, dense non-rigid 3D
deformations of skin and even loose apparel are captured based on a novel
real-time capable algorithm for non-rigid tracking using dense photometric and
silhouette constraints. Our novel energy formulation leverages automatically
identified material regions on the template to model the differing non-rigid
deformation behavior of skin and apparel. The two resulting per-frame non-linear
optimization problems are solved with specially tailored
data-parallel Gauss-Newton solvers. In order to achieve real-time performance
of over 25Hz, we design a pipelined parallel architecture using the CPU and two
commodity GPUs. Our method is the first real-time monocular approach for
full-body performance capture. Our method yields accuracy comparable to
off-line performance capture techniques, while being orders of magnitude
faster.
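For readers unfamiliar with Gauss-Newton, the following is a minimal, sequential sketch of the iteration on a toy two-parameter curve fit. It is not the data-parallel GPU solver described in the abstract: the model `a * exp(b * x)`, the function names, and the step-halving safeguard are all assumptions chosen to keep the example small and self-contained.

```python
import math


def residuals(params, xs, ys):
    """Residuals r_i = y_i - a * exp(b * x_i) for the toy model."""
    a, b = params
    return [y - a * math.exp(b * x) for x, y in zip(xs, ys)]


def cost(params, xs, ys):
    """Sum of squared residuals."""
    return sum(r * r for r in residuals(params, xs, ys))


def gauss_newton(xs, ys, params, iters=100, tol=1e-12):
    """Gauss-Newton with simple step halving on a 2-parameter problem."""
    a, b = params
    for _ in range(iters):
        r = residuals((a, b), xs, ys)
        # Jacobian columns of the model with respect to a and b.
        ja = [math.exp(b * x) for x in xs]
        jb = [a * x * math.exp(b * x) for x in xs]
        # Normal equations (J^T J) d = J^T r, written out for the 2x2 case.
        saa = sum(u * u for u in ja)
        sab = sum(u * v for u, v in zip(ja, jb))
        sbb = sum(v * v for v in jb)
        ga = sum(u * v for u, v in zip(ja, r))
        gb = sum(u * v for u, v in zip(jb, r))
        det = saa * sbb - sab * sab
        if abs(det) < 1e-300:
            break  # degenerate system; give up in this sketch
        da = (sbb * ga - sab * gb) / det
        db = (saa * gb - sab * ga) / det
        # Halve the step while it would increase the cost (a safeguard
        # added for this sketch, not part of plain Gauss-Newton).
        current = sum(v * v for v in r)
        step = 1.0
        while step > 1e-8 and cost((a + step * da, b + step * db), xs, ys) > current:
            step *= 0.5
        a += step * da
        b += step * db
        if math.hypot(step * da, step * db) < tol:
            break
    return a, b


xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]  # noise-free synthetic data
a_hat, b_hat = gauss_newton(xs, ys, params=(1.5, 0.4))
```

Each iteration linearizes the residuals around the current estimate and solves a small linear least-squares problem; in a real-time system of the kind described above, the corresponding linear systems are far larger and would be assembled and solved in parallel.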
Spatial Pyramid Context-Aware Moving Object Detection and Tracking for Full Motion Video and Wide Aerial Motion Imagery
A robust and fast automatic moving object detection and tracking system is
essential to characterize target objects and extract spatial and temporal
information for applications including video surveillance systems,
urban traffic monitoring and navigation, and robotics. In this dissertation, I
present a collaborative Spatial Pyramid Context-aware moving object detection
and Tracking (SPCT) system. The proposed visual tracker is composed of one master
tracker that usually relies on visual object features and two auxiliary
trackers based on object temporal motion information that are called
dynamically to assist the master tracker. SPCT utilizes image spatial context at
different levels to make the video tracking system resistant to occlusion and
background noise and to improve target localization accuracy and robustness. We
chose a pre-selected set of seven-channel complementary features, including RGB color,
intensity and a spatial pyramid of HoG, to encode object color, shape and spatial
layout information. We exploit the integral histogram as a building block to meet the
demands of real-time performance. A novel fast algorithm is presented to
accurately evaluate spatially weighted local histograms in constant time
complexity using an extension of the integral histogram method. Different
techniques are explored to efficiently compute the integral histogram on GPU
architectures, and these are applied to fast spatio-temporal median computations and 3D
face reconstruction texturing. We propose a multi-component framework based on
semantic fusion of motion information with a projected building footprint map to
significantly reduce the false alarm rate in urban scenes with many tall
structures. Experiments on the extensive VOTC2016 benchmark dataset and aerial
video confirm that combining complementary tracking cues in an intelligent
fusion framework enables persistent tracking for Full Motion Video and Wide
Aerial Motion Imagery.
Comment: PhD Dissertation (162 pages)
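The integral histogram that the abstract uses as a building block can be sketched as follows. This is a plain, unweighted, CPU-only illustration, not the dissertation's spatially weighted extension or its GPU implementation; the function names and the assumption that pixels are already quantized to bin indices are introduced here for illustration only.

```python
from typing import List


def integral_histogram(bins_img: List[List[int]], n_bins: int):
    """Build ih where ih[r][c][b] counts bin b in rows [0, r) and cols [0, c)."""
    h, w = len(bins_img), len(bins_img[0])
    ih = [[[0] * n_bins for _ in range(w + 1)] for _ in range(h + 1)]
    for r in range(h):
        for c in range(w):
            cell = ih[r + 1][c + 1]
            up, left, diag = ih[r][c + 1], ih[r + 1][c], ih[r][c]
            for b in range(n_bins):
                # Inclusion-exclusion over the three already-filled neighbours.
                cell[b] = up[b] + left[b] - diag[b]
            cell[bins_img[r][c]] += 1  # add the current pixel's own bin
    return ih


def region_histogram(ih, top: int, left: int, bottom: int, right: int):
    """Histogram of rows [top, bottom) and cols [left, right).

    Uses four lookups per bin, so the cost does not depend on region size.
    """
    a, b = ih[bottom][right], ih[top][right]
    c, d = ih[bottom][left], ih[top][left]
    return [a[k] - b[k] - c[k] + d[k] for k in range(len(a))]


# Tiny 3x3 image whose pixels are already quantized to bins 0..2.
img = [[0, 1, 1],
       [2, 0, 1],
       [2, 2, 0]]
ih = integral_histogram(img, n_bins=3)
hist = region_histogram(ih, top=1, left=0, bottom=3, right=2)  # [1, 0, 3]
```

Once the table is built, the histogram of any axis-aligned rectangle costs a fixed number of operations per bin, which is the property that makes this structure attractive for real-time tracking.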