8 research outputs found
Object Tracking by Reconstruction with View-Specific Discriminative Correlation Filters
Standard RGB-D trackers treat the target as an inherently 2D structure, which
makes modelling appearance changes related even to simple out-of-plane rotation
highly challenging. We address this limitation by proposing a novel long-term
RGB-D tracker - Object Tracking by Reconstruction (OTR). The tracker performs
online 3D target reconstruction to facilitate robust learning of a set of
view-specific discriminative correlation filters (DCFs). The 3D reconstruction
supports two performance-enhancing features: (i) generation of accurate spatial
support for constrained DCF learning from its 2D projection and (ii) point
cloud based estimation of 3D pose change for selection and storage of
view-specific DCFs which are used to robustly localize the target after
out-of-view rotation or heavy occlusion. Extensive evaluation of OTR on the
challenging Princeton RGB-D tracking and STC Benchmarks shows it outperforms
the state of the art by a large margin.
CDTB: A Color and Depth Visual Object Tracking Dataset and Benchmark
A long-term visual object tracking performance evaluation methodology and a
benchmark are proposed. Performance measures are designed by following a
long-term tracking definition to maximize the analysis probing strength. The
new measures outperform existing ones in interpretation potential and in better
distinguishing between different tracking behaviors. We show that these
measures generalize the short-term performance measures, thus linking the two
tracking problems. Furthermore, the new measures are highly robust to temporal
annotation sparsity and allow annotation of sequences hundreds of times longer
than in the current datasets without increasing manual annotation labor. A new
challenging dataset of carefully selected sequences with many target
disappearances is proposed. A new tracking taxonomy is proposed to position
trackers on the short-term/long-term spectrum. The benchmark contains an
extensive evaluation of the largest number of long-term trackers and comparison
to state-of-the-art short-term trackers. We analyze the influence of tracking
architecture implementations on long-term performance and explore various
re-detection strategies as well as the influence of visual model update strategies
on long-term tracking drift. The methodology is integrated in the VOT toolkit
to automate experimental analysis and benchmarking and to facilitate future
development of long-term trackers.
DAL: A Deep Depth-Aware Long-term Tracker
The best RGBD trackers provide high accuracy but are slow to run. On the other hand, the best RGB trackers are fast but clearly inferior on the RGBD datasets. In this work, we propose a deep depth-aware long-term tracker that achieves state-of-the-art RGBD tracking performance and is fast to run. We reformulate the deep discriminative correlation filter (DCF) to embed the depth information into deep features. Moreover, the same depth-aware correlation filter is used for target redetection. Comprehensive evaluations show that the proposed tracker achieves state-of-the-art performance on the Princeton RGBD, STC, and the newly released CDTB benchmarks and runs at 20 fps.
NON-RIGID MULTI-BODY TRACKING IN RGBD STREAMS
To efficiently collect training data for an off-the-shelf object detector, we consider the problem of segmenting and tracking non-rigid objects from RGBD sequences by introducing the spatio-temporal matrix with very few assumptions: no prior object model and no stationary sensor. The spatio-temporal matrix is able to encode not only spatial associations between multiple objects, but also component-level spatio-temporal associations that allow the correction of falsely segmented objects in the presence of various types of interaction among multiple objects. Extensive experiments over complex human/animal body motions with occlusions and body part motions demonstrate that our approach substantially improves tracking robustness and segmentation accuracy.
Vision and Depth Based Computerized Anthropometry and Object Tracking
The thesis has two interconnected parts: Computerized Anthropometry and RGBD (RGB plus Depth) object tracking. In the first part of this thesis, we start from the mathematical representation of the human body shape model. We briefly introduce prior art, from the classic human body models to the latest deep neural network based approaches. We describe the performance metrics and popular datasets for evaluating computerized anthropometry estimation algorithms in a unified setting. We then describe our contributions to two aspects of human body anthropometry research: 1) a statistical method for estimating anthropometric measurements from scans, and 2) a deep neural network based solution for learning anthropometric measurements from binary silhouettes. We also release two body shape datasets for accommodating data-driven learning methods.
In the second part of this thesis, we explore RGBD object tracking. We start from the current state of RGBD tracking compared to RGB tracking and briefly introduce prior art, from methods based on engineered features to methods based on deep neural networks. We present three deep learning based methods that integrate deep depth features into RGBD object tracking. We also release a unified RGBD tracking benchmark for data-driven RGBD tracking algorithms. Finally, we explore RGBD tracking with deep depth features and demonstrate that depth cues significantly benefit the target model learning.