5,705 research outputs found
Staple: Complementary Learners for Real-Time Tracking
Correlation Filter-based trackers have recently achieved excellent
performance, showing great robustness to challenging situations exhibiting
motion blur and illumination changes. However, since the model that they learn
depends strongly on the spatial layout of the tracked object, they are
notoriously sensitive to deformation. Models based on colour statistics have
complementary traits: they cope well with variation in shape, but suffer when
illumination is not consistent throughout a sequence. Moreover, colour
distributions alone can be insufficiently discriminative. In this paper, we
show that a simple tracker combining complementary cues in a ridge regression
framework can operate faster than 80 FPS and outperform not only all entries in
the popular VOT14 competition, but also recent and far more sophisticated
trackers according to multiple benchmarks.Comment: To appear in CVPR 201
A Universal Update-pacing Framework For Visual Tracking
This paper proposes a novel framework to alleviate the model drift problem in
visual tracking, which is based on paced updates and trajectory selection.
Given a base tracker, an ensemble of trackers is generated, in which each
tracker's update behavior will be paced and then traces the target object
forward and backward to generate a pair of trajectories in an interval. Then,
we implicitly perform self-examination based on trajectory pair of each tracker
and select the most robust tracker. The proposed framework can effectively
leverage temporal context of sequential frames and avoid to learn corrupted
information. Extensive experiments on the standard benchmark suggest that the
proposed framework achieves superior performance against state-of-the-art
trackers.Comment: Submitted to ICIP 201
Online Object Tracking with Proposal Selection
Tracking-by-detection approaches are some of the most successful object
trackers in recent years. Their success is largely determined by the detector
model they learn initially and then update over time. However, under
challenging conditions where an object can undergo transformations, e.g.,
severe rotation, these methods are found to be lacking. In this paper, we
address this problem by formulating it as a proposal selection task and making
two contributions. The first one is introducing novel proposals estimated from
the geometric transformations undergone by the object, and building a rich
candidate set for predicting the object location. The second one is devising a
novel selection strategy using multiple cues, i.e., detection score and
edgeness score computed from state-of-the-art object edges and motion
boundaries. We extensively evaluate our approach on the visual object tracking
2014 challenge and online tracking benchmark datasets, and show the best
performance.Comment: ICCV 201
Anticipatory synergy adjustments reflect individual performance of feedforward force control
We grasp and dexterously manipulate an object through multi-1 digit synergy. In the framework of the uncontrolled manifold (UCM) hypothesis, multi-digit synergy is defined as the coordinated control mechanism of fingers to stabilize variable important for task success, e.g., total force. Previous studies reported anticipatory synergy adjustments (ASAs) that correspond to a drop of the synergy index before a quick change of the total force. The present study compared ASA’s properties with individual performances of feedforward force control to investigate a relationship of those. Subjects performed a total finger force production task that consisted of a phase in which subjects tracked target line with visual information and a phase in which subjects produced total force pulse without visual information. We quantified their multi-digit synergy through UCM analysis and observed significant ASAs before producing total force pulse. The time of the ASA initiation and the magnitude of the drop of the synergy index were significantly correlated with the error of force pulse, but not with the tracking error. Almost all subjects showed a significant increase of the variance that affected the total force. Our study directly showed that ASA reflects the individual performance of feedforward force control independently of target-tracking performance and suggests that the multi-digit synergy was weakened to adjust the multi-digit movements based on a prediction error so as to reduce the future error
Understanding and Diagnosing Visual Tracking Systems
Several benchmark datasets for visual tracking research have been proposed in
recent years. Despite their usefulness, whether they are sufficient for
understanding and diagnosing the strengths and weaknesses of different trackers
remains questionable. To address this issue, we propose a framework by breaking
a tracker down into five constituent parts, namely, motion model, feature
extractor, observation model, model updater, and ensemble post-processor. We
then conduct ablative experiments on each component to study how it affects the
overall result. Surprisingly, our findings are discrepant with some common
beliefs in the visual tracking research community. We find that the feature
extractor plays the most important role in a tracker. On the other hand,
although the observation model is the focus of many studies, we find that it
often brings no significant improvement. Moreover, the motion model and model
updater contain many details that could affect the result. Also, the ensemble
post-processor can improve the result substantially when the constituent
trackers have high diversity. Based on our findings, we put together some very
elementary building blocks to give a basic tracker which is competitive in
performance to the state-of-the-art trackers. We believe our framework can
provide a solid baseline when conducting controlled experiments for visual
tracking research
Long-Term Visual Object Tracking Benchmark
We propose a new long video dataset (called Track Long and Prosper - TLP) and
benchmark for single object tracking. The dataset consists of 50 HD videos from
real world scenarios, encompassing a duration of over 400 minutes (676K
frames), making it more than 20 folds larger in average duration per sequence
and more than 8 folds larger in terms of total covered duration, as compared to
existing generic datasets for visual tracking. The proposed dataset paves a way
to suitably assess long term tracking performance and train better deep
learning architectures (avoiding/reducing augmentation, which may not reflect
real world behaviour). We benchmark the dataset on 17 state of the art trackers
and rank them according to tracking accuracy and run time speeds. We further
present thorough qualitative and quantitative evaluation highlighting the
importance of long term aspect of tracking. Our most interesting observations
are (a) existing short sequence benchmarks fail to bring out the inherent
differences in tracking algorithms which widen up while tracking on long
sequences and (b) the accuracy of trackers abruptly drops on challenging long
sequences, suggesting the potential need of research efforts in the direction
of long-term tracking.Comment: ACCV 2018 (Oral
- …