BURST: A Benchmark for Unifying Object Recognition, Segmentation and Tracking in Video
Multiple existing benchmarks involve tracking and segmenting objects in video
e.g., Video Object Segmentation (VOS) and Multi-Object Tracking and
Segmentation (MOTS), but there is little interaction between them due to the
use of disparate benchmark datasets and metrics (e.g. J&F, mAP, sMOTSA). As a
result, published works usually target a particular benchmark and are not
easily comparable to one another. We believe that the development of
generalized methods that can tackle multiple tasks requires greater cohesion
among these research sub-communities. In this paper, we aim to facilitate this
by proposing BURST, a dataset which contains thousands of diverse videos with
high-quality object masks, and an associated benchmark with six tasks involving
object tracking and segmentation in video. All tasks are evaluated using the
same data and comparable metrics, which enables researchers to consider them in
unison, and hence, more effectively pool knowledge from different methods
across different tasks. Additionally, we demonstrate several baselines for all
tasks and show that approaches for one task can be applied to another with a
quantifiable and explainable performance difference. Dataset annotations and
evaluation code are available at: https://github.com/Ali2500/BURST-benchmark
Siam R-CNN: Visual Tracking by Re-Detection
We present Siam R-CNN, a Siamese re-detection architecture which unleashes
the full power of two-stage object detection approaches for visual object
tracking. We combine this with a novel tracklet-based dynamic programming
algorithm, which takes advantage of re-detections of both the first-frame
template and previous-frame predictions, to model the full history of both the
object to be tracked and potential distractor objects. This enables our
approach to make better tracking decisions, as well as to re-detect tracked
objects after long occlusion. Finally, we propose a novel hard example mining
strategy to improve Siam R-CNN's robustness to similar-looking objects. Siam
R-CNN achieves the current best performance on ten tracking benchmarks, with
especially strong results for long-term tracking. We make our code and models
available at www.vision.rwth-aachen.de/page/siamrcnn.
Comment: CVPR 2020 camera-ready version
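The tracklet-based dynamic programming idea described above can be illustrated with a simplified sketch. This is not the authors' implementation: the function names, the scoring (detection confidence plus IoU overlap between consecutive picks), and the per-frame candidate format are all illustrative assumptions. The sketch selects one candidate re-detection per frame so that the summed score along the chosen path is maximized, which is the core of a Viterbi-style tracklet selection.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    def area(r):
        return (r[2] - r[0]) * (r[3] - r[1])

    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def select_track(frames):
    """Simplified, hypothetical sketch of Viterbi-style track selection.

    frames: list (one entry per video frame) of candidate re-detections,
            each a list of (box, score) pairs.
    Returns the index of the chosen candidate in every frame, maximizing
    the total detection score plus IoU continuity between consecutive picks.
    """
    if not frames:
        return []
    # best[i]: best cumulative value of any path ending at candidate i
    best = [score for (_box, score) in frames[0]]
    back = []  # backpointers for path recovery
    for t in range(1, len(frames)):
        cur, ptr = [], []
        for box, score in frames[t]:
            # Transition reward: spatial overlap with the previous pick.
            vals = [best[j] + iou(frames[t - 1][j][0], box)
                    for j in range(len(best))]
            j = max(range(len(vals)), key=vals.__getitem__)
            cur.append(vals[j] + score)
            ptr.append(j)
        back.append(ptr)
        best = cur
    # Backtrack from the best-scoring final candidate.
    i = max(range(len(best)), key=best.__getitem__)
    path = [i]
    for ptr in reversed(back):
        i = ptr[i]
        path.append(i)
    return path[::-1]
```

The real system additionally models distractor objects and handles gaps from long occlusions, which this per-frame sketch omits for brevity.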