Instance Flow Based Online Multiple Object Tracking
We present a method to perform online Multiple Object Tracking (MOT) of known
object categories in monocular video data. Current Tracking-by-Detection MOT
approaches build on top of 2D bounding box detections. In contrast, we exploit
state-of-the-art instance aware semantic segmentation techniques to compute 2D
shape representations of target objects in each frame. We predict position and
shape of segmented instances in subsequent frames by exploiting optical flow
cues. We define an affinity matrix between instances of subsequent frames which
reflects locality and visual similarity. The instance association is solved by
applying the Hungarian method. We evaluate different configurations of our
algorithm using the MOT 2D 2015 train dataset. The evaluation shows that our
tracking approach is able to track objects with high relative motions. In
addition, we provide results of our approach on the MOT 2D 2015 test set for
comparison with previous works. We achieve a MOTA score of 32.1
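
The association step described in this abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not in the abstract: masks are sets of pixel coordinates, each instance is moved by a single flow vector, affinity is mask IoU only (the visual-similarity term is left out), and an exhaustive search over one-to-one assignments stands in for the Hungarian method, which finds the same optimum efficiently. All function names and the `min_affinity` threshold are hypothetical.

```python
from itertools import permutations


def mask_iou(a, b):
    """Intersection over union of two masks given as sets of (row, col) pixels."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def shift_mask(mask, flow):
    """Predict a mask in the next frame by moving every pixel by one (dy, dx) vector.

    Assumption: a single flow vector per instance, for illustration only.
    """
    dy, dx = flow
    return {(r + dy, c + dx) for r, c in mask}


def associate(predicted, detected, min_affinity=0.1):
    """Return (predicted_index, detected_index) pairs that maximize total affinity.

    Exhaustive search is used here in place of the Hungarian method; it is
    only practical for a handful of instances.
    """
    affinity = [[mask_iou(p, d) for d in detected] for p in predicted]
    n, m = len(predicted), len(detected)
    if n <= m:
        candidates = (list(enumerate(perm)) for perm in permutations(range(m), n))
    else:
        candidates = ([(i, j) for j, i in enumerate(perm)]
                      for perm in permutations(range(n), m))
    best_pairs, best_score = [], -1.0
    for pairs in candidates:
        score = sum(affinity[i][j] for i, j in pairs)
        if score > best_score:
            best_pairs, best_score = pairs, score
    # Drop matches whose affinity is too low to be a plausible association.
    return [(i, j) for i, j in best_pairs if affinity[i][j] >= min_affinity]


track_a = {(0, 0), (0, 1), (1, 0), (1, 1)}
track_b = {(5, 5), (5, 6), (6, 5), (6, 6)}
predicted = [shift_mask(track_a, (0, 2)), shift_mask(track_b, (1, 0))]
detected = [
    {(6, 5), (6, 6), (7, 5), (7, 6)},
    {(0, 2), (0, 3), (1, 2), (1, 3)},
    {(9, 9)},
]
matches = associate(predicted, detected)
```

Detections left unmatched (here the single-pixel mask) would start new tracks in a full tracker; that bookkeeping is omitted.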
Siamese Instance Search for Tracking
In this paper we present a tracker, which is radically different from
state-of-the-art trackers: we apply no model updating, no occlusion detection,
no combination of trackers, no geometric matching, and still deliver
state-of-the-art tracking performance, as demonstrated on the popular online
tracking benchmark (OTB) and six very challenging YouTube videos. The presented
tracker simply matches the initial patch of the target in the first frame with
candidates in a new frame and returns the most similar patch by a learned
matching function. The strength of the matching function comes from being
extensively trained generically, i.e., without any data of the target, using a
Siamese deep neural network, which we design for tracking. Once learned, the
matching function is used as is, without any adapting, to track previously
unseen targets. It turns out that the learned matching function is so powerful
that a simple tracker built upon it, coined Siamese INstance search Tracker,
SINT, which only uses the original observation of the target from the first
frame, suffices to reach state-of-the-art performance. Further, we show the
proposed tracker even allows for target re-identification after the target was
absent for a complete video shot.
Comment: This paper is accepted to the IEEE Conference on Computer Vision and
Pattern Recognition, 201
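
The matching loop this abstract describes (compare the first-frame patch with the candidates in every new frame, never update the reference) might look like the sketch below. It assumes the candidates have already been embedded by some learned network; cosine similarity is a stand-in for the learned matching function, and the names `cosine_similarity` and `track` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors (stand-in matching function)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def track(first_frame_embedding, frames):
    """Pick, in every frame, the candidate box most similar to the first-frame target.

    `frames` is a list of non-empty candidate lists; each candidate is a
    (box, embedding) pair. The reference embedding is never updated.
    """
    boxes = []
    for candidates in frames:
        best_box, _ = max(
            candidates,
            key=lambda c: cosine_similarity(first_frame_embedding, c[1]),
        )
        boxes.append(best_box)
    return boxes


query = [1.0, 0.0]
frames = [
    [((0, 0, 10, 10), [0.9, 0.1]), ((5, 5, 10, 10), [0.0, 1.0])],
    [((1, 1, 10, 10), [0.2, 0.9]), ((6, 6, 10, 10), [1.0, 0.05])],
]
trajectory = track(query, frames)
```

Because the reference is fixed, a target that disappears and returns is scored against the same embedding as before, which is the property the abstract relies on for re-identification.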
Online Domain Adaptation for Multi-Object Tracking
Automatically detecting, labeling, and tracking objects in videos depends
first and foremost on accurate category-level object detectors. These might,
however, not always be available in practice, as acquiring high-quality large
scale labeled training datasets is either too costly or impractical for all
possible real-world application scenarios. A scalable solution consists in
re-using object detectors pre-trained on generic datasets. This work is the
first to investigate the problem of on-line domain adaptation of object
detectors for causal multi-object tracking (MOT). We propose to alleviate the
dataset bias by adapting detectors from category to instances, and back: (i) we
jointly learn all target models by adapting them from the pre-trained one, and
(ii) we also adapt the pre-trained model on-line. We introduce an on-line
multi-task learning algorithm to efficiently share parameters and reduce drift,
while gradually improving recall. Our approach is applicable to any linear
object detector, and we evaluate both cheap "mini-Fisher Vectors" and expensive
"off-the-shelf" ConvNet features. We quantitatively measure the benefit of our
domain adaptation strategy on the KITTI tracking benchmark and on a new dataset
(PASCAL-to-KITTI) we introduce to study the domain mismatch problem in MOT.
Comment: To appear at BMVC 201
Efficient Asymmetric Co-Tracking using Uncertainty Sampling
Adaptive tracking-by-detection approaches are popular for tracking arbitrary
objects. They treat the tracking problem as a classification task and use
online learning techniques to update the object model. However, these
approaches depend heavily on the efficiency and effectiveness of their
detectors. Evaluating a massive number of samples for each frame (e.g.,
obtained by a sliding window) forces the detector to trade accuracy for
speed. Furthermore, misclassification of borderline samples in the
detector introduces accumulating errors in tracking. In this study, we propose a
co-tracking approach based on the efficient cooperation of two detectors: a rapid
adaptive exemplar-based detector and another more sophisticated but slower
detector with a long-term memory. The sampling, labeling, and co-learning of the
detectors are conducted by an uncertainty sampling unit, which improves the
speed and accuracy of the system. We also introduce a budgeting mechanism which
prevents the unbounded growth in the number of examples in the first detector
to maintain its rapid response. Experiments demonstrate the efficiency and
effectiveness of the proposed tracker against its baselines and its superior
performance against state-of-the-art trackers on various benchmark videos.
Comment: Submitted to IEEE ICSIPA'201
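
A rough sketch of the uncertainty-sampling idea in this abstract follows. It is not the authors' implementation: the signed-score convention, the `margin` threshold, the callable `slow_detector`, and the drop-oldest budget policy are all assumptions made for illustration.

```python
def label_samples(samples, fast_scores, slow_detector, margin=0.2):
    """Label each sample, querying the slow detector only for borderline scores.

    `fast_scores` are signed outputs of the fast detector (positive = target).
    Scores with absolute value below `margin` count as uncertain.
    """
    labels = []
    for sample, score in zip(samples, fast_scores):
        if abs(score) < margin:
            labels.append(slow_detector(sample))  # expensive, used sparingly
        else:
            labels.append(1 if score > 0 else -1)
    return labels


def enforce_budget(exemplars, budget):
    """Keep at most `budget` exemplars, dropping the oldest first (assumed policy)."""
    return exemplars[-budget:] if budget > 0 else []


slow_calls = []


def slow_detector(sample):
    """Toy stand-in for the slower detector: the sign of the sample value."""
    slow_calls.append(sample)
    return 1 if sample > 0 else -1


samples = [3, 4, -2, -7]
fast_scores = [0.9, -0.05, -0.8, 0.1]
labels = label_samples(samples, fast_scores, slow_detector)
kept = enforce_budget([1, 2, 3, 4, 5], budget=3)
```

In this toy run only the two borderline samples reach the slow detector, and its answers override the fast detector's weak guesses.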
Deep Network Flow for Multi-Object Tracking
Data association problems are an important component of many computer vision
applications, with multi-object tracking being one of the most prominent
examples. A typical approach to data association involves finding a graph
matching or network flow that minimizes a sum of pairwise association costs,
which are often either hand-crafted or learned as linear functions of fixed
features. In this work, we demonstrate that it is possible to learn features
for network-flow-based data association via backpropagation, by expressing the
optimum of a smoothed network flow problem as a differentiable function of the
pairwise association costs. We apply this approach to multi-object tracking
with a network flow formulation. Our experiments demonstrate that we are able
to successfully learn all cost functions for the association problem in an
end-to-end fashion, which outperform hand-crafted costs in all settings. The
integration and combination of various sources of inputs becomes easy and the
cost functions can be learned entirely from data, alleviating tedious
hand-designing of costs.
Comment: Accepted to CVPR 201