On Pairwise Costs for Network Flow Multi-Object Tracking
Multi-object tracking has recently been approached with min-cost network
flow optimization techniques. Such methods simultaneously resolve multiple
object tracks in a video and enable modeling of dependencies among tracks.
Min-cost network flow methods also fit well within the "tracking-by-detection"
paradigm where object trajectories are obtained by connecting per-frame outputs
of an object detector. Object detectors, however, often fail due to occlusions
and clutter in the video. To cope with such situations, we propose to add
pairwise costs to the min-cost network flow framework. While integer solutions
to such a problem become NP-hard, we design a convex relaxation solution with
an efficient rounding heuristic which empirically gives certificates of small
suboptimality. We evaluate two particular types of pairwise costs and
demonstrate improvements over recent tracking methods in real-world video
sequences.
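
As a rough illustration of the tracking-by-detection flow formulation the abstract builds on, the sketch below links 1-D detections across frames with a small successive-shortest-path min-cost flow solver. It covers only the baseline network with unary and transition costs, not the pairwise costs or the convex relaxation proposed in the paper. The function name `link_detections`, the cost values, and the toy detections are all hypothetical and chosen for illustration.

```python
def link_detections(frames, det_cost=-5, entry_cost=2, exit_cost=2):
    """Link 1-D integer detections into tracks via min-cost flow.

    frames: list of lists of detection positions, one list per frame.
    All cost values are illustrative assumptions, not taken from the paper.
    """
    dets = [(t, p) for t, frame in enumerate(frames) for p in frame]
    n = 2 + 2 * len(dets)          # source, sink, one in/out node pair per detection
    SRC, SINK = 0, 1
    graph = [[] for _ in range(n)]  # edge = [to, capacity, cost, reverse index, is_forward]

    def add_edge(u, v, cost):
        graph[u].append([v, 1, cost, len(graph[v]), True])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1, False])

    for k, (t, p) in enumerate(dets):
        d_in, d_out = 2 + 2 * k, 3 + 2 * k
        add_edge(SRC, d_in, entry_cost)      # a track may start here
        add_edge(d_in, d_out, det_cost)      # negative cost rewards using a detection
        add_edge(d_out, SINK, exit_cost)     # a track may end here
        for j, (t2, p2) in enumerate(dets):
            if t2 == t + 1:                  # transition to the next frame
                add_edge(d_out, 2 + 2 * j, abs(p2 - p))

    total = 0
    while True:
        # Bellman-Ford on the residual graph (handles negative edge costs).
        dist = [float("inf")] * n
        prev = [None] * n
        dist[SRC] = 0
        for _ in range(n - 1):
            changed = False
            for u in range(n):
                if dist[u] == float("inf"):
                    continue
                for i, (v, cap, cost, _rev, _fwd) in enumerate(graph[u]):
                    if cap > 0 and dist[u] + cost < dist[v]:
                        dist[v] = dist[u] + cost
                        prev[v] = (u, i)
                        changed = True
            if not changed:
                break
        if dist[SINK] >= 0:                  # another track would not lower the cost
            break
        v = SINK
        while v != SRC:                      # push one unit of flow along the path
            u, i = prev[v]
            graph[u][i][1] -= 1
            graph[v][graph[u][i][3]][1] += 1
            v = u
        total += dist[SINK]

    tracks = []
    for v, cap, _cost, _rev, fwd in graph[SRC]:
        if fwd and cap == 0:                 # saturated source edge: a track starts here
            track, u = [], v
            while u != SINK:
                if u % 2 == 0:               # "in" node: record its detection
                    track.append(dets[(u - 2) // 2])
                u = next(e[0] for e in graph[u] if e[4] and e[1] == 0)
            tracks.append(track)
    return total, tracks


# Toy input: two frames with two detections each (positions are made up).
total_cost, tracks = link_detections([[0, 10], [1, 11]])
```

The number of tracks is not fixed in advance here: flow is added one unit at a time and stops once the cheapest remaining source-to-sink path no longer has negative cost.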
Convergence Rate of Frank-Wolfe for Non-Convex Objectives
We give a simple proof that the Frank-Wolfe algorithm obtains a stationary
point at a rate of $O(1/\sqrt{t})$ on non-convex objectives with a Lipschitz
continuous gradient. Our analysis is affine invariant and is the first, to the
best of our knowledge, giving a similar rate to what was already proven for
projected gradient methods (though on slightly different measures of
stationarity). Comment: 6 pages
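
To make the setting concrete, here is a minimal sketch of Frank-Wolfe on a non-convex problem, tracking the Frank-Wolfe gap as the measure of stationarity. The objective (an indefinite quadratic over the probability simplex), the step-size rule `min(gap / (L * ||d||^2), 1)`, and all names are assumptions made for illustration; the abstract does not specify them.

```python
import math


def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def frank_wolfe_simplex(A, x0, num_iters=200, tol=1e-10):
    """Run Frank-Wolfe on f(x) = 0.5 * x^T A x over the probability simplex.

    A is assumed symmetric, so the gradient is A x. A may be indefinite,
    which makes f non-convex.
    """
    # The Frobenius norm upper-bounds the spectral norm, so it is a valid
    # (conservative) Lipschitz constant for the gradient.
    L = math.sqrt(sum(a * a for row in A for a in row))
    x = list(x0)
    gaps, values = [], []
    for _ in range(num_iters):
        grad = matvec(A, x)
        # Linear minimization oracle over the simplex: pick the best vertex.
        i_min = min(range(len(x)), key=lambda i: grad[i])
        d = [(1.0 if i == i_min else 0.0) - xi for i, xi in enumerate(x)]
        # Frank-Wolfe gap <grad, x - s>; it is zero at a stationary point.
        gap = -sum(g * di for g, di in zip(grad, d))
        gaps.append(gap)
        values.append(0.5 * sum(xi * gi for xi, gi in zip(x, grad)))
        if gap <= tol:
            break
        # Step size minimizing the quadratic upper bound, clipped to [0, 1].
        gamma = min(gap / (L * sum(di * di for di in d)), 1.0)
        x = [xi + gamma * di for xi, di in zip(x, d)]
    return x, gaps, values


# Hypothetical symmetric, indefinite matrix and a uniform starting point.
A = [[1.0, 2.0, 0.0],
     [2.0, -1.0, 0.5],
     [0.0, 0.5, 0.3]]
x, gaps, values = frank_wolfe_simplex(A, [1.0 / 3.0] * 3)
```

With this step size the objective value does not increase from one iterate to the next, and every iterate stays in the simplex because each update is a convex combination of the current point and a vertex.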
Constrained multi-target tracking for team sports activities
In sports analysis, player tracking is essential to the extraction of statistics such as speed, distance and direction of motion. Simultaneous tracking of multiple people is still a very challenging computer vision problem to which there is no satisfactory solution. This is especially true for sports activities, for which people often wear similar uniforms, move quickly and erratically, and have close interactions with each other. In this paper, we introduce a multi-target tracking algorithm suitable for team sports activities. We extend an existing algorithm by including an automatic estimation of the occupancy of the observed field and the duration of stable periods without people entering or leaving the field. This information is included as a constraint to the existing offline tracking algorithm in order to construct more reliable trajectories. On data from two challenging sports scenarios (an indoor soccer game captured with thermal cameras and an outdoor soccer training session captured with an RGB camera), we show that the tracking performance is improved on all sequences. Compared to the original offline tracking algorithm, we obtain improvements of 3–7% in accuracy. Furthermore, the method outperforms two state-of-the-art trackers.
Unsupervised Multiple Person Tracking using AutoEncoder-Based Lifted Multicuts
Multiple Object Tracking (MOT) is a long-standing task in computer vision.
Current approaches based on the tracking-by-detection paradigm require either
some sort of domain knowledge or supervision to associate data correctly into
tracks. In this work, we present an unsupervised multiple object tracking
approach based on visual features and minimum cost lifted multicuts. Our method
is based on straightforward spatio-temporal cues that can be extracted from
neighboring frames in an image sequence without supervision. Clustering based
on these cues enables us to learn the required appearance invariances for the
tracking task at hand and train an autoencoder to generate suitable latent
representations. Thus, the resulting latent representations can serve as robust
appearance cues for tracking even over large temporal distances where no
reliable spatio-temporal features could be extracted. We show that, despite
being trained without using the provided annotations, our model provides
competitive results on the challenging MOT Benchmark for pedestrian tracking.