Fusion of Head and Full-Body Detectors for Multi-Object Tracking
In order to track all persons in a scene, the tracking-by-detection paradigm
has proven to be a very effective approach. Yet, relying solely on a single
detector is also a major limitation, as useful image information might be
ignored. Consequently, this work demonstrates how to fuse two detectors into a
tracking system. To obtain the trajectories, we propose to formulate tracking
as a weighted graph labeling problem, resulting in a binary quadratic program.
As such problems are NP-hard, the solution can only be approximated. Based on
the Frank-Wolfe algorithm, we present a new solver that is crucial to handle
such difficult problems. Evaluation on pedestrian tracking is provided for
multiple scenarios, showing superior results over single detector tracking and
standard QP-solvers. Finally, our tracker ranks 2nd on the MOT16 benchmark and
1st on the new MOT17 benchmark, outperforming over 90 trackers.
Comment: 10 pages, 4 figures; Winner of the MOT17 challenge; CVPRW 201
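To make the optimization idea in this abstract concrete, here is a minimal sketch of a plain Frank-Wolfe iteration on the continuous relaxation of a binary quadratic program. It is not the solver proposed in the paper. The function names (`frank_wolfe_box`, `round_solution`), the box relaxation to [0, 1]^n, the 2/(k+2) step size, and the threshold rounding are all assumptions chosen for illustration.

```python
def frank_wolfe_box(Q, c, iters=200):
    """Approximately minimize x^T Q x + c^T x over the box [0, 1]^n.

    Illustrative relaxation of a binary QP (x in {0, 1}^n); hypothetical
    helper, not the solver described in the abstract.
    """
    n = len(c)
    x = [0.5] * n  # start at the centre of the box
    for k in range(iters):
        # Gradient of x^T Q x + c^T x is (Q + Q^T) x + c.
        grad = [
            c[i] + sum((Q[i][j] + Q[j][i]) * x[j] for j in range(n))
            for i in range(n)
        ]
        # Linear minimization oracle over the box: pick the vertex that
        # minimizes <grad, s>, i.e. s_i = 1 where the gradient is negative.
        s = [1.0 if g < 0 else 0.0 for g in grad]
        gamma = 2.0 / (k + 2.0)  # standard diminishing step size
        x = [(1.0 - gamma) * xi + gamma * si for xi, si in zip(x, s)]
    return x


def round_solution(x, threshold=0.5):
    """Round a relaxed solution back to binary labels (assumed heuristic)."""
    return [1 if xi > threshold else 0 for xi in x]


# Toy two-variable instance (made up for this sketch).
Q = [[1.0, 0.0], [0.0, 1.0]]
c = [-3.0, 1.0]
labels = round_solution(frank_wolfe_box(Q, c))
print(labels)
```

Each iteration only needs a gradient and a linear minimization over the feasible set, which is why Frank-Wolfe-style methods are attractive for large relaxed labeling problems. For non-convex objectives the iteration is only guaranteed to approach a stationary point, so the rounded labels are an approximation.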
Track, then Decide: Category-Agnostic Vision-based Multi-Object Tracking
The most common paradigm for vision-based multi-object tracking is
tracking-by-detection, due to the availability of reliable detectors for
several important object categories such as cars and pedestrians. However,
future mobile systems will need a capability to cope with rich human-made
environments, in which obtaining detectors for every possible object category
would be infeasible. In this paper, we propose a model-free multi-object
tracking approach that uses a category-agnostic image segmentation method to
track objects. We present an efficient segmentation mask-based tracker which
associates pixel-precise masks reported by the segmentation. Our approach can
utilize semantic information whenever it is available for classifying objects
at the track level, while retaining the capability to track generic unknown
objects in the absence of such information. We demonstrate experimentally that
our approach achieves performance comparable to state-of-the-art
tracking-by-detection methods for popular object categories such as cars and
pedestrians. Additionally, we show that the proposed method can discover and
robustly track a large variety of other objects.
Comment: ICRA'18 submissio
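As a rough illustration of associating pixel-precise masks across frames, the sketch below greedily matches existing tracks to new segmentation masks by mask intersection-over-union. The abstract does not state the association criterion, so the IoU score, the greedy matching, the `min_iou` threshold, and the names `mask_iou` and `associate` are all assumptions.

```python
def mask_iou(a, b):
    """Intersection-over-union of two masks given as sets of (row, col) pixels."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def associate(tracks, masks, min_iou=0.3):
    """Greedily match tracks to new masks by descending IoU.

    tracks: dict mapping track id -> last known mask (set of pixels)
    masks:  list of masks from the current frame
    Returns (matches, unmatched), where matches maps track id -> mask index
    and unmatched lists mask indices that could start new tracks.
    """
    candidates = [
        (mask_iou(track_mask, mask), track_id, j)
        for track_id, track_mask in tracks.items()
        for j, mask in enumerate(masks)
    ]
    candidates.sort(key=lambda item: item[0], reverse=True)

    matches, used = {}, set()
    for score, track_id, j in candidates:
        if score < min_iou:
            break  # remaining candidates overlap too little
        if track_id in matches or j in used:
            continue  # each track and each mask is matched at most once
        matches[track_id] = j
        used.add(j)

    unmatched = [j for j in range(len(masks)) if j not in used]
    return matches, unmatched


# Toy example with hand-made masks (hypothetical data).
tracks = {1: {(0, 0), (0, 1)}, 2: {(5, 5), (5, 6)}}
masks = [{(5, 5), (5, 6), (5, 7)}, {(0, 0), (0, 1)}, {(9, 9)}]
matches, unmatched = associate(tracks, masks)
print(matches, unmatched)
```

Because the matching uses only mask overlap, it needs no object category. A class label, when one is available, could be attached to a track afterwards.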
LMGP: Lifted Multicut Meets Geometry Projections for Multi-Camera Multi-Object Tracking
Multi-Camera Multi-Object Tracking is currently drawing attention in the
computer vision field due to its superior performance in real-world
applications such as video surveillance in crowded scenes or in wide spaces. In
this work, we propose a mathematically elegant multi-camera multiple object
tracking approach based on a spatial-temporal lifted multicut formulation. Our
model utilizes state-of-the-art tracklets produced by single-camera trackers as
proposals. As these tracklets may contain ID-Switch errors, we refine them
through a novel pre-clustering obtained from 3D geometry projections. As a
result, we derive a better tracking graph without ID switches and more precise
affinity costs for the data association phase. Tracklets are then matched to
multi-camera trajectories by solving a global lifted multicut formulation that
incorporates short and long-range temporal interactions on tracklets located in
the same camera as well as inter-camera ones. Experimental results on the
WildTrack dataset yield near-perfect performance, outperforming
state-of-the-art trackers on Campus while being on par on the PETS-09 dataset.
Comment: Official version for CVPR 202
Handling Heavy Occlusion in Dense Crowd Tracking by Focusing on the Heads
With the rapid development of deep learning, object detection and tracking
play a vital role in today's society. Being able to identify and track all the
pedestrians in the dense crowd scene with computer vision approaches is a
typical challenge in this field, also known as the Multiple Object Tracking
(MOT) challenge. Modern trackers are required to operate on more and more
complicated scenes. According to the MOT20 challenge results, pedestrian density
is 4 times higher than in the MOT17 challenge. Hence, improving the ability to detect
and track in extremely crowded scenes is the aim of this work. In light of the
occlusion issue with the human body, the heads are usually easier to identify.
In this work, we have designed a joint head and body detector in an anchor-free
style to boost the detection recall and precision performance of pedestrians in
both small and medium sizes. Notably, our model does not require statistical
head-body ratio information for common pedestrian detection during
training. Instead, the proposed model learns the ratio dynamically. To
verify the effectiveness of the proposed model, we evaluate the model with
extensive experiments on different datasets, including MOT20, CrowdHuman, and
HT21. As a result, our proposed method significantly improves both
recall and precision on small- and medium-sized pedestrians and achieves
state-of-the-art results on these challenging datasets.
Comment: Accepted at AJCAI 202