120,686 research outputs found
Long-term Tracking in the Wild: A Benchmark
We introduce the OxUvA dataset and benchmark for evaluating single-object
tracking algorithms. Benchmarks have enabled great strides in the field of
object tracking by defining standardized evaluations on large sets of diverse
videos. However, these works have focused exclusively on sequences that are
just tens of seconds in length and in which the target is always visible.
Consequently, most researchers have designed methods tailored to this
"short-term" scenario, which is poorly representative of practitioners' needs.
Aiming to address this disparity, we compile a long-term, large-scale tracking
dataset of sequences with average length greater than two minutes and with
frequent target object disappearance. The OxUvA dataset is much larger than the
object tracking datasets of recent years: it comprises 366 sequences spanning
14 hours of video. We assess the performance of several algorithms, considering
both the ability to locate the target and to determine whether it is present or
absent. Our goal is to offer the community a large and diverse benchmark to
enable the design and evaluation of tracking methods ready to be used "in the
wild". The project website is http://oxuva.netComment: To appear at ECCV 201
Joint Detection and Tracking in Videos with Identification Features
Recent works have shown that combining object detection and tracking tasks,
in the case of video data, results in higher performance for both tasks, but
they require a high frame-rate as a strict requirement for performance. This is
assumption is often violated in real-world applications, when models run on
embedded devices, often at only a few frames per second.
Videos at low frame-rate suffer from large object displacements. Here
re-identification features may support to match large-displaced object
detections, but current joint detection and re-identification formulations
degrade the detector performance, as these two are contrasting tasks. In the
real-world application having separate detector and re-id models is often not
feasible, as both the memory and runtime effectively double.
Towards robust long-term tracking applicable to reduced-computational-power
devices, we propose the first joint optimization of detection, tracking and
re-identification features for videos. Notably, our joint optimization
maintains the detector performance, a typical multi-task challenge. At
inference time, we leverage detections for tracking (tracking-by-detection)
when the objects are visible, detectable and slowly moving in the image. We
leverage instead re-identification features to match objects which disappeared
(e.g. due to occlusion) for several frames or were not tracked due to fast
motion (or low-frame-rate videos). Our proposed method reaches the
state-of-the-art on MOT, it ranks 1st in the UA-DETRAC'18 tracking challenge
among online trackers, and 3rd overall.Comment: Accepted at Image and Vision Computing Journa
RGB-D datasets using microsoft kinect or similar sensors: a survey
RGB-D data has turned out to be a very useful representation of an indoor scene for solving fundamental computer vision problems. It takes the advantages of the color image that provides appearance information of an object and also the depth image that is immune to the variations in color, illumination, rotation angle and scale. With the invention of the low-cost Microsoft Kinect sensor, which was initially used for gaming and later became a popular device for computer vision, high quality RGB-D data can be acquired easily. In recent years, more and more RGB-D image/video datasets dedicated to various applications have become available, which are of great importance to benchmark the state-of-the-art. In this paper, we systematically survey popular RGB-D datasets for different applications including object recognition, scene classification, hand gesture recognition, 3D-simultaneous localization and mapping, and pose estimation. We provide the insights into the characteristics of each important dataset, and compare the popularity and the difficulty of those datasets. Overall, the main goal of this survey is to give a comprehensive description about the available RGB-D datasets and thus to guide researchers in the selection of suitable datasets for evaluating their algorithms
- …