Compressed Video Action Recognition
Training robust deep video representations has proven to be much more
challenging than learning deep image representations. This is in part due to
the enormous size of raw video streams and the high temporal redundancy; the
true and interesting signal is often drowned in too much irrelevant data.
Motivated by the fact that the superfluous information can be reduced by up to two
orders of magnitude by video compression (using H.264, HEVC, etc.), we propose
to train a deep network directly on the compressed video.
This representation has a higher information density, and we found the
training to be easier. In addition, the signals in a compressed video provide
free, albeit noisy, motion information. We propose novel techniques to use them
effectively. Our approach is about 4.6 times faster than Res3D and 2.7 times
faster than ResNet-152. On the task of action recognition, our approach
outperforms all the other methods on the UCF-101, HMDB-51, and Charades
datasets.
Comment: CVPR 2018 (Selected for spotlight presentation)
Long-term Tracking in the Wild: A Benchmark
We introduce the OxUvA dataset and benchmark for evaluating single-object
tracking algorithms. Benchmarks have enabled great strides in the field of
object tracking by defining standardized evaluations on large sets of diverse
videos. However, these works have focused exclusively on sequences that are
just tens of seconds in length and in which the target is always visible.
Consequently, most researchers have designed methods tailored to this
"short-term" scenario, which is poorly representative of practitioners' needs.
Aiming to address this disparity, we compile a long-term, large-scale tracking
dataset of sequences with average length greater than two minutes and with
frequent target object disappearance. The OxUvA dataset is much larger than the
object tracking datasets of recent years: it comprises 366 sequences spanning
14 hours of video. We assess the performance of several algorithms, considering
both the ability to locate the target and to determine whether it is present or
absent. Our goal is to offer the community a large and diverse benchmark to
enable the design and evaluation of tracking methods ready to be used "in the
wild". The project website is http://oxuva.net
Comment: To appear at ECCV 201