Mobile Video Object Detection with Temporally-Aware Feature Maps
This paper introduces an online model for object detection in videos designed
to run in real-time on low-powered mobile and embedded devices. Our approach
combines fast single-image object detection with convolutional long short-term
memory (LSTM) layers to create an interweaved recurrent-convolutional
architecture. Additionally, we propose an efficient Bottleneck-LSTM layer that
significantly reduces computational cost compared to regular LSTMs. Our network
achieves temporal awareness by using Bottleneck-LSTMs to refine and propagate
feature maps across frames. This approach is substantially faster than existing
detection methods in video, outperforming the fastest single-frame models in
model size and computational cost while attaining accuracy comparable to much
more expensive single-frame models on the ImageNet VID 2015 dataset. Our model
reaches a real-time inference speed of up to 15 FPS on a mobile CPU.
Comment: In CVPR 201
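The idea of refining and propagating features across frames through a bottleneck can be sketched as below. This is a minimal illustration, not the paper's layer: the abstract does not give the layer's equations, so the gate layout is an assumption modelled on a standard LSTM, dense matrices stand in for convolutions, the weights are random, and the names `BottleneckLSTMCell`, `in_dim` and `state_dim` are hypothetical.

```python
import math
import random

def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    return max(0.0, z)

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

class BottleneckLSTMCell:
    """Toy recurrent cell whose gates read a narrow bottleneck vector."""

    def __init__(self, in_dim, state_dim, seed=0):
        rng = random.Random(seed)
        # Assumption: input and previous state are first fused into a
        # bottleneck of size state_dim; every gate then reads only that.
        self.W_b = rand_matrix(state_dim, in_dim + state_dim, rng)
        self.W_f = rand_matrix(state_dim, state_dim, rng)  # forget gate
        self.W_i = rand_matrix(state_dim, state_dim, rng)  # input gate
        self.W_o = rand_matrix(state_dim, state_dim, rng)  # output gate
        self.W_c = rand_matrix(state_dim, state_dim, rng)  # candidate state

    def step(self, x, h, c):
        """Consume one frame's features x and return the new (h, c)."""
        b = [relu(z) for z in matvec(self.W_b, x + h)]
        f = [sigmoid(z) for z in matvec(self.W_f, b)]
        i = [sigmoid(z) for z in matvec(self.W_i, b)]
        o = [sigmoid(z) for z in matvec(self.W_o, b)]
        g = [relu(z) for z in matvec(self.W_c, b)]
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * relu(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new

# Carry the state across three synthetic "frames".
cell = BottleneckLSTMCell(in_dim=8, state_dim=4)
h, c = [0.0] * 4, [0.0] * 4
frame_rng = random.Random(1)
for _ in range(3):
    x = [frame_rng.random() for _ in range(8)]
    h, c = cell.step(x, h, c)
```

In this sketch the four gate matrices act on a vector of size `state_dim` rather than `in_dim + state_dim`, which is where the saving would come from under these assumptions.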
Learning Adaptive Discriminative Correlation Filters via Temporal Consistency Preserving Spatial Feature Selection for Robust Visual Tracking
With efficient appearance learning models, Discriminative Correlation Filter
(DCF) has been proven to be very successful in recent video object tracking
benchmarks and competitions. However, the existing DCF paradigm suffers from
two major issues, i.e., spatial boundary effect and temporal filter
degradation. To mitigate these challenges, we propose a new DCF-based tracking
method. The key innovations of the proposed method include adaptive spatial
feature selection and temporal consistency constraints, with which the new
tracker enables joint spatial-temporal filter learning in a lower dimensional
discriminative manifold. More specifically, we apply structured spatial
sparsity constraints to multi-channel filters. Consequently, the process of
learning spatial filters can be approximated by the lasso regularisation. To
encourage temporal consistency, the filter model is restricted to lie around
its historical value and updated locally to preserve the global structure in
the manifold. Last, a unified optimisation framework is proposed to jointly
select temporal consistency preserving spatial features and learn
discriminative filters with the augmented Lagrangian method. Qualitative and
quantitative evaluations have been conducted on a number of well-known
benchmarking datasets such as OTB2013, OTB50, OTB100, Temple-Colour, UAV123 and
VOT2018. The experimental results demonstrate the superiority of the proposed
method over the state-of-the-art approaches.
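To illustrate how lasso-style regularisation selects spatial features, here is a minimal sketch of soft-thresholding, the standard proximal operator for an L1 penalty, and its group variant. It is not the paper's solver: the grouping of filter coefficients by spatial location, the threshold `lam` and the toy values are assumptions made for illustration.

```python
import math

def soft_threshold(w, lam):
    """Proximal operator of lam * |w|: shrink w toward zero by lam."""
    return math.copysign(max(abs(w) - lam, 0.0), w)

def group_soft_threshold(group, lam):
    """Shrink a whole group by its L2 norm; weak groups become all zeros."""
    norm = math.sqrt(sum(v * v for v in group))
    if norm <= lam:
        return [0.0] * len(group)
    scale = 1.0 - lam / norm
    return [scale * v for v in group]

# Hypothetical filter: one row per spatial location, one column per channel.
filters = [[0.9, -0.8], [0.05, 0.02], [-0.3, 0.4]]
selected = [group_soft_threshold(g, 0.2) for g in filters]

# Locations whose coefficients survive the shrinkage are the selected ones.
kept_locations = [idx for idx, g in enumerate(selected) if any(v != 0.0 for v in g)]
```

With these toy values the middle location has a small norm and is zeroed across all channels, which is the structured sparsity effect the penalty is meant to produce.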
Teacher-Students Knowledge Distillation for Siamese Trackers
In recent years, Siamese network based trackers have significantly advanced
the state-of-the-art in real-time tracking. However, state-of-the-art Siamese
trackers suffer from high memory cost which restricts their applicability in
mobile applications having strict constraints on memory budget. To address this
issue, we propose a novel distilled Siamese tracking framework to learn small,
fast yet accurate trackers (students), which capture critical knowledge from
large Siamese trackers (teachers) by a teacher-students knowledge distillation
model. This model is intuitively inspired by a one-teacher vs multi-students
learning mechanism, the most common teaching method in schools. In
particular, it contains a single teacher-student distillation model and a
student-student knowledge sharing mechanism. The former uses a
tracking-specific distillation strategy to transfer knowledge from the teacher
to the students. The latter is used for mutual learning between students to
enable an in-depth knowledge understanding. To the best of our knowledge, we
are the first to investigate knowledge distillation for Siamese trackers and
propose a distilled Siamese tracking framework. We demonstrate the generality
and effectiveness of our framework by conducting a theoretical analysis and
extensive empirical evaluations on several popular Siamese trackers. The
results on five tracking benchmarks clearly show that the proposed distilled
trackers achieve compression rates of up to 18× and frame-rates of
FPS with speedups of 3×, while obtaining similar or even slightly
improved tracking accuracy.
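The two ingredients named above, teacher-to-student transfer and mutual learning between students, can be sketched with the generic temperature-scaled distillation loss. This is not the paper's tracking-specific strategy, which the abstract does not detail; the logits, the `temperature` value and the function names are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Teacher-to-student term: the student matches softened teacher outputs."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return temperature ** 2 * kl_divergence(p, q)

def mutual_loss(logits_a, logits_b, temperature=2.0):
    """Student-to-student term: symmetric, so each student learns from the other."""
    return 0.5 * (distillation_loss(logits_a, logits_b, temperature)
                  + distillation_loss(logits_b, logits_a, temperature))

# Hypothetical scores for three candidate target locations.
teacher = [4.0, 1.0, 0.5]
student_a = [2.5, 1.5, 0.5]
student_b = [3.0, 0.5, 1.0]

total_a = distillation_loss(teacher, student_a) + mutual_loss(student_a, student_b)
```

Each student's total here is simply the sum of the two terms; how the terms are weighted in practice is a design choice this sketch leaves open.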
Multi-Branch Siamese Networks with Online Selection for Object Tracking
In this paper, we propose a robust object tracking algorithm based on a
branch selection mechanism to choose the most efficient object representations
from multi-branch siamese networks. While most deep learning trackers use a
single CNN for target representation, the proposed Multi-Branch Siamese Tracker
(MBST) employs multiple CNN branches pre-trained for different tasks, each
providing a different target representation in our tracking method. With our branch
selection mechanism, the appropriate CNN branch is selected depending on the
target characteristics in an online manner. By using the most adequate target
representation with respect to the tracked object, our method achieves
real-time tracking, while obtaining improved performance compared to standard
Siamese network trackers on object tracking benchmarks.
Comment: ISVC2018, oral presentation
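Online branch selection can be sketched as picking, for each frame, the branch whose response map has the strongest peak. The abstract does not state the actual selection criterion, so the peak-score rule, the branch names and the response values below are all assumptions for illustration.

```python
def select_branch(response_maps):
    """Return (branch_name, (row, col)) for the highest score over all maps.

    response_maps: dict mapping a branch name to a 2D list of match scores.
    """
    best = None
    for name, rmap in response_maps.items():
        for r, row in enumerate(rmap):
            for c, score in enumerate(row):
                if best is None or score > best[0]:
                    best = (score, name, (r, c))
    return best[1], best[2]

# Hypothetical responses of two branches on the current frame.
responses = {
    "appearance_branch": [[0.1, 0.3], [0.2, 0.6]],
    "semantic_branch": [[0.4, 0.9], [0.1, 0.2]],
}
branch, location = select_branch(responses)
```

Comparing raw peaks assumes the branches produce scores on a comparable scale; if they do not, some per-branch normalisation would be needed first.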