Robust and Real-time Deep Tracking Via Multi-Scale Domain Adaptation
Visual tracking is a fundamental problem in computer vision. Recently, some
deep-learning-based tracking algorithms have been achieving record-breaking
performances. However, due to the high complexity of deep learning, most deep
trackers suffer from low tracking speed, and thus are impractical in many
real-world applications. Some newer deep trackers with smaller network structures achieve high efficiency, but at the cost of a significant decrease in precision. In this paper, we propose to transfer features learned for image classification to the visual tracking domain via convolutional channel reduction. The channel reduction can simply be viewed as an additional convolutional layer dedicated to the tracking task. It not only extracts information useful for object tracking but also significantly increases the tracking speed. To better capture the useful features of the target at different scales, the adaptation filters are designed with different sizes. The resulting tracker runs in real time and achieves state-of-the-art accuracy in experiments on two widely adopted benchmarks comprising more than 100 test videos.
Comment: 6 pages
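The multi-scale channel reduction described above can be pictured as a small bank of convolutions appended to a pretrained classification backbone. Below is a minimal PyTorch sketch of that idea; the layer sizes, kernel sizes, and class name are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class MultiScaleChannelReduction(nn.Module):
    """Reduce a high-dimensional classification feature map with several
    convolutional filters of different sizes (hypothetical sketch)."""

    def __init__(self, in_channels=512, out_channels=64, kernel_sizes=(1, 3, 5)):
        super().__init__()
        # One reduction filter per scale; padding keeps the spatial size fixed.
        self.reductions = nn.ModuleList(
            nn.Conv2d(in_channels, out_channels, k, padding=k // 2)
            for k in kernel_sizes
        )

    def forward(self, features):
        # Concatenate the per-scale reduced maps along the channel dimension.
        return torch.cat([r(features) for r in self.reductions], dim=1)

# Example: a 512-channel backbone feature map reduced to 3 * 64 channels.
features = torch.randn(1, 512, 28, 28)
reduced = MultiScaleChannelReduction()(features)
print(reduced.shape)  # torch.Size([1, 192, 28, 28])
```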
Learning feed-forward one-shot learners
One-shot learning is usually tackled by using generative models or
discriminative embeddings. Discriminative methods based on deep learning, which
are very effective in other learning scenarios, are ill-suited for one-shot
learning as they need large amounts of training data. In this paper, we propose
a method to learn the parameters of a deep model in one shot. We construct the
learner as a second deep network, called a learnet, which predicts the
parameters of a pupil network from a single exemplar. In this manner we obtain
an efficient feed-forward one-shot learner, trained end-to-end by minimizing a
one-shot classification objective in a learning to learn formulation. In order
to make the construction feasible, we propose a number of factorizations of the
parameters of the pupil network. We demonstrate encouraging results by learning
characters from single exemplars in Omniglot, and by tracking visual objects
from a single initial exemplar in the Visual Object Tracking benchmark.
Comment: The first three authors contributed equally and are listed in alphabetical order
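To make the learnet idea concrete, the following PyTorch sketch predicts the filter of a tiny pupil network from a single exemplar, using a fixed filter basis mixed by predicted coefficients as a stand-in for the parameter factorizations mentioned above. All names and layer sizes here are hypothetical; this is not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Learnet(nn.Module):
    """Map one exemplar to the filter of a small 'pupil' network.
    The pupil filter is factorized as a fixed basis mixed by predicted
    coefficients (an illustrative stand-in for the paper's factorizations)."""

    def __init__(self, channels=3, num_basis=8, kernel_size=5):
        super().__init__()
        # Fixed bank of basis filters; only the mixing weights are predicted.
        self.basis = nn.Parameter(
            torch.randn(num_basis, channels, kernel_size, kernel_size))
        self.encoder = nn.Sequential(
            nn.Conv2d(channels, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, num_basis),
        )

    def forward(self, exemplar, search_image):
        # Predict mixing coefficients from the single exemplar ...
        coeffs = self.encoder(exemplar)                        # (B, num_basis)
        # ... and assemble one pupil filter per batch element.
        filters = torch.einsum('bk,kcij->bcij', coeffs, self.basis)
        # The pupil network correlates its predicted filter with the search image.
        responses = [F.conv2d(x.unsqueeze(0), f.unsqueeze(0))
                     for x, f in zip(search_image, filters)]
        return torch.cat(responses, dim=0)

learner = Learnet()
exemplar = torch.randn(2, 3, 32, 32)      # one exemplar per tracked object
search = torch.randn(2, 3, 64, 64)        # images in which to locate them
print(learner(exemplar, search).shape)    # torch.Size([2, 1, 60, 60])
```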
Deep Tracking: Seeing Beyond Seeing Using Recurrent Neural Networks
This paper presents, to the best of our knowledge, the first end-to-end object
tracking approach which directly maps from raw sensor input to object tracks in
sensor space without requiring any feature engineering or system identification
in the form of plant or sensor models. Specifically, our system accepts a
stream of raw sensor data at one end and, in real-time, produces an estimate of
the entire environment state at the output including even occluded objects. We
achieve this by framing the problem as a deep learning task and exploit
sequence models in the form of recurrent neural networks to learn a mapping
from sensor measurements to object tracks. In particular, we propose a learning
method based on a form of input dropout, which allows learning in an unsupervised manner based only on raw, occluded sensor data, without access to ground-truth annotations. We demonstrate our approach using a synthetic dataset
designed to mimic the task of tracking objects in 2D laser data -- as commonly
encountered in robotics applications -- and show that it learns to track many
dynamic objects despite occlusions and the presence of sensor noise.
Comment: Published in The Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16). Video: https://youtu.be/cdeWCpfUGWc, Code: http://mrg.robots.ox.ac.uk/mrg_people/peter-ondruska
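A compact way to picture the training setup is a recurrent network that reads a stream of occluded occupancy grids and predicts occupancy at the next step, with whole input frames randomly blanked out and the loss restricted to cells the sensor actually observed. The PyTorch sketch below follows that reading; the grid size, architecture, and masking scheme are assumptions for illustration, not the published model.

```python
import torch
import torch.nn as nn

class DeepTrackerSketch(nn.Module):
    """Recurrent model that reads occluded occupancy grids and predicts the
    occupancy of the next frame (assumed architecture, not the paper's)."""

    def __init__(self, grid_cells=32 * 32, hidden=256):
        super().__init__()
        self.rnn = nn.GRU(grid_cells, hidden, batch_first=True)
        self.decode = nn.Linear(hidden, grid_cells)

    def forward(self, frames, drop_prob=0.5):
        # frames: (batch, time, cells) occupancy values in [0, 1]
        if self.training:
            # Input dropout: blank out whole frames so the recurrent state
            # must carry objects through the gaps.
            keep = torch.rand(frames.shape[:2], device=frames.device) > drop_prob
            frames = frames * keep.unsqueeze(-1).float()
        hidden_states, _ = self.rnn(frames)
        return torch.sigmoid(self.decode(hidden_states))  # per-step prediction

model = DeepTrackerSketch()
occupancy = (torch.rand(4, 10, 32 * 32) > 0.8).float()   # toy occupancy stream
observed = (torch.rand_like(occupancy) > 0.3).float()    # cells the sensor saw
pred = model(occupancy[:, :-1] * observed[:, :-1])       # feed only observed input
loss = nn.functional.binary_cross_entropy(               # penalize observed cells only
    pred, occupancy[:, 1:], weight=observed[:, 1:])
loss.backward()
```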