3,521 research outputs found
LSTA: Long Short-Term Attention for Egocentric Action Recognition
Egocentric activity recognition is one of the most challenging tasks in video
analysis. It requires a fine-grained discrimination of small objects and their
manipulation. While some methods base on strong supervision and attention
mechanisms, they are either annotation consuming or do not take spatio-temporal
patterns into account. In this paper we propose LSTA as a mechanism to focus on
features from spatial relevant parts while attention is being tracked smoothly
across the video sequence. We demonstrate the effectiveness of LSTA on
egocentric activity recognition with an end-to-end trainable two-stream
architecture, achieving state of the art performance on four standard
benchmarks.Comment: Accepted to CVPR 201
On human motion prediction using recurrent neural networks
Human motion modelling is a classical problem at the intersection of graphics
and computer vision, with applications spanning human-computer interaction,
motion synthesis, and motion prediction for virtual and augmented reality.
Following the success of deep learning methods in several computer vision
tasks, recent work has focused on using deep recurrent neural networks (RNNs)
to model human motion, with the goal of learning time-dependent representations
that perform tasks such as short-term motion prediction and long-term human
motion synthesis. We examine recent work, with a focus on the evaluation
methodologies commonly used in the literature, and show that, surprisingly,
state-of-the-art performance can be achieved by a simple baseline that does not
attempt to model motion at all. We investigate this result, and analyze recent
RNN methods by looking at the architectures, loss functions, and training
procedures used in state-of-the-art approaches. We propose three changes to the
standard RNN models typically used for human motion, which result in a simple
and scalable RNN architecture that obtains state-of-the-art performance on
human motion prediction.Comment: Accepted at CVPR 1
Practical classification of different moving targets using automotive radar and deep neural networks
In this work, the authors present results for classification of different classes of targets (car, single and multiple people, bicycle) using automotive radar data and different neural networks. A fast implementation of radar algorithms for detection, tracking, and micro-Doppler extraction is proposed in conjunction with the automotive radar transceiver TEF810X and microcontroller unit SR32R274 manufactured by NXP Semiconductors. Three different types of neural networks are considered, namely a classic convolutional network, a residual network, and a combination of convolutional and recurrent network, for different classification problems across the four classes of targets recorded. Considerable accuracy (close to 100% in some cases) and low latency of the radar pre-processing prior to classification (∼0.55 s to produce a 0.5 s long spectrogram) are demonstrated in this study, and possible shortcomings and outstanding issues are discussed
DeepSignals: Predicting Intent of Drivers Through Visual Signals
Detecting the intention of drivers is an essential task in self-driving,
necessary to anticipate sudden events like lane changes and stops. Turn signals
and emergency flashers communicate such intentions, providing seconds of
potentially critical reaction time. In this paper, we propose to detect these
signals in video sequences by using a deep neural network that reasons about
both spatial and temporal information. Our experiments on more than a million
frames show high per-frame accuracy in very challenging scenarios.Comment: To be presented at the IEEE International Conference on Robotics and
Automation (ICRA), 201
Pedestrian Attribute Recognition: A Survey
Recognizing pedestrian attributes is an important task in computer vision
community due to it plays an important role in video surveillance. Many
algorithms has been proposed to handle this task. The goal of this paper is to
review existing works using traditional methods or based on deep learning
networks. Firstly, we introduce the background of pedestrian attributes
recognition (PAR, for short), including the fundamental concepts of pedestrian
attributes and corresponding challenges. Secondly, we introduce existing
benchmarks, including popular datasets and evaluation criterion. Thirdly, we
analyse the concept of multi-task learning and multi-label learning, and also
explain the relations between these two learning algorithms and pedestrian
attribute recognition. We also review some popular network architectures which
have widely applied in the deep learning community. Fourthly, we analyse
popular solutions for this task, such as attributes group, part-based,
\emph{etc}. Fifthly, we shown some applications which takes pedestrian
attributes into consideration and achieve better performance. Finally, we
summarized this paper and give several possible research directions for
pedestrian attributes recognition. The project page of this paper can be found
from the following website:
\url{https://sites.google.com/view/ahu-pedestrianattributes/}.Comment: Check our project page for High Resolution version of this survey:
https://sites.google.com/view/ahu-pedestrianattributes
- …