Pedestrian Attribute Recognition: A Survey
Recognizing pedestrian attributes is an important task in the computer vision
community because it plays an important role in video surveillance. Many
algorithms have been proposed to handle this task. The goal of this paper is to
review existing works, both those using traditional methods and those based on
deep learning networks. First, we introduce the background of pedestrian
attribute recognition (PAR, for short), including the fundamental concepts of
pedestrian attributes and the corresponding challenges. Second, we introduce
existing benchmarks, including popular datasets and evaluation criteria. Third,
we analyse the concepts of multi-task learning and multi-label learning, and
explain the relations between these two learning algorithms and pedestrian
attribute recognition. We also review some popular network architectures that
have been widely applied in the deep learning community. Fourth, we analyse
popular solutions for this task, such as attribute-group and part-based
approaches, etc. Fifth, we show some applications that take pedestrian
attributes into consideration and achieve better performance. Finally, we
summarize this paper and give several possible research directions for
pedestrian attribute recognition. The project page of this paper can be found
at the following website:
https://sites.google.com/view/ahu-pedestrianattributes/.
Comment: Check our project page for a high-resolution version of this survey:
https://sites.google.com/view/ahu-pedestrianattributes
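The multi-label view of attribute recognition mentioned in the abstract above can be illustrated with a minimal sketch: each attribute gets its own sigmoid output, and the loss is the mean binary cross-entropy over attributes. This is only an illustration of the general idea, not a method taken from the survey; the attribute names, the logits, and the helper names `sigmoid`, `multilabel_bce`, and `predict_attributes` are all hypothetical.

```python
import math

# Hypothetical attribute names, chosen only for illustration.
ATTRIBUTES = ["male", "backpack", "long_hair", "hat"]


def sigmoid(x):
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def multilabel_bce(logits, targets):
    """Mean binary cross-entropy over independent attribute outputs."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, t in zip(logits, targets):
        p = sigmoid(z)
        total += -(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps))
    return total / len(logits)


def predict_attributes(logits, threshold=0.5):
    """Return the names of all attributes whose probability passes the threshold."""
    return [name for name, z in zip(ATTRIBUTES, logits) if sigmoid(z) >= threshold]


# Made-up network outputs and ground-truth labels for one pedestrian image.
logits = [2.0, -1.5, 0.3, -3.0]
targets = [1, 0, 1, 0]

loss = multilabel_bce(logits, targets)
predicted = predict_attributes(logits)
print(predicted, round(loss, 4))
```

Unlike a softmax classifier, this formulation lets several attributes be active for the same image, which is what distinguishes multi-label from single-label classification.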
Fully Convolutional Neural Networks for Dynamic Object Detection in Grid Maps
Grid maps are widely used in robotics to represent obstacles in the
environment, and differentiating dynamic objects from static infrastructure is
essential for many practical applications. In this work, we present a method
that uses a deep convolutional neural network (CNN) to infer whether grid cells
are covering a moving object or not. Compared to tracking approaches that use,
e.g., a particle filter to estimate grid cell velocities and then make a
decision for individual grid cells based on this estimate, our approach uses
the entire grid map as the input image for a CNN that inspects a larger area around
each cell and thus takes the structural appearance in the grid map into account
to make a decision. Compared to our reference method, our concept yields a
performance increase from 83.9% to 97.2%. A runtime-optimized version of our
approach yields similar improvements with an execution time of just 10
milliseconds.
Comment: This is a shorter version of the master's thesis of Florian Piewak, and
it was accepted at IV 201
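The idea of deciding each grid cell from a larger area around it, while keeping the output the same size as the input map, can be sketched with a single hand-set filter. This is a toy stand-in under stated assumptions, not the network from the paper: the grid values, the kernel, the threshold, and the function name `classify_cells` are all hypothetical, and a real CNN would learn many stacked filters.

```python
def classify_cells(grid, kernel, bias=0.0, threshold=0.0):
    """Return a same-sized boolean map: True where the weighted
    neighbourhood response of a cell exceeds the threshold."""
    rows, cols = len(grid), len(grid[0])
    k = len(kernel) // 2  # kernel is assumed square with an odd side length
    out = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = bias
            for dr in range(-k, k + 1):
                for dc in range(-k, k + 1):
                    rr, cc = r + dr, c + dc
                    # Cells outside the map contribute nothing (zero padding).
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += kernel[dr + k][dc + k] * grid[rr][cc]
            out[r][c] = acc > threshold
    return out


# Hypothetical 4x4 grid map with a 2x2 block of occupied cells.
grid = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
kernel = [[1.0] * 3 for _ in range(3)]  # hand-set 3x3 filter, not a learned one

mask = classify_cells(grid, kernel, threshold=3.5)
for row in mask:
    print(row)
```

Because every output cell is computed from its neighbourhood with the same weights, the per-cell decision reflects local structure in the map instead of the cell's own value alone.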
Beyond Frontal Faces: Improving Person Recognition Using Multiple Cues
We explore the task of recognizing people's identities in photo albums in an
unconstrained setting. To facilitate this, we introduce the new People In Photo
Albums (PIPA) dataset, consisting of over 60000 instances of 2000 individuals
collected from public Flickr photo albums. With only about half of the person
images containing a frontal face, the recognition task is very challenging due
to the large variations in pose, clothing, camera viewpoint, image resolution
and illumination. We propose the Pose Invariant PErson Recognition (PIPER)
method, which accumulates the cues of poselet-level person recognizers trained
by deep convolutional networks to discount for the pose variations, combined
with a face recognizer and a global recognizer. Experiments on three different
settings confirm that in our unconstrained setup PIPER significantly improves
on the performance of DeepFace, which is one of the best face recognizers as
measured on the LFW dataset.
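One simple way to combine cues from several recognizers is a weighted sum of per-identity scores. The sketch below illustrates that general idea only: the abstract does not specify how PIPER combines its recognizers, and the cue names, scores, weights, and the function name `fuse_scores` are all hypothetical.

```python
def fuse_scores(cue_scores, weights):
    """Combine per-identity scores from several cues by a weighted sum."""
    fused = {}
    for cue, scores in cue_scores.items():
        w = weights.get(cue, 0.0)  # cues without a weight are ignored
        for identity, s in scores.items():
            fused[identity] = fused.get(identity, 0.0) + w * s
    return fused


# Made-up scores from three hypothetical recognizers for two identities.
cue_scores = {
    "face": {"alice": 0.2, "bob": 0.6},
    "global": {"alice": 0.6, "bob": 0.3},
    "poselet_torso": {"alice": 0.8, "bob": 0.1},
}
# Assumed weights; in practice these would be tuned on held-out data.
weights = {"face": 0.5, "global": 0.25, "poselet_torso": 0.25}

fused = fuse_scores(cue_scores, weights)
best = max(fused, key=fused.get)
print(best, fused)
```

In this made-up example the face cue alone would favour "bob", but the body cues outweigh it, which mirrors the motivation for using more than frontal faces.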
- …