8,650 research outputs found
Pedestrian Attribute Recognition: A Survey
Recognizing pedestrian attributes is an important task in computer vision
community due to it plays an important role in video surveillance. Many
algorithms has been proposed to handle this task. The goal of this paper is to
review existing works using traditional methods or based on deep learning
networks. Firstly, we introduce the background of pedestrian attributes
recognition (PAR, for short), including the fundamental concepts of pedestrian
attributes and corresponding challenges. Secondly, we introduce existing
benchmarks, including popular datasets and evaluation criterion. Thirdly, we
analyse the concept of multi-task learning and multi-label learning, and also
explain the relations between these two learning algorithms and pedestrian
attribute recognition. We also review some popular network architectures which
have widely applied in the deep learning community. Fourthly, we analyse
popular solutions for this task, such as attributes group, part-based,
\emph{etc}. Fifthly, we shown some applications which takes pedestrian
attributes into consideration and achieve better performance. Finally, we
summarized this paper and give several possible research directions for
pedestrian attributes recognition. The project page of this paper can be found
from the following website:
\url{https://sites.google.com/view/ahu-pedestrianattributes/}.Comment: Check our project page for High Resolution version of this survey:
https://sites.google.com/view/ahu-pedestrianattributes
Multi-shot Pedestrian Re-identification via Sequential Decision Making
Multi-shot pedestrian re-identification problem is at the core of
surveillance video analysis. It matches two tracks of pedestrians from
different cameras. In contrary to existing works that aggregate single frames
features by time series model such as recurrent neural network, in this paper,
we propose an interpretable reinforcement learning based approach to this
problem. Particularly, we train an agent to verify a pair of images at each
time. The agent could choose to output the result (same or different) or
request another pair of images to verify (unsure). By this way, our model
implicitly learns the difficulty of image pairs, and postpone the decision when
the model does not accumulate enough evidence. Moreover, by adjusting the
reward for unsure action, we can easily trade off between speed and accuracy.
In three open benchmarks, our method are competitive with the state-of-the-art
methods while only using 3% to 6% images. These promising results demonstrate
that our method is favorable in both efficiency and performance
- …