Pedestrian Attribute Recognition: A Survey
Recognizing pedestrian attributes is an important task in the computer vision
community because it plays a key role in video surveillance. Many algorithms
have been proposed to handle this task. The goal of this paper is to review
existing works, covering both traditional methods and deep-learning-based
approaches. Firstly, we introduce the background of pedestrian attribute
recognition (PAR, for short), including the fundamental concepts of pedestrian
attributes and the corresponding challenges. Secondly, we introduce existing
benchmarks, including popular datasets and evaluation criteria. Thirdly, we
analyse the concepts of multi-task learning and multi-label learning, and
explain the relations between these two learning paradigms and pedestrian
attribute recognition. We also review some popular network architectures that
have been widely applied in the deep learning community. Fourthly, we analyse
popular solutions for this task, such as attribute grouping, part-based
methods, \emph{etc}. Fifthly, we show some applications that take pedestrian
attributes into consideration and achieve better performance. Finally, we
summarize this paper and give several possible research directions for
pedestrian attribute recognition. The project page of this paper can be found
at the following website:
\url{https://sites.google.com/view/ahu-pedestrianattributes/}.
Comment: Check our project page for a high-resolution version of this survey:
https://sites.google.com/view/ahu-pedestrianattributes
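The survey contrasts multi-task and multi-label formulations of PAR. In the multi-label view, each pedestrian image receives an independent binary decision per attribute, rather than a single exclusive class. A minimal NumPy sketch of that decision rule follows; the attribute names and threshold are illustrative assumptions, not taken from the survey:

```python
import numpy as np

# Hypothetical attribute vocabulary for illustration only.
ATTRIBUTES = ["male", "backpack", "long_hair"]

def predict_attributes(logits, threshold=0.5):
    """Multi-label prediction: an independent sigmoid per attribute,
    each thresholded separately (unlike a multi-class softmax, where
    exactly one class would be chosen)."""
    probs = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float)))
    return [name for name, p in zip(ATTRIBUTES, probs) if p >= threshold]

# Attributes whose sigmoid score clears the threshold are all predicted.
print(predict_attributes([2.0, -1.0, 0.3]))  # → ['male', 'long_hair']
```

Because the per-attribute decisions are independent, any subset of attributes (including the empty set) can be predicted for one image, which is what distinguishes multi-label PAR from ordinary classification.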
Semantic-Aware Scene Recognition
Scene recognition is currently one of the top-challenging research fields in
computer vision. This may be due to the ambiguity between classes: images of
several scene classes may share similar objects, which causes confusion among
them. The problem is aggravated when images of a particular scene class are
notably different. Convolutional Neural Networks (CNNs) have significantly
boosted performance in scene recognition, although it is still far below that
of other recognition tasks (e.g., object or image recognition). In this paper,
we
describe a novel approach for scene recognition based on an end-to-end
multi-modal CNN that combines image and context information by means of an
attention module. Context information, in the shape of semantic segmentation,
is used to gate features extracted from the RGB image by leveraging
information encoded in the semantic representation: the set of scene objects
and stuff, and their relative locations. This gating process reinforces the
learning of indicative scene content and enhances scene disambiguation by
refocusing the receptive fields of the CNN towards them. Experimental results
on four publicly available datasets show that the proposed approach outperforms
every other state-of-the-art method while significantly reducing the number of
network parameters. All the code and data used in this paper are available at
https://github.com/vpulab/Semantic-Aware-Scene-Recognition
Comment: Paper submitted for publication to Elsevier Pattern Recognition
journal
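The gating step described above multiplies RGB features by an attention map derived from the semantic branch, so that content indicative of the scene class is reinforced. A minimal NumPy sketch of element-wise sigmoid gating follows; the function name, tensor shapes, and the exact form of the attention map are assumptions for illustration, not the authors' exact module:

```python
import numpy as np

def semantic_gate(rgb_feats, sem_feats):
    """Gate RGB features with an attention map computed from
    semantic-segmentation features (element-wise sigmoid gating)."""
    attn = 1.0 / (1.0 + np.exp(-sem_feats))  # attention values in (0, 1)
    return rgb_feats * attn                  # attenuate non-indicative content

# Toy C x H x W feature maps standing in for the two CNN branches.
rng = np.random.default_rng(0)
rgb = rng.standard_normal((8, 4, 4))
sem = rng.standard_normal((8, 4, 4))
gated = semantic_gate(rgb, sem)
print(gated.shape)  # → (8, 4, 4)
```

Since the attention values lie strictly in (0, 1), gating can only suppress feature responses, never amplify them; locations the semantic branch scores highly pass through nearly unchanged, which matches the paper's description of refocusing the receptive fields towards indicative scene content.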