Pedestrian Attribute Recognition: A Survey
Recognizing pedestrian attributes is an important task in the computer vision
community because it plays an important role in video surveillance. Many
algorithms have been proposed to handle this task. The goal of this paper is to
review existing works, both those using traditional methods and those based on
deep learning networks. First, we introduce the background of pedestrian
attribute recognition (PAR, for short), including the fundamental concepts of
pedestrian attributes and the corresponding challenges. Second, we introduce
existing benchmarks, including popular datasets and evaluation criteria. Third,
we analyse the concepts of multi-task learning and multi-label learning, and
explain the relations between these two learning paradigms and pedestrian
attribute recognition. We also review some popular network architectures that
have been widely applied in the deep learning community. Fourth, we analyse
popular solutions for this task, such as attribute-group-based methods,
part-based methods, \emph{etc}. Fifth, we show some applications that take
pedestrian attributes into consideration and achieve better performance.
Finally, we summarize this paper and give several possible research directions
for pedestrian attribute recognition. The project page of this paper can be
found at the following website:
\url{https://sites.google.com/view/ahu-pedestrianattributes/}.
Comment: Check our project page for a high-resolution version of this survey:
https://sites.google.com/view/ahu-pedestrianattributes
UniHCP: A Unified Model for Human-Centric Perceptions
Human-centric perceptions (e.g., pose estimation, human parsing, pedestrian
detection, person re-identification, etc.) play a key role in industrial
applications of visual models. While specific human-centric tasks have their
own relevant semantic aspect to focus on, they also share the same underlying
semantic structure of the human body. However, few works have attempted to
exploit such homogeneity and design a general-purpose model for human-centric
tasks. In this work, we revisit a broad range of human-centric tasks and unify
them in a minimalist manner. We propose UniHCP, a Unified Model for
Human-Centric Perceptions, which unifies a wide range of human-centric tasks in
a simplified end-to-end manner with the plain vision transformer architecture.
With large-scale joint training on 33 human-centric datasets, UniHCP can
outperform strong baselines on several in-domain and downstream tasks by direct
evaluation. When adapted to a specific task, UniHCP achieves new SOTAs on a
wide range of human-centric tasks, e.g., 69.8 mIoU on CIHP for human parsing,
86.18 mA on PA-100K for attribute prediction, 90.3 mAP on Market1501 for ReID,
and 85.8 JI on CrowdHuman for pedestrian detection, performing better than
specialized models tailored for each task.
Comment: Accepted for publication at the IEEE/CVF Conference on Computer
Vision and Pattern Recognition 2023 (CVPR 2023)
HumanBench: Towards General Human-centric Perception with Projector Assisted Pretraining
Human-centric perceptions include a variety of vision tasks, which have
widespread industrial applications, including surveillance, autonomous driving,
and the metaverse. It is desirable to have a general pretrained model for
versatile human-centric downstream tasks. This paper forges ahead along this
path from the aspects of both benchmark and pretraining methods. Specifically,
we propose \textbf{HumanBench}, a benchmark built on existing datasets, to
comprehensively evaluate, on common ground, the generalization abilities of different
pretraining methods on 19 datasets from 6 diverse downstream tasks, including
person ReID, pose estimation, human parsing, pedestrian attribute recognition,
pedestrian detection, and crowd counting. To learn both coarse-grained and
fine-grained knowledge in human bodies, we further propose a \textbf{P}rojector
\textbf{A}ssis\textbf{T}ed \textbf{H}ierarchical pretraining method
(\textbf{PATH}) to learn diverse knowledge at different granularity levels.
Comprehensive evaluations on HumanBench show that our PATH achieves new
state-of-the-art results on 17 downstream datasets and on-par results on the
other 2 datasets. The code will be publicly available at
\href{https://github.com/OpenGVLab/HumanBench}{https://github.com/OpenGVLab/HumanBench}.
Comment: Accepted to CVPR202