Pose-guided feature alignment for occluded person re-identification
© 2019 IEEE. Persons are often occluded by various obstacles in person retrieval scenarios. Previous person re-identification (re-id) methods either overlook this issue or resolve it based on an extreme assumption. To alleviate the occlusion problem, we propose to detect occluded regions and explicitly exclude them during feature generation and matching. In this paper, we introduce a novel method named Pose-Guided Feature Alignment (PGFA), which exploits pose landmarks to disentangle useful information from occlusion noise. During the feature construction stage, our method utilizes human landmarks to generate attention maps. The generated attention maps indicate whether a specific body part is occluded and guide our model to attend to non-occluded regions. During matching, we explicitly partition the global feature into parts and use the pose landmarks to indicate which partial features belong to the target person. Only the visible regions are used for retrieval. In addition, we construct a large-scale dataset for the occluded person re-id problem, namely Occluded-DukeMTMC, which is by far the largest dataset for occluded person re-id. Extensive experiments are conducted on our constructed occluded re-id dataset, two partial re-id datasets, and two commonly used holistic re-id datasets. Our method largely outperforms existing person re-id methods on the three occlusion datasets, while remaining among the top performers on the two holistic datasets.
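The matching step described above (partition the feature into parts, keep only parts visible in both images) can be illustrated with a small sketch. This is not the paper's implementation; the function name, the 0/1 visibility vectors, and the global+partial distance combination are illustrative assumptions.

```python
import numpy as np

def pgfa_distance(parts_q, vis_q, parts_g, vis_g, global_q, global_g):
    """Hypothetical sketch of visibility-aware part matching.

    parts_q, parts_g: (P, D) per-part features, assumed L2-normalised.
    vis_q, vis_g:     (P,) 0/1 visibility vectors derived from pose landmarks.
    Only parts visible in BOTH images contribute to the partial distance.
    """
    shared = vis_q * vis_g                         # parts visible in both images
    part_d = 1.0 - np.sum(parts_q * parts_g, axis=1)  # cosine distance per part
    if shared.sum() > 0:
        partial = np.sum(part_d * shared) / shared.sum()
    else:
        partial = 0.0                              # no shared visible parts
    global_d = 1.0 - float(global_q @ global_g)    # global cosine distance
    return global_d + partial
```

Because occluded parts are masked out, changing the feature of a part that is invisible in either image leaves the distance unchanged.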
PoseTrackReID: Dataset Description
Current datasets for video-based person re-identification (re-ID) do not include structural knowledge in the form of human pose annotations for the persons of interest. Nonetheless, pose information is very helpful for disentangling useful feature information from background or occlusion noise. Real-world scenarios in particular, such as surveillance, contain many occlusions caused by human crowds or obstacles. On the other hand, video-based person re-ID can benefit other tasks, such as multi-person pose tracking, in terms of robust feature matching. For that reason, we present PoseTrackReID, a large-scale dataset for multi-person pose tracking and video-based person re-ID. With PoseTrackReID, we want to bridge the gap between person re-ID and multi-person pose tracking. Additionally, this dataset provides a good benchmark for current state-of-the-art methods on multi-frame person re-ID.
An algorithm for person re-identification in video surveillance imagery using a neural-network composite descriptor
To improve the accuracy of person re-identification in distributed video surveillance systems, it is important to use an algorithm that remains effective when a person is occluded by other people or objects. For this task, an algorithm is developed that forms a composite descriptor comprising a global feature vector of the person image and three local feature vectors for its upper, middle, and lower parts. Regions of interest are extracted based on detected keypoints of the human body. If part of the person image is overlapped by other people or objects, it is treated as invisible, and the image of the hidden part is not used to form the corresponding local feature. Instead, that feature is obtained by averaging the corresponding features of the person image's k-nearest neighbors. Experiments demonstrate improved re-identification accuracy on the Market-1501, DukeMTMC-ReID, MSMT17, and PolReID1077 datasets.
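The k-nearest-neighbor completion step for occluded parts can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, array shapes, and the use of Euclidean distance between global descriptors to pick neighbors are all assumptions.

```python
import numpy as np

def complete_local_features(local_feats, visibility, global_feats, k=2):
    """Hypothetical sketch: replace each occluded local (part) feature
    by the mean of that part's features over the k nearest neighbours
    in global-descriptor space.

    local_feats:  (N, P, D) per-part features for N person images.
    visibility:   (N, P) 0/1 flags (0 = part occluded).
    global_feats: (N, D) global descriptors used for neighbour search.
    """
    n = len(global_feats)
    # pairwise Euclidean distances between global descriptors
    dists = np.linalg.norm(global_feats[:, None] - global_feats[None, :], axis=-1)
    np.fill_diagonal(dists, np.inf)            # exclude self from neighbours
    completed = local_feats.copy()
    for i in range(n):
        nn = np.argsort(dists[i])[:k]          # k nearest neighbours of image i
        for p in range(local_feats.shape[1]):
            if visibility[i, p] == 0:          # occluded part: borrow from neighbours
                completed[i, p] = local_feats[nn, p].mean(axis=0)
    return completed
```

Visible parts are left untouched; only hidden parts are filled in from neighbors.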
Part Representation Learning with Teacher-Student Decoder for Occluded Person Re-identification
Occluded person re-identification (ReID) is a very challenging task due to
the occlusion disturbance and incomplete target information. Leveraging
external cues such as human pose or parsing to locate and align part features
has been proven to be very effective in occluded person ReID. Meanwhile, recent
Transformer structures have a strong ability of long-range modeling.
Considering the above facts, we propose a Teacher-Student Decoder (TSD)
framework for occluded person ReID, which utilizes the Transformer decoder with
the help of human parsing. More specifically, our proposed TSD consists of a
Parsing-aware Teacher Decoder (PTD) and a Standard Student Decoder (SSD). PTD
employs human parsing cues to restrict Transformer's attention and imparts this
information to SSD through feature distillation. Thereby, SSD can learn from
PTD to aggregate information of body parts automatically. Moreover, a mask
generator is designed to provide discriminative regions for better ReID. In
addition, existing occluded person ReID benchmarks utilize occluded samples as
queries, which will amplify the role of alleviating occlusion interference and
underestimate the impact of the feature-absence issue. In contrast, we
propose a new benchmark with non-occluded queries, serving as a complement to
the existing benchmark. Extensive experiments demonstrate that our proposed
method is superior and the new benchmark is essential. The source codes are
available at https://github.com/hh23333/TSD.
Comment: Accepted by ICASSP202
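The two core ingredients described above, parsing-restricted attention in the teacher and feature distillation into the student, can be sketched generically. This is not the TSD implementation; the single-head attention, the 0/1 parsing mask over key tokens, and the plain MSE distillation loss are simplified assumptions.

```python
import numpy as np

def masked_attention(q, k, v, mask=None):
    """Single-head scaled dot-product attention. If a parsing mask is
    given (1 = body-part token, 0 = background/occluder token), keys
    outside body parts are excluded, mimicking how a parsing-aware
    teacher restricts where the decoder may attend."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    if mask is not None:
        scores = np.where(mask[None, :] > 0, scores, -1e9)  # block masked keys
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))  # stable softmax
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

def distill_loss(teacher_out, student_out):
    """Feature distillation: push the student decoder's features toward
    the parsing-aware teacher's features (plain MSE here)."""
    return float(np.mean((teacher_out - student_out) ** 2))
```

With the mask applied, attention weight collapses onto the body-part tokens, so the teacher's output ignores occluders; the distillation loss then transfers this behavior to a student that runs without parsing at inference.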