Turbo Learning Framework for Human-Object Interactions Recognition and Human Pose Estimation
Human-object interactions (HOI) recognition and pose estimation are two
closely related tasks. Human pose is an essential cue for recognizing actions
and localizing the interacted objects. Meanwhile, human actions and the
locations of the interacted objects provide guidance for pose estimation. In this
paper, we propose a turbo learning framework to perform HOI recognition and
pose estimation simultaneously. First, two modules are designed to enforce
message passing between the tasks, i.e., a pose-aware HOI recognition module and
an HOI-guided pose estimation module. Then, these two modules form a closed loop
to utilize the complementary information iteratively, which can be trained in
an end-to-end manner. The proposed method achieves state-of-the-art
performance on two public benchmarks, the Verbs in COCO (V-COCO) and
HICO-DET datasets.
Comment: AAAI201
PoNA: Pose-guided non-local attention for human pose transfer
Human pose transfer, which aims to transfer the appearance of a given person to a target pose, is challenging and important in many applications. Previous work either ignores the guidance of pose features or uses only a local attention mechanism, leading to implausible and blurry results. We propose a new human pose transfer method using a generative adversarial network (GAN) with simplified cascaded blocks. In each block, we propose a pose-guided non-local attention (PoNA) mechanism with a long-range dependency scheme to select the more important regions of the image features to transfer. We also design a pre-posed image-guided pose feature update and a post-posed pose-guided image feature update to better utilize the pose and image features. Our network is simple, stable, and easy to train. Quantitative and qualitative results on the Market-1501 and DeepFashion datasets show the efficacy and efficiency of our model. Compared with state-of-the-art methods, our model generates sharper and more realistic images with rich details, while having fewer parameters and running faster. Furthermore, our generated images can help to alleviate data insufficiency for person re-identification.
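The abstract does not spell out how the attention is computed, so the following is only a minimal sketch of the general idea of pose-guided non-local attention, not the authors' PoNA block. It assumes that queries are projected from pose features while keys and values are projected from image features, so every spatial position can draw on every other position (the long-range dependency). The function name, the projection matrices `w_q`, `w_k`, `w_v`, the scaling, and the residual connection are all illustrative assumptions.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax_rows(m):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    result = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


def pose_guided_nonlocal_attention(image_feat, pose_feat, w_q, w_k, w_v):
    """Hypothetical sketch: pose features decide which image positions to attend to.

    image_feat, pose_feat: N x C matrices (N flattened spatial positions).
    w_q, w_k: C x D projections; w_v: C x C projection (all assumed, not from the paper).
    """
    q = matmul(pose_feat, w_q)   # queries come from the pose features
    k = matmul(image_feat, w_k)  # keys come from the image features
    v = matmul(image_feat, w_v)  # values come from the image features
    scale = math.sqrt(len(q[0]))
    scores = [[s / scale for s in row] for row in matmul(q, transpose(k))]
    attn = softmax_rows(scores)  # N x N weights over all positions (non-local)
    mixed = matmul(attn, v)
    # Residual connection (an assumption): keep the original image features.
    out = [[x + m for x, m in zip(x_row, m_row)]
           for x_row, m_row in zip(image_feat, mixed)]
    return out, attn


def random_matrix(rows, cols, rng, scale=1.0):
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_positions, channels, proj_dim = 6, 4, 3
image_feat = random_matrix(n_positions, channels, rng)
pose_feat = random_matrix(n_positions, channels, rng)
w_q = random_matrix(channels, proj_dim, rng, 0.1)
w_k = random_matrix(channels, proj_dim, rng, 0.1)
w_v = random_matrix(channels, channels, rng, 0.1)

out, attn = pose_guided_nonlocal_attention(image_feat, pose_feat, w_q, w_k, w_v)
```

Each row of `attn` is a distribution over all positions, which is what distinguishes non-local attention from a local mechanism restricted to a small neighbourhood.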