194 research outputs found

    Turbo Learning Framework for Human-Object Interactions Recognition and Human Pose Estimation

    Full text link
    Human-object interactions (HOI) recognition and pose estimation are two closely related tasks. Human pose is an essential cue for recognizing actions and localizing the interacted objects. Meanwhile, human action and their interacted objects' localizations provide guidance for pose estimation. In this paper, we propose a turbo learning framework to perform HOI recognition and pose estimation simultaneously. First, two modules are designed to enforce message passing between the tasks, i.e. pose aware HOI recognition module and HOI guided pose estimation module. Then, these two modules form a closed loop to utilize the complementary information iteratively, which can be trained in an end-to-end manner. The proposed method achieves the state-of-the-art performance on two public benchmarks including Verbs in COCO (V-COCO) and HICO-DET datasets.Comment: AAAI201

    PoNA: Pose-guided non-local attention for human pose transfer

    Get PDF
    Human pose transfer, which aims at transferring the appearance of a given person to a target pose, is very challenging and important in many applications. Previous work ignores the guidance of pose features or only uses local attention mechanism, leading to implausible and blurry results. We propose a new human pose transfer method using a generative adversarial network (GAN) with simplified cascaded blocks. In each block, we propose a pose-guided non-local attention (PoNA) mechanism with a long-range dependency scheme to select more important regions of image features to transfer. We also design pre-posed image-guided pose feature update and post-posed pose-guided image feature update to better utilize the pose and image features. Our network is simple, stable, and easy to train. Quantitative and qualitative results on Market-1501 and DeepFashion datasets show the efficacy and efficiency of our model. Compared with state-of-the-art methods, our model generates sharper and more realistic images with rich details, while having fewer parameters and faster speed. Furthermore, our generated images can help to alleviate data insufficiency for person re-identification
    • …
    corecore