45,452 research outputs found

    Pose Embeddings: A Deep Architecture for Learning to Match Human Poses

    Full text link
    We present a method for learning an embedding that places images of humans in similar poses nearby. This embedding can be used as a direct method of comparing images based on human pose, avoiding potential challenges of estimating body joint positions. Pose embedding learning is formulated under a triplet-based distance criterion. A deep architecture is used to allow learning of a representation capable of making distinctions between different poses. Experiments on human pose matching and retrieval from video data demonstrate the potential of the method

    A Survey on Joint Object Detection and Pose Estimation using Monocular Vision

    Get PDF
    In this survey we present a complete landscape of joint object detection and pose estimation methods that use monocular vision. Descriptions of traditional approaches that involve descriptors or models and various estimation methods have been provided. These descriptors or models include chordiograms, shape-aware deformable parts model, bag of boundaries, distance transform templates, natural 3D markers and facet features whereas the estimation methods include iterative clustering estimation, probabilistic networks and iterative genetic matching. Hybrid approaches that use handcrafted feature extraction followed by estimation by deep learning methods have been outlined. We have investigated and compared, wherever possible, pure deep learning based approaches (single stage and multi stage) for this problem. Comprehensive details of the various accuracy measures and metrics have been illustrated. For the purpose of giving a clear overview, the characteristics of relevant datasets are discussed. The trends that prevailed from the infancy of this problem until now have also been highlighted.Comment: Accepted at the International Joint Conference on Computer Vision and Pattern Recognition (CCVPR) 201

    LRF-Net: Learning Local Reference Frames for 3D Local Shape Description and Matching

    Full text link
    The local reference frame (LRF) acts as a critical role in 3D local shape description and matching. However, most of existing LRFs are hand-crafted and suffer from limited repeatability and robustness. This paper presents the first attempt to learn an LRF via a Siamese network that needs weak supervision only. In particular, we argue that each neighboring point in the local surface gives a unique contribution to LRF construction and measure such contributions via learned weights. Extensive analysis and comparative experiments on three public datasets addressing different application scenarios have demonstrated that LRF-Net is more repeatable and robust than several state-of-the-art LRF methods (LRF-Net is only trained on one dataset). In addition, LRF-Net can significantly boost the local shape description and 6-DoF pose estimation performance when matching 3D point clouds.Comment: 28 pages, 14 figure
    • …
    corecore