Pose Embeddings: A Deep Architecture for Learning to Match Human Poses
We present a method for learning an embedding that places images of humans in
similar poses nearby. This embedding can be used as a direct method of
comparing images based on human pose, avoiding potential challenges of
estimating body joint positions. Pose embedding learning is formulated under a
triplet-based distance criterion. A deep architecture is used to allow learning
of a representation capable of making distinctions between different poses.
Experiments on human pose matching and retrieval from video data demonstrate
the potential of the method.
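
A minimal sketch of such a triplet-based embedding criterion is given below; the network layers, embedding size, margin, and squared-Euclidean distance are assumptions made for illustration, not the paper's exact architecture.

# Illustrative triplet-loss setup for pose embeddings (assumed details, not the paper's exact model).
import torch
import torch.nn as nn
import torch.nn.functional as F

class PoseEmbeddingNet(nn.Module):
    """Toy convolutional encoder mapping an image to a unit-length pose embedding."""
    def __init__(self, embed_dim=128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, embed_dim)

    def forward(self, x):
        h = self.backbone(x).flatten(1)
        return F.normalize(self.fc(h), dim=1)

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Pull same-pose pairs together, push different-pose pairs apart by a margin."""
    d_pos = (anchor - positive).pow(2).sum(dim=1)
    d_neg = (anchor - negative).pow(2).sum(dim=1)
    return F.relu(d_pos - d_neg + margin).mean()

# Usage: anchor and positive show similar poses, negative shows a different pose.
net = PoseEmbeddingNet()
a, p, n = (torch.randn(8, 3, 128, 128) for _ in range(3))
loss = triplet_loss(net(a), net(p), net(n))
loss.backward()
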
A Survey on Joint Object Detection and Pose Estimation using Monocular Vision
In this survey we present a complete landscape of joint object detection and
pose estimation methods that use monocular vision. Descriptions of traditional
approaches that involve descriptors or models and various estimation methods
have been provided. These descriptors or models include chordiograms,
shape-aware deformable parts model, bag of boundaries, distance transform
templates, natural 3D markers and facet features whereas the estimation methods
include iterative clustering estimation, probabilistic networks and iterative
genetic matching. Hybrid approaches that use handcrafted feature extraction
followed by estimation by deep learning methods have been outlined. We have
investigated and compared, wherever possible, pure deep learning based
approaches (single stage and multi stage) for this problem. Comprehensive
details of the various accuracy measures and metrics have been illustrated. For
the purpose of giving a clear overview, the characteristics of relevant
datasets are discussed. The trends that prevailed from the infancy of this
problem until now have also been highlighted.
Comment: Accepted at the International Joint Conference on Computer Vision and Pattern Recognition (CCVPR) 201
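
As a generic example of the kind of accuracy measure covered in this literature (not a metric the survey itself prescribes), the geodesic angular error between an estimated and a ground-truth rotation matrix is a common way to score the pose component:

# Geodesic rotation error in degrees, a metric commonly used for pose estimation.
import numpy as np

def rotation_error_deg(R_est, R_gt):
    """Angle of the relative rotation between the estimate and the ground truth."""
    R_rel = R_est @ R_gt.T
    # trace(R_rel) = 1 + 2*cos(theta); clipping guards against numerical drift.
    cos_theta = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    return np.degrees(np.arccos(cos_theta))

# Example: a 10-degree rotation about the z-axis against an identity ground truth.
theta = np.radians(10.0)
R_est = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0, 0.0, 1.0]])
print(rotation_error_deg(R_est, np.eye(3)))  # ~10.0
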
LRF-Net: Learning Local Reference Frames for 3D Local Shape Description and Matching
The local reference frame (LRF) plays a critical role in 3D local shape
description and matching. However, most existing LRFs are hand-crafted and
suffer from limited repeatability and robustness. This paper presents the first
attempt to learn an LRF via a Siamese network that needs weak supervision only.
In particular, we argue that each neighboring point on the local surface makes
a unique contribution to LRF construction, and we measure these contributions via
learned weights. Extensive analysis and comparative experiments on three public
datasets addressing different application scenarios have demonstrated that
LRF-Net is more repeatable and robust than several state-of-the-art LRF methods
(LRF-Net is only trained on one dataset). In addition, LRF-Net can
significantly boost the local shape description and 6-DoF pose estimation
performance when matching 3D point clouds.
Comment: 28 pages, 14 figures
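
As a sketch of how per-neighbor weights can drive LRF construction (the weights below are placeholders; in LRF-Net they would be predicted by the learned network, whose exact formulation is not reproduced here), one can build the frame from a weighted covariance of the local neighborhood:

# Weighted local reference frame from per-neighbor contributions (illustrative only).
import numpy as np

def weighted_lrf(keypoint, neighbors, weights):
    """Build a 3x3 local reference frame from weighted neighbor offsets."""
    offsets = neighbors - keypoint                  # (N, 3) offsets from the keypoint
    w = weights / (weights.sum() + 1e-12)           # normalized per-neighbor weights
    cov = (offsets * w[:, None]).T @ offsets        # weighted covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    z = eigvecs[:, 0]                               # z-axis ~ direction of least spread
    if (offsets * w[:, None]).sum(axis=0) @ z > 0:  # disambiguate the z-axis sign
        z = -z
    proj = offsets - (offsets @ z)[:, None] * z[None, :]  # project onto tangent plane
    x = (proj * w[:, None]).sum(axis=0)             # x-axis from weighted projections
    x /= np.linalg.norm(x) + 1e-12
    y = np.cross(z, x)                              # complete a right-handed frame
    return np.stack([x, y, z], axis=0)

# Toy usage with uniform weights standing in for learned ones.
rng = np.random.default_rng(0)
pts = rng.normal(size=(50, 3))
frame = weighted_lrf(np.zeros(3), pts, np.ones(50))
print(frame @ frame.T)  # approximately the identity (orthonormal axes)
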