2,611 research outputs found
Latent variable pictorial structure for human pose estimation on depth images
Prior models of human pose play a key role in state-of-the-art techniques for monocular pose estimation. However, a simple Gaussian model cannot adequately represent prior knowledge of the pose diversity in depth images. In this paper, we develop a latent variable-based prior model by introducing a latent variable into the general pictorial structure. Two key characteristics of our model (which we call the Latent Variable Pictorial Structure) are as follows: (1) it adaptively adopts prior pose models based on the estimated
value of the latent variable; and (2) it enables the learning of a more accurate part classifier. Experimental
results demonstrate that the proposed method outperforms other state-of-the-art methods in recognition rate on public datasets.
Human Pose Estimation from Monocular Images : a Comprehensive Survey
Human pose estimation refers to the estimation of the location of body parts and how they are connected in an image. Human pose estimation from monocular images has wide applications (e.g., image indexing). Several surveys on human pose estimation can be found in the literature, but each focuses on a certain category, for example, model-based approaches or human motion analysis. As far as we know, an overall review of this problem domain has yet to be provided. Furthermore, recent advancements based on deep learning have brought novel algorithms for this problem. In this paper, a comprehensive survey of human pose estimation from monocular images is carried out, including milestone works and recent advancements. Based on one standard pipeline for the solution of computer vision problems, this survey splits the problem into several modules: feature extraction and description, human body models, and modeling methods. Problem modeling methods are approached based on two means of categorization in this survey. One way to categorize includes top-down and bottom-up methods, and another way includes generative and discriminative methods. Considering the fact that one direct application of human pose estimation is to provide initialization for automatic video surveillance, there are additional sections for motion-related methods in all modules: motion features, motion models, and motion-based methods. Finally, the paper also collects 26 publicly available data sets for validation and provides error measurement methods that are frequently used.
Learning Human Pose Estimation Features with Convolutional Networks
This paper introduces a new architecture for human pose estimation using a
multi-layer convolutional network architecture and a modified learning
technique that learns low-level features and higher-level weak spatial models.
Unconstrained human pose estimation is one of the hardest problems in computer
vision, and our new architecture and learning schema show significant
improvement over the current state-of-the-art results. The main contribution of
this paper is showing, for the first time, that a specific variation of deep
learning is able to outperform all existing traditional architectures on this
task. The paper also discusses several lessons learned while researching
alternatives, most notably, that it is possible to learn strong low-level
feature detectors on features that might even just cover a few pixels in the
image. Higher-level spatial models improve the overall result somewhat, but to
a much lesser extent than expected. Many researchers previously argued that the
kinematic structure and top-down information are crucial for this domain, but
with our purely bottom-up approach and weak spatial model, we could improve on other more
complicated architectures that currently produce the best results. This mirrors
what many other researchers, like those in the speech recognition, object
recognition, and other domains have experienced.
Stereo Pictorial Structure for 2D Articulated Human Pose Estimation
In this paper, we consider the problem of 2D human
pose estimation on stereo image pairs. In particular,
we aim at estimating the location, orientation and scale of
upper-body parts of people detected in stereo image pairs
from realistic stereo videos that can be found on the Internet.
To address this task, we propose a novel pictorial structure
model to exploit the stereo information included in such
stereo image pairs: the Stereo Pictorial Structure (SPS). To
validate our proposed model, we contribute a new annotated
dataset of stereo image pairs, the Stereo Human Pose Estimation
Dataset (SHPED), obtained from YouTube stereoscopic
video sequences, depicting people in challenging poses
and diverse indoor and outdoor scenarios. The experimental
results on SHPED indicate that SPS improves on state-of-the-art
monocular models thanks to the appropriate use of
the stereo information.
- …