421 research outputs found

Human Pose Estimation from Monocular Images : a Comprehensive Survey

Author: Bouwmans Thierry
Gong Wenjuan
Gonzàlez i Sabaté Jordi
Sobral Andrews
Tu Changhe
Zahzah El-hadi
Zhang Xuena
Publication venue: 'MDPI AG'
Publication date: 01/01/2016

Human pose estimation refers to the estimation of the location of body parts and how they are connected in an image. Human pose estimation from monocular images has wide applications (e.g., image indexing). Several surveys on human pose estimation can be found in the literature, but they focus on a certain category; for example, model-based approaches or human motion analysis. As far as we know, an overall review of this problem domain has yet to be provided. Furthermore, recent advancements based on deep learning have brought novel algorithms for this problem. In this paper, a comprehensive survey of human pose estimation from monocular images is carried out including milestone works and recent advancements. Based on one standard pipeline for the solution of computer vision problems, this survey splits the problem into several modules: feature extraction and description, human body models, and modeling methods. Problem modeling methods are approached based on two means of categorization in this survey. One way to categorize includes top-down and bottom-up methods, and another way includes generative and discriminative methods. Considering the fact that one direct application of human pose estimation is to provide initialization for automatic video surveillance, there are additional sections for motion-related methods in all modules: motion features, motion models, and motion-based methods. Finally, the paper also collects 26 publicly available data sets for validation and provides error measurement methods that are frequently used.

Multidisciplinary Digital Publishing Institute

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Directory of Open Access Journals

PubMed Central

Diposit Digital de Documents de la UAB

Loose-limbed People: Estimating 3D Human Pose and Motion Using Non-parametric Belief Propagation

Author: Horst Haussecker
Leonid Sigal
Michael Isard
Michael J. Black
Publication venue: Springer Nature
Publication date: 01/01/2011

Springer - Publisher Connector

MPG.PuRe

Survey on 2D and 3D human pose recovery

Author: Angulo Bahón Cecilio
Escalera Guerrero Sergio
Perez Sala Xavier
Publication venue: 'IOS Press'
Publication date: 01/01/2012

Human Pose Recovery approaches have been studied in the field of Computer Vision for the last 40 years. Several approaches have been reported, and significant improvements have been obtained in both data representation and model design. However, the problem of Human Pose Recovery in uncontrolled environments is far from being solved. In this paper, we define a global taxonomy to group the model-based methods and discuss their main advantages and drawbacks. Peer Reviewed. Postprint (published version).

UPCommons. Portal del coneixement obert de la UPC

On the 3D point cloud for human-pose estimation

Author: Chan Kai-Chi
Publication venue: 'Purdue University (bepress)'
Publication date: 01/01/2016

This thesis aims at investigating methodologies for estimating a human pose from a 3D point cloud that is captured by a static depth sensor. Human-pose estimation (HPE) is important for a range of applications, such as human-robot interaction, healthcare, surveillance, and so forth. Yet, HPE is challenging because of the uncertainty in sensor measurements and the complexity of human poses. In this research, we focus on addressing challenges related to two crucial components in the estimation process, namely, human-pose feature extraction and human-pose modeling. In feature extraction, the main challenge involves reducing feature ambiguity. We propose a 3D-point-cloud feature called viewpoint and shape feature histogram (VISH) to reduce feature ambiguity by capturing geometric properties of the 3D point cloud of a human. The feature extraction consists of three steps: 3D-point-cloud pre-processing, hierarchical structuring, and feature extraction. In the pre-processing step, 3D points corresponding to a human are extracted and outliers from the environment are removed to retain the 3D points of interest. This step is important because it allows us to reduce the number of 3D points by keeping only those points that correspond to the human body for further processing. In the hierarchical structuring, the pre-processed 3D point cloud is partitioned and replicated into a tree structure as nodes. Viewpoint feature histogram (VFH) and shape features are extracted from each node in the tree to provide a descriptor to represent each node. As the features are obtained based on histograms, coarse-level details are highlighted in large regions and fine-level details are highlighted in small regions. Therefore, the features from the point cloud in the tree can capture coarse level to fine level information to reduce feature ambiguity. 
In human-pose modeling, the main challenges involve reducing the dimensionality of human-pose space and designing appropriate factors that represent the underlying probability distributions for estimating human poses. To reduce the dimensionality, we propose a non-parametric action-mixture model (AMM). It represents high-dimensional human-pose space using low-dimensional manifolds in searching human poses. In each manifold, a probability distribution is estimated based on feature similarity. The distributions in the manifolds are then redistributed according to the stationary distribution of a Markov chain that models the frequency of human actions. After the redistribution, the manifolds are combined according to a probability distribution determined by action classification. Experiments were conducted using VISH features as input to the AMM. The results showed that the overall error and standard deviation of the AMM were reduced by about 7.9% and 7.1%, respectively, compared with a model without action classification. To design appropriate factors, we consider the AMM as a Bayesian network and propose a mapping that converts the Bayesian network to a neural network called NN-AMM. The proposed mapping consists of two steps: structure identification and parameter learning. In structure identification, we have developed a bottom-up approach to build a neural network while preserving the Bayesian-network structure. In parameter learning, we have created a part-based approach to learn synaptic weights by decomposing a neural network into parts. Based on the concept of distributed representation, the NN-AMM is further modified into a scalable neural network called NND-AMM. A neural-network-based system is then built by using VISH features to represent 3D-point-cloud input and the NND-AMM to estimate 3D human poses. The results showed that the proposed mapping can be utilized to design AMM factors automatically. 
The NND-AMM can provide more accurate human-pose estimates with fewer hidden neurons than both the AMM and NN-AMM can. Both the NN-AMM and NND-AMM can adapt to different types of input, showing the advantage of using neural networks to design factors.

Purdue E-Pubs

A Deep-structured Conditional Random Field Model for Object Silhouette Tracking

Author: Azimifar Zohreh
Shafiee Mohammad
Wong Alexander
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 04/08/2015

In this work, we introduce a deep-structured conditional random field (DS-CRF) model for the purpose of state-based object silhouette tracking. The proposed DS-CRF model consists of a series of state layers, where each state layer spatially characterizes the object silhouette at a particular point in time. The interactions between adjacent state layers are established by inter-layer connectivity dynamically determined based on inter-frame optical flow. By incorporating both spatial and temporal context in a dynamic fashion within such a deep-structured probabilistic graphical model, the proposed DS-CRF model allows us to develop a framework that can accurately and efficiently track object silhouettes that can change greatly over time, as well as under different situations such as occlusion and multiple targets within the scene. Experimental results using video surveillance datasets containing different scenarios such as occlusion and multiple targets showed that the proposed DS-CRF approach provides strong object silhouette tracking performance when compared to baseline methods such as mean-shift tracking, as well as state-of-the-art methods such as context tracking and boosted particle filtering. Comment: 17 pages

arXiv.org e-Print Archive

Public Library of Science (PLOS)

Directory of Open Access Journals

PubMed Central

Action Recognition in Videos: from Motion Capture Labs to the Web

Author: Ana Paula Br
Arnaldo Albuquerque De Araújo
De Almeida
Eduardo Alves
Jussara Marques
Publication venue
Publication date: 17/06/2010

This paper presents a survey of human action recognition approaches based on visual data recorded from a single video camera. We propose an organizing framework which highlights the evolution of the area, with techniques moving from heavily constrained motion capture scenarios towards more challenging, realistic, "in the wild" videos. The proposed organization is based on the representation used as input for the recognition task, emphasizing the hypotheses assumed and thus the constraints imposed on the type of video that each technique is able to address. Making the hypotheses and constraints explicit makes the framework particularly useful for selecting a method, given an application. Another advantage of the proposed organization is that it allows the newest approaches to be categorized seamlessly alongside traditional ones, while providing an insightful perspective on the evolution of the action recognition task up to now. That perspective is the basis for the discussion at the end of the paper, where we also present the main open issues in the area. Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4 tables

arXiv.org e-Print Archive

CiteSeerX

Loose-limbed People: Estimating 3D Human Pose and Motion Using Non-parametric Belief Propagation

Author: A. Agarwal
A. Balan
A. Banerjee
A. Doucet
A. Elgammal
A. Opelt
A. T. Ihler
B. Rosenhahn
C. Bregler
C. Sminchisescu
C.-S. Lee
D. A. Forsyth
D. C. Hogg
D. Comaniciu
D. Gavrila
D. Knossow
D. Koller
D. Marr
D. Ramanan
E. Sudderth
G. Cooper
G. E. Hinton
G. Elidan
G. Hua
G. K. M. Cheung
G. Mori
G. Shakhnarovich
G. Wywill
H. Sidenbladh
Horst Haussecker
I. A. Kakadiaris
J. Canny
J. Deutscher
J. Foley
J. Gall
J. MacCormick
J. Rodgers
J. Sun
K. Choo
K. Grauman
K. Kinoshita
L. Bo
L. Sigal
Leonid Sigal
M. Andriluka
M. Bergtholdt
M. Eichner
M. Fischler
M. I. Jordan
M. Isard
M. Siddiqui
M. Weber
Michael Isard
Michael J. Black
P. Felzenszwalb
P. Guan
P. Viola
P. Wang
R. Fergus
R. Kehl
R. Li
R. Navaratnam
R. Nevatia
R. Poppe
R. Rosales
R. Urtasun
R. W. Poppe
S. Bhatia
S. Corazza
S. Ioffe
S. Ju
S. Wachter
S. Yonemoto
T. Moeslund
T.-J. Cham
T.-P. Tian
V. John
W. G. Cochran
X. Lan
X. Xu
Y. Weiss
Y. Wu
Publication venue: 'Springer Science and Business Media LLC'

Crossref