Lifting from the Deep: Convolutional 3D Pose Estimation from a Single Image
We propose a unified formulation for the problem of 3D human pose estimation
from a single raw RGB image that reasons jointly about 2D joint estimation and
3D pose reconstruction to improve both tasks. We take an integrated approach
that fuses probabilistic knowledge of 3D human pose with a multi-stage CNN
architecture and uses the knowledge of plausible 3D landmark locations to
refine the search for better 2D locations. The entire process is trained
end-to-end, is extremely efficient, and obtains state-of-the-art results on
Human3.6M, outperforming previous approaches on both 2D and 3D errors.
Comment: Paper presented at CVPR 1
Quality-based Multimodal Classification Using Tree-Structured Sparsity
Recent studies have demonstrated advantages of information fusion based on
sparsity models for multimodal classification. Among several sparsity models,
tree-structured sparsity provides a flexible framework for extraction of
cross-correlated information from different sources and for enforcing group
sparsity at multiple granularities. However, the existing algorithm only solves
an approximated version of the cost functional and the resulting solution is
not necessarily sparse at group levels. This paper reformulates the
tree-structured sparse model for the multimodal classification task. An accelerated
proximal algorithm is proposed to solve the optimization problem, which is an
efficient tool for feature-level fusion among either homogeneous or
heterogeneous sources of information. In addition, a (fuzzy-set-theoretic)
possibilistic scheme is proposed to weight the available modalities, based on
their respective reliability, in a joint optimization problem for finding the
sparsity codes. This approach provides a general framework for quality-based
fusion that offers added robustness to several sparsity-based multimodal
classification algorithms. To demonstrate their efficacy, the proposed methods
are evaluated on three different applications - multiview face recognition,
multimodal face recognition, and target classification.
Comment: To appear in 2014 IEEE Conference on Computer Vision and Pattern
Recognition (CVPR 2014)