41,265 research outputs found
Recurrent 3D Pose Sequence Machines
3D human articulated pose recovery from monocular image sequences is very
challenging due to the diverse appearances, viewpoints, occlusions, and also
the human 3D pose is inherently ambiguous from the monocular imagery. It is
thus critical to exploit rich spatial and temporal long-range dependencies
among body joints for accurate 3D pose sequence prediction. Existing approaches
usually manually design some elaborate prior terms and human body kinematic
constraints for capturing structures, which are often insufficient to exploit
all intrinsic structures and not scalable for all scenarios. In contrast, this
paper presents a Recurrent 3D Pose Sequence Machine(RPSM) to automatically
learn the image-dependent structural constraint and sequence-dependent temporal
context by using a multi-stage sequential refinement. At each stage, our RPSM
is composed of three modules to predict the 3D pose sequences based on the
previously learned 2D pose representations and 3D poses: (i) a 2D pose module
extracting the image-dependent pose representations, (ii) a 3D pose recurrent
module regressing 3D poses and (iii) a feature adaption module serving as a
bridge between module (i) and (ii) to enable the representation transformation
from 2D to 3D domain. These three modules are then assembled into a sequential
prediction framework to refine the predicted poses with multiple recurrent
stages. Extensive evaluations on the Human3.6M dataset and HumanEva-I dataset
show that our RPSM outperforms all state-of-the-art approaches for 3D pose
estimation.Comment: Published in CVPR 201
VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera
We present the first real-time method to capture the full global 3D skeletal
pose of a human in a stable, temporally consistent manner using a single RGB
camera. Our method combines a new convolutional neural network (CNN) based pose
regressor with kinematic skeleton fitting. Our novel fully-convolutional pose
formulation regresses 2D and 3D joint positions jointly in real time and does
not require tightly cropped input frames. A real-time kinematic skeleton
fitting method uses the CNN output to yield temporally stable 3D global pose
reconstructions on the basis of a coherent kinematic skeleton. This makes our
approach the first monocular RGB method usable in real-time applications such
as 3D character control---thus far, the only monocular methods for such
applications employed specialized RGB-D cameras. Our method's accuracy is
quantitatively on par with the best offline 3D monocular RGB pose estimation
methods. Our results are qualitatively comparable to, and sometimes better
than, results from monocular RGB-D approaches, such as the Kinect. However, we
show that our approach is more broadly applicable than RGB-D solutions, i.e. it
works for outdoor scenes, community videos, and low quality commodity RGB
cameras.Comment: Accepted to SIGGRAPH 201
Stratified decision forests for accurate anatomical landmark localization in cardiac images
Accurate localization of anatomical landmarks is an important step in medical imaging, as it provides useful prior information for subsequent image analysis and acquisition methods. It is particularly useful for initialization of automatic image analysis tools (e.g. segmentation and registration) and detection of scan planes for automated image acquisition. Landmark localization has been commonly performed using learning based approaches, such as classifier and/or regressor models. However, trained models may not generalize well in heterogeneous datasets when the images contain large differences due to size, pose and shape variations of organs. To learn more data-adaptive and patient specific models, we propose a novel stratification based training model, and demonstrate its use in a decision forest. The proposed approach does not require any additional training information compared to the standard model training procedure and can be easily integrated into any decision tree framework. The proposed method is evaluated on 1080 3D highresolution and 90 multi-stack 2D cardiac cine MR images. The experiments show that the proposed method achieves state-of-theart landmark localization accuracy and outperforms standard regression and classification based approaches. Additionally, the proposed method is used in a multi-atlas segmentation to create a fully automatic segmentation pipeline, and the results show that it achieves state-of-the-art segmentation accuracy
- …