GANerated Hands for Real-time 3D Hand Tracking from Monocular RGB
We address the highly challenging problem of real-time 3D hand tracking based
on a monocular RGB-only sequence. Our tracking method combines a convolutional
neural network with a kinematic 3D hand model, such that it generalizes well to
unseen data, is robust to occlusions and varying camera viewpoints, and leads
to anatomically plausible as well as temporally smooth hand motions. For
training our CNN we propose a novel approach for the synthetic generation of
training data that is based on a geometrically consistent image-to-image
translation network. To be more specific, we use a neural network that
translates synthetic images to "real" images, such that the so-generated images
follow the same statistical distribution as real-world hand images. For
training this translation network we combine an adversarial loss and a
cycle-consistency loss with a geometric consistency loss in order to preserve
geometric properties (such as hand pose) during translation. We demonstrate
that our hand tracking system outperforms the current state-of-the-art on
challenging RGB-only footage.
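As a rough illustration of how the three loss terms named in the abstract above could be combined when training the translation network, here is a minimal NumPy sketch. The abstract does not give the exact form of each term or the weights, so everything here is an assumption for illustration: an L1 cycle term, a binary cross-entropy on hand silhouette masks as the geometric term, a non-saturating generator loss, and the hypothetical weights `w_cyc` and `w_geo`.

```python
import numpy as np

def adversarial_loss(disc_prob_fake, eps=1e-7):
    """Generator-side loss: small when the discriminator rates translated images as real.
    (Assumed non-saturating form; the abstract only says 'adversarial loss'.)"""
    return float(-np.mean(np.log(np.clip(disc_prob_fake, eps, 1.0))))

def cycle_consistency_loss(x, x_cycled):
    """L1 distance between an image and its synthetic -> real -> synthetic round trip."""
    return float(np.mean(np.abs(x - x_cycled)))

def geometric_consistency_loss(mask_src, mask_translated_prob, eps=1e-7):
    """Assumed geometric term: cross-entropy between the source silhouette mask and the
    silhouette probability predicted from the translated image, so the pose is preserved."""
    p = np.clip(mask_translated_prob, eps, 1.0 - eps)
    return float(-np.mean(mask_src * np.log(p) + (1.0 - mask_src) * np.log(1.0 - p)))

def total_translation_loss(adv, cyc, geo, w_cyc=10.0, w_geo=1.0):
    """Weighted sum of the three terms; w_cyc and w_geo are hypothetical weights."""
    return adv + w_cyc * cyc + w_geo * geo
```

In a real training loop these terms would be computed from network outputs inside an autodiff framework; the sketch only shows how the objective is assembled.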
VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera
We present the first real-time method to capture the full global 3D skeletal
pose of a human in a stable, temporally consistent manner using a single RGB
camera. Our method combines a new convolutional neural network (CNN) based pose
regressor with kinematic skeleton fitting. Our novel fully-convolutional pose
formulation regresses 2D and 3D joint positions jointly in real time and does
not require tightly cropped input frames. A real-time kinematic skeleton
fitting method uses the CNN output to yield temporally stable 3D global pose
reconstructions on the basis of a coherent kinematic skeleton. This makes our
approach the first monocular RGB method usable in real-time applications such
as 3D character control; thus far, the only monocular methods for such
applications employed specialized RGB-D cameras. Our method's accuracy is
quantitatively on par with the best offline 3D monocular RGB pose estimation
methods. Our results are qualitatively comparable to, and sometimes better
than, results from monocular RGB-D approaches, such as the Kinect. However, we
show that our approach is more broadly applicable than RGB-D solutions, i.e. it
works for outdoor scenes, community videos, and low quality commodity RGB
cameras.
Comment: Accepted to SIGGRAPH 201
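The abstract above describes fitting a kinematic skeleton to the CNN's 2D and 3D joint predictions so that the result is temporally stable. One way to picture such a fit is as an energy to be minimized over the skeleton pose. The sketch below is only an assumption-laden illustration: the three terms, the weights `w_2d` and `w_smooth`, the pinhole intrinsics `K`, and the simplification that all quantities share one camera frame are not taken from the abstract.

```python
import numpy as np

def fitting_energy(joints3d, cnn_joints3d, cnn_joints2d, prev_joints3d, K,
                   w_2d=1.0, w_smooth=0.1):
    """Hypothetical per-frame energy for a candidate skeleton pose.

    joints3d      : (J, 3) candidate joint positions in the camera frame
    cnn_joints3d  : (J, 3) 3D joint positions regressed by the CNN
    cnn_joints2d  : (J, 2) 2D joint positions regressed by the CNN (pixels)
    prev_joints3d : (J, 3) fitted joints from the previous frame
    K             : (3, 3) pinhole camera intrinsics
    """
    # Agreement with the CNN's 3D prediction.
    e_3d = np.sum((joints3d - cnn_joints3d) ** 2)

    # Agreement of the projected skeleton with the CNN's 2D prediction.
    proj = (K @ joints3d.T).T
    proj = proj[:, :2] / proj[:, 2:3]
    e_2d = np.sum((proj - cnn_joints2d) ** 2)

    # Temporal smoothness: penalize large changes relative to the previous frame.
    e_smooth = np.sum((joints3d - prev_joints3d) ** 2)

    return float(e_3d + w_2d * e_2d + w_smooth * e_smooth)
```

A full system would optimize joint angles of a skeleton with fixed bone lengths rather than free joint positions; the free-position form is used here only to keep the example short.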
SPLODE: Semi-Probabilistic Point and Line Odometry with Depth Estimation from RGB-D Camera Motion
Active depth cameras suffer from several limitations, which cause incomplete
and noisy depth maps, and may consequently affect the performance of RGB-D
Odometry. To address this issue, this paper presents a visual odometry method
based on point and line features that leverages both measurements from a depth
sensor and depth estimates from camera motion. Depth estimates are generated
continuously by a probabilistic depth estimation framework for both types of
features to compensate for the lack of depth measurements and inaccurate
feature depth associations. The framework explicitly models the uncertainty of
triangulating depth from both point and line observations to validate and
obtain precise estimates. Furthermore, depth measurements are exploited by
propagating them through a depth map registration module and using a
frame-to-frame motion estimation method that considers 3D-to-2D and 2D-to-3D
reprojection errors independently. Results on RGB-D sequences captured in
large indoor and outdoor scenes, where depth sensor limitations are critical,
show that the combination of depth measurements and estimates through our
approach is able to overcome the absence and inaccuracy of depth measurements.
Comment: IROS 201
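To make the two reprojection directions mentioned in the abstract above more concrete, the following sketch computes point reprojection residuals for a frame-to-frame motion `(R, t)`. It is one plausible reading, not the paper's formulation: here "3D-to-2D" takes points with depth in frame A and compares their projection with pixel observations in frame B, and "2D-to-3D" does the reverse using the inverse motion. The function names, the pinhole intrinsics `K`, and the restriction to point features (no lines) are assumptions.

```python
import numpy as np

def project(K, X):
    """Pinhole projection of (N, 3) camera-frame points to (N, 2) pixel coordinates."""
    x = (K @ X.T).T
    return x[:, :2] / x[:, 2:3]

def reprojection_residuals(K, R, t, pts3d_src, pix_dst):
    """Move 3D points from the source frame into the destination frame with (R, t),
    project them, and subtract the observed pixels in the destination frame."""
    X_dst = (R @ pts3d_src.T).T + t
    return project(K, X_dst) - pix_dst

def inverse_motion(R, t):
    """Inverse of the rigid motion X_dst = R @ X_src + t."""
    return R.T, -R.T @ t

def bidirectional_residuals(K, R, t, pts3d_a, pix_a, pts3d_b, pix_b):
    """Evaluate both directions separately so each can be weighted on its own."""
    forward = reprojection_residuals(K, R, t, pts3d_a, pix_b)            # A (3D) -> B (2D)
    R_inv, t_inv = inverse_motion(R, t)
    backward = reprojection_residuals(K, R_inv, t_inv, pts3d_b, pix_a)   # B (3D) -> A (2D)
    return forward, backward
```

Keeping the two residual sets separate means a feature that has a depth value in only one of the two frames can still contribute one of the two terms.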