VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera
We present the first real-time method to capture the full global 3D skeletal
pose of a human in a stable, temporally consistent manner using a single RGB
camera. Our method combines a new convolutional neural network (CNN) based pose
regressor with kinematic skeleton fitting. Our novel fully-convolutional pose
formulation regresses 2D and 3D joint positions jointly in real time and does
not require tightly cropped input frames. A real-time kinematic skeleton
fitting method uses the CNN output to yield temporally stable 3D global pose
reconstructions on the basis of a coherent kinematic skeleton. This makes our
approach the first monocular RGB method usable in real-time applications such
as 3D character control; thus far, the only monocular methods for such
applications employed specialized RGB-D cameras. Our method's accuracy is
quantitatively on par with the best offline 3D monocular RGB pose estimation
methods. Our results are qualitatively comparable to, and sometimes better
than, results from monocular RGB-D approaches, such as the Kinect. However, we
show that our approach is more broadly applicable than RGB-D solutions, i.e. it
works for outdoor scenes, community videos, and low-quality commodity RGB
cameras.
Comment: Accepted to SIGGRAPH 201
Online Discrimination of Nonlinear Dynamics with Switching Differential Equations
How to recognise whether an observed person walks or runs? We consider a
dynamic environment where observations (e.g. the posture of a person) are
caused by different dynamic processes (walking or running) which are active one
at a time and which may transition from one to another at any time. For this
setup, switching dynamic models have been suggested previously, mostly for
linear and nonlinear dynamics in discrete time. Motivated by basic principles
of computations in the brain (dynamic, internal models) we suggest a model for
switching nonlinear differential equations. The switching process in the model
is implemented by a Hopfield network and we use parametric dynamic movement
primitives to represent arbitrary rhythmic motions. The model generates
observed dynamics by linearly interpolating the primitives weighted by the
switching variables and it is constructed such that standard filtering
algorithms can be applied. In two experiments with synthetic planar motion and
a human motion capture data set we show that inference with the unscented
Kalman filter can successfully discriminate several dynamic processes online.
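The blending step described in this abstract, where the observed dynamics are a linear interpolation of primitives weighted by the switching variables, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two planar rotation fields stand in for the paper's parametric dynamic movement primitives, the weights are fixed by hand rather than produced by a Hopfield network, and the names `blended_velocity` and `euler_step` are hypothetical.

```python
from typing import Callable, Sequence, Tuple

State = Tuple[float, float]
Primitive = Callable[[State], State]


def slow_rotation(x: State) -> State:
    # Hypothetical stand-in for one rhythmic primitive (e.g. "walking"):
    # a planar rotation with angular speed 1.
    return (-1.0 * x[1], 1.0 * x[0])


def fast_rotation(x: State) -> State:
    # Hypothetical stand-in for a second primitive (e.g. "running"):
    # the same rotation with angular speed 3.
    return (-3.0 * x[1], 3.0 * x[0])


def blended_velocity(x: State,
                     primitives: Sequence[Primitive],
                     weights: Sequence[float]) -> State:
    # Linear interpolation of the primitive velocities, weighted by the
    # switching variables (here: fixed numbers chosen by hand).
    vx = sum(w * p(x)[0] for p, w in zip(primitives, weights))
    vy = sum(w * p(x)[1] for p, w in zip(primitives, weights))
    return (vx, vy)


def euler_step(x: State,
               primitives: Sequence[Primitive],
               weights: Sequence[float],
               dt: float) -> State:
    # One explicit Euler step of the blended differential equation.
    vx, vy = blended_velocity(x, primitives, weights)
    return (x[0] + dt * vx, x[1] + dt * vy)


if __name__ == "__main__":
    prims = [slow_rotation, fast_rotation]
    print(blended_velocity((1.0, 0.0), prims, [1.0, 0.0]))
    print(blended_velocity((1.0, 0.0), prims, [0.5, 0.5]))
```

When one weight is 1 and the others are 0, the blended dynamics reduce to a single primitive; intermediate weights give a mixture. In the paper, inferring those weights online is the job of the filter.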
Use of 3D body motion to freeform surface design
This paper presents a novel surface modelling approach that uses a 3D motion capture system. To design a large surface, a network of splines is first set up. Artists or designers wearing motion markers on their hands can then reshape the splines by hand. They can literally move their bodies freely to any position to perform their tasks, and they can move their hands in 3D free space to detail surface characteristics with gestures. All design motions are recorded by the motion capture system and converted into corresponding 3D curves and surfaces. The paper reports this surface design method and some case studies.
RLFC: Random Access Light Field Compression using Key Views and Bounded Integer Encoding
We present a new hierarchical compression scheme for encoding light field
images (LFI) that is suitable for interactive rendering. Our method (RLFC)
exploits redundancies in the light field images by constructing a tree
structure. The top level (root) of the tree captures the common high-level
details across the LFI, and other levels (children) of the tree capture
specific low-level details of the LFI. Our decompressing algorithm corresponds
to tree traversal operations and gathers the values stored at different levels
of the tree. Furthermore, we use bounded integer sequence encoding which
provides random access and fast hardware decoding for compressing the blocks of
children of the tree. We have evaluated our method for 4D two-plane
parameterized light fields. The compression rates vary from 0.08 to 2.5 bits per
pixel (bpp), resulting in compression ratios of around 200:1 to 20:1 for a PSNR
quality of 40 to 50 dB. The decompression times for decoding the blocks of LFI
are 1 to 3 microseconds per channel on an NVIDIA GTX-960, and we can render new
views with a resolution of 512×512 at 200 fps. Our overall scheme is simple to
implement and involves only bit manipulations and integer arithmetic
operations.
Comment: Accepted for publication at Symposium on Interactive 3D Graphics and
Games (I3D '19)
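The idea that decompression is a tree traversal gathering values stored at different levels can be sketched with a toy example. This is an assumption-laden illustration, not the RLFC codec: the `Node` type, the `decode` function, and the choice to combine levels by adding integer residuals to the root's values are all hypothetical, and bounded integer sequence encoding is not modelled.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    # Integer values stored at this level of the tree (one per pixel).
    values: List[int]
    # Child nodes keyed by a hypothetical view identifier.
    children: Dict[str, "Node"] = field(default_factory=dict)


def decode(root: Node, path: List[str]) -> List[int]:
    # Walk from the root along `path`, gathering the values stored at each
    # level. Assumption: levels combine by per-pixel integer addition.
    out = list(root.values)
    node = root
    for key in path:
        node = node.children[key]
        out = [a + b for a, b in zip(out, node.values)]
    return out


if __name__ == "__main__":
    # The root holds details shared across views; children hold small
    # view-specific corrections.
    root = Node([100, 120], {
        "view0": Node([1, -2]),
        "view1": Node([0, 3]),
    })
    print(decode(root, ["view0"]))
    print(decode(root, ["view1"]))
```

Because each view is reconstructed from the root plus only the nodes on its own path, any single view can be decoded without touching the others, which is the sense of "random access" in this toy setting.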