Markerless Facial Motion Capture
With the ever-rising capabilities of motion capture systems, this project explored markerless facial motion capture using the Kinect Sensor for Xbox. Many systems today still use markers and retarget the data after a motion capture recording. This project used a simpler process that was quick to set up and able to display its effects live. An off-the-shelf system was built from a computer, a Kinect Sensor, a plug-in from Brekel, and Autodesk software. The first goal was to create a process able to capture and project live facial motion for under 500 USD; anything above that was considered more of a professional studio setup. With an inexpensive setup, amateur users can do motion capture outside of a studio. The second goal was to observe the audiences' responses and see whether the interaction felt more mechanical than human.
Rule Of Thumb: Deep derotation for improved fingertip detection
We investigate a novel global orientation regression approach for articulated
objects using a deep convolutional neural network. This is integrated with an
in-plane image derotation scheme, DeROT, to tackle the problem of per-frame
fingertip detection in depth images. The method reduces the complexity of
learning in the space of articulated poses which is demonstrated by using two
distinct state-of-the-art learning based hand pose estimation methods applied
to fingertip detection. Significant classification improvements are shown over
the baseline implementation. Our framework involves no tracking, kinematic
constraints or explicit prior model of the articulated object in hand. To
support our approach we also describe a new pipeline for high accuracy magnetic
annotation and labeling of objects imaged by a depth camera.
Comment: To be published in proceedings of BMVC 201
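The core idea of the abstract above, canceling a hand's predicted in-plane rotation so that downstream fingertip detectors see poses in a canonical orientation, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `predicted_angle_deg` input stands in for the output of the orientation-regression CNN, which is not implemented here.

```python
import numpy as np
from scipy.ndimage import rotate

def derotate(depth_img, predicted_angle_deg):
    """In-plane derotation: rotate the depth image by the negative of the
    predicted global orientation, so the articulated object appears in a
    canonical orientation. `predicted_angle_deg` is assumed to come from a
    separate orientation-regression network (hypothetical here)."""
    return rotate(depth_img, -predicted_angle_deg,
                  reshape=False, order=1, mode="nearest")
```

Training and inference on derotated images shrinks the space of articulated poses the detector must cover, which is the source of the reported classification gains.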
Markerless structure-based multi-sensor calibration for free viewpoint video capture
Free-viewpoint capture technologies have recently started demonstrating impressive results. Being able to capture
human performances in full 3D is a very promising technology for a variety of applications. However, the setup
of the capturing infrastructure is usually expensive and requires trained personnel. In this work we focus on one
practical aspect of setting up a free-viewpoint capturing system, the spatial alignment of the sensors. Our work aims
at simplifying the external calibration process that typically requires significant human intervention and technical
knowledge. Our method uses an easy-to-assemble structure and, unlike similar works, does not rely on markers or
features. Instead, we exploit a priori knowledge of the structure's geometry to establish correspondences for
the minimally overlapping viewpoints typically found in free-viewpoint capture setups. These correspondences
establish an initial sparse alignment that is then densely optimized. At the same time, our pipeline improves
robustness to assembly errors, allowing non-technical users to calibrate multi-sensor setups. Our results showcase
the feasibility of our approach, which makes the tedious calibration process easier and less error-prone.
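The sparse-alignment step described above, estimating each sensor's pose from correspondences with the known structure geometry, amounts to solving a rigid-registration problem. A minimal sketch using the standard Kabsch/Procrustes closed-form solution (a generic technique, not necessarily the paper's exact solver; the function name is illustrative):

```python
import numpy as np

def rigid_align(src, dst):
    """Least-squares rigid transform (R, t) mapping Nx3 points `src` onto
    `dst`, i.e. dst ~= src @ R.T + t (Kabsch algorithm via SVD)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    # Cross-covariance of the centered point sets
    H = (src - cs).T @ (dst - cd)
    U, S, Vt = np.linalg.svd(H)
    # Reflection guard: force det(R) = +1
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cd - R @ cs
    return R, t
```

Once every sensor is sparsely aligned to the structure this way, the estimates can be jointly refined in a dense optimization, as the abstract describes.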
GANerated Hands for Real-time 3D Hand Tracking from Monocular RGB
We address the highly challenging problem of real-time 3D hand tracking based
on a monocular RGB-only sequence. Our tracking method combines a convolutional
neural network with a kinematic 3D hand model, such that it generalizes well to
unseen data, is robust to occlusions and varying camera viewpoints, and leads
to anatomically plausible as well as temporally smooth hand motions. For
training our CNN we propose a novel approach for the synthetic generation of
training data that is based on a geometrically consistent image-to-image
translation network. To be more specific, we use a neural network that
translates synthetic images to "real" images, such that the so-generated images
follow the same statistical distribution as real-world hand images. For
training this translation network we combine an adversarial loss and a
cycle-consistency loss with a geometric consistency loss in order to preserve
geometric properties (such as hand pose) during translation. We demonstrate
that our hand tracking system outperforms the current state-of-the-art on
challenging RGB-only footage.
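The translation network's training objective combines three terms: an adversarial loss, a cycle-consistency loss (a synthetic image translated to "real" and back should match the original), and a geometric-consistency loss that preserves hand pose across translation. A schematic of how the terms combine; the weights and function names are illustrative assumptions, not the paper's values:

```python
import numpy as np

def l1(a, b):
    """Mean absolute error, used for both consistency terms."""
    return np.mean(np.abs(a - b))

def total_loss(adv, original, cycled, pose_original, pose_translated,
               lam_cyc=10.0, lam_geo=1.0):
    """Schematic combined objective (illustrative weights):
    - adv:            adversarial loss value from the discriminator
    - original/cycled: image before and after the round-trip translation
    - pose_original/pose_translated: hand-pose descriptors that the
      geometric-consistency term forces to agree across translation
    """
    return adv + lam_cyc * l1(original, cycled) + lam_geo * l1(pose_original, pose_translated)
```

The geometric term is what distinguishes this setup from a plain CycleGAN-style objective: it penalizes translations that change the hand pose, so the generated "real" images remain valid training data for the pose-estimation CNN.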