Visibility Constrained Generative Model for Depth-based 3D Facial Pose Tracking
In this paper, we propose a generative framework that unifies depth-based 3D
facial pose tracking and face model adaptation on-the-fly, in the unconstrained
scenarios with heavy occlusions and arbitrary facial expression variations.
Specifically, we introduce a statistical 3D morphable model that flexibly
describes the distribution of points on the surface of the face model, with an
efficient switchable online adaptation that gradually captures the identity of
the tracked subject and rapidly constructs a suitable face model when the
subject changes. Moreover, unlike prior art that employed ICP-based facial pose
estimation, to improve robustness to occlusions, we propose a ray visibility
constraint that regularizes the pose based on the face model's visibility with
respect to the input point cloud. Ablation studies and experimental results on
Biwi and ICT-3DHP datasets demonstrate that the proposed framework is effective
and outperforms competing state-of-the-art depth-based methods.
LiveCap: Real-time Human Performance Capture from Monocular Video
We present the first real-time human performance capture approach that
reconstructs dense, space-time coherent deforming geometry of entire humans in
general everyday clothing from just a single RGB video. We propose a novel
two-stage analysis-by-synthesis optimization whose formulation and
implementation are designed for high performance. In the first stage, a skinned
template model is jointly fitted to background subtracted input video, 2D and
3D skeleton joint positions found using a deep neural network, and a set of
sparse facial landmark detections. In the second stage, dense non-rigid 3D
deformations of skin and even loose apparel are captured based on a novel
real-time capable algorithm for non-rigid tracking using dense photometric and
silhouette constraints. Our novel energy formulation leverages automatically
identified material regions on the template to model the differing non-rigid
deformation behavior of skin and apparel. The two resulting non-linear
optimization problems per-frame are solved with specially-tailored
data-parallel Gauss-Newton solvers. In order to achieve real-time performance
of over 25Hz, we design a pipelined parallel architecture using the CPU and two
commodity GPUs. Our method is the first real-time monocular approach for
full-body performance capture. It yields accuracy comparable to
off-line performance capture techniques while being orders of magnitude
faster.
Sparse Inertial Poser: Automatic 3D Human Pose Estimation from Sparse IMUs
We address the problem of making human motion capture in the wild more
practical by using a small set of inertial sensors attached to the body. Since
the problem is heavily under-constrained, previous methods either use a large
number of sensors, which is intrusive, or they require additional video input.
We take a different approach and constrain the problem by: (i) making use of a
realistic statistical body model that includes anthropometric constraints and
(ii) using a joint optimization framework to fit the model to orientation and
acceleration measurements over multiple frames. The resulting tracker Sparse
Inertial Poser (SIP) enables 3D human pose estimation using only 6 sensors
(attached to the wrists, lower legs, back and head) and works for arbitrary
human motions. Experiments on the recently released TNT15 dataset show that,
using the same number of sensors, SIP achieves higher accuracy than the dataset
baseline without using any video data. We further demonstrate the effectiveness
of SIP on newly recorded challenging motions in outdoor scenarios such as
climbing or jumping over a wall. Comment: 12 pages, Accepted at Eurographics 201
Multi-Scale 3D Scene Flow from Binocular Stereo Sequences
Scene flow methods estimate the three-dimensional motion field for points in the world, using multi-camera video data. Such methods combine multi-view reconstruction with motion estimation. This paper describes an alternative formulation for dense scene flow estimation that provides reliable results using only two cameras by fusing stereo and optical flow estimation into a single coherent framework. Internally, the proposed algorithm generates probability distributions for optical flow and disparity. Taking into account the uncertainty in the intermediate stages allows for more reliable estimation of the 3D scene flow than previous methods allow. To handle the aperture problems inherent in the estimation of optical flow and disparity, a multi-scale method along with a novel region-based technique is used within a regularized solution. This combined approach both preserves discontinuities and prevents over-regularization – two problems commonly associated with the basic multi-scale approaches. Experiments with synthetic and real test data demonstrate the strength of the proposed approach. National Science Foundation (CNS-0202067, IIS-0208876); Office of Naval Research (N00014-03-1-0108
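To make the idea of fusing stereo and optical flow concrete, here is a minimal sketch of how a per-pixel disparity and a 2D flow vector can be combined into a 3D motion vector. It assumes a rectified pinhole stereo rig and deterministic point estimates; the paper's probability distributions, multi-scale scheme, and regularization are not modelled. The function names and the parameters `focal`, `baseline`, `cx`, `cy` are hypothetical and chosen for illustration only.

```python
def backproject(x, y, disparity, focal, baseline, cx, cy):
    """Back-project pixel (x, y) with a stereo disparity to a 3D point.

    Assumes a rectified stereo pair: focal length in pixels, baseline in
    metres, principal point (cx, cy) in pixels.
    """
    z = focal * baseline / disparity      # depth from disparity
    x3d = (x - cx) * z / focal            # lateral position
    y3d = (y - cy) * z / focal            # vertical position
    return (x3d, y3d, z)


def scene_flow(x, y, d0, u, v, d1, focal, baseline, cx, cy):
    """3D motion of the point seen at pixel (x, y) in the first frame.

    d0 is the disparity at (x, y) at time t, (u, v) is the optical flow
    at that pixel, and d1 is the disparity at the displaced pixel
    (x + u, y + v) at time t + 1.
    """
    p0 = backproject(x, y, d0, focal, baseline, cx, cy)
    p1 = backproject(x + u, y + v, d1, focal, baseline, cx, cy)
    return tuple(b - a for a, b in zip(p0, p1))


# A point at the image centre whose disparity grows from 10 to 12.5 pixels
# moves about one metre towards the camera (depth 5 m to 4 m).
motion = scene_flow(320.0, 240.0, 10.0, 0.0, 0.0, 12.5,
                    focal=500.0, baseline=0.1, cx=320.0, cy=240.0)
```

In this simplified form, errors in disparity and flow propagate directly into the 3D motion, which is one motivation for carrying uncertainty through the intermediate stages as the abstract describes.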
Better Feature Tracking Through Subspace Constraints
Feature tracking in video is a crucial task in computer vision. Usually, the
tracking problem is handled one feature at a time, using a single-feature
tracker like the Kanade-Lucas-Tomasi algorithm, or one of its derivatives.
While this approach works quite well when dealing with high-quality video and
"strong" features, it often falters when faced with dark and noisy video
containing low-quality features. We present a framework for jointly tracking a
set of features, which enables sharing information between the different
features in the scene. We show that our method can be employed to track
features for both rigid and nonrigid motions (possibly of a few moving bodies)
even when some features are occluded. Furthermore, it can be used to
significantly improve tracking results in poorly-lit scenes (where there is a
mix of good and bad features). Our approach does not require direct modeling of
the structure or the motion of the scene, and runs in real time on a single CPU
core. Comment: 8 pages, 2 figures. CVPR 201
A Comprehensive Performance Evaluation of Deformable Face Tracking "In-the-Wild"
Recently, technologies such as face detection, facial landmark localisation
and face recognition and verification have matured enough to provide effective
and efficient solutions for imagery captured under arbitrary conditions
(referred to as "in-the-wild"). This is partially attributed to the fact that
comprehensive "in-the-wild" benchmarks have been developed for face detection,
landmark localisation and recognition/verification. A very important technology
that has not been thoroughly evaluated yet is deformable face tracking
"in-the-wild". Until now, the performance has mainly been assessed
qualitatively by visually assessing the result of a deformable face tracking
technology on short videos. In this paper, we perform the first, to the best of
our knowledge, thorough evaluation of state-of-the-art deformable face tracking
pipelines using the recently introduced 300VW benchmark. We evaluate many
different architectures focusing mainly on the task of on-line deformable face
tracking. In particular, we compare the following general strategies: (a)
generic face detection plus generic facial landmark localisation, (b) generic
model free tracking plus generic facial landmark localisation, as well as (c)
hybrid approaches using state-of-the-art face detection, model free tracking
and facial landmark localisation technologies. Our evaluation reveals future
avenues for further research on the topic. Comment: E. Antonakos and P. Snape contributed equally and have joint second
authorshi
Constrained Statistical Modelling of Knee Flexion from Multi-Pose Magnetic Resonance Imaging
© 1982-2012 IEEE. Reconstruction of the anterior cruciate ligament (ACL) through arthroscopy is one of the most common procedures in orthopaedics. It requires accurate alignment and drilling of the tibial and femoral tunnels through which the ligament graft is attached. Although commercial computer-assisted navigation systems exist to guide the placement of these tunnels, most of them are limited to a fixed pose without due consideration of dynamic factors involved in different knee flexion angles. This paper presents a new model for intraoperative guidance of arthroscopic ACL reconstruction with reduced error particularly in the ligament attachment area. The method uses 3D preoperative data at different flexion angles to build a subject-specific statistical model of knee pose. To circumvent the problem of limited training samples and ensure physically meaningful pose instantiation, homogeneous transformations between different poses and local-deformation finite element modelling are used to enlarge the training set. Subsequently, an anatomical geodesic flexion analysis is performed to extract the subject-specific flexion characteristics. The advantages of the method were also tested by detailed comparison to standard Principal Component Analysis (PCA), nonlinear PCA without training set enlargement, and other state-of-the-art articulated joint modelling methods. The method yielded sub-millimetre accuracy, demonstrating its potential clinical value.
Video Interpolation using Optical Flow and Laplacian Smoothness
Non-rigid video interpolation is a common computer vision task. In this paper
we present an optical flow approach which adopts a Laplacian Cotangent Mesh
constraint to enhance the local smoothness. Similar to Li et al., our approach
fits a mesh to the image with a resolution of up to one vertex per pixel and
uses angle constraints to ensure sensible local deformations between image
pairs. The Laplacian Mesh constraints are expressed wholly inside the optical
flow optimization, and can be applied in a straightforward manner to a wide
range of image tracking and registration problems. We evaluate our approach by
testing on several benchmark datasets, including the Middlebury and Garg et al.
datasets. In addition, we show application of our method for constructing 3D
Morphable Facial Models from dynamic 3D data.
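As a rough illustration of a Laplacian smoothness term on a flow field, the sketch below evaluates a uniform 4-neighbour Laplacian penalty on a pixel grid. This is a deliberately simpler stand-in and not the cotangent-weighted mesh Laplacian with angle constraints that the abstract describes; the function name and the data layout (rows of `(u, v)` tuples) are assumptions made for the example.

```python
def laplacian_smoothness(flow):
    """Sum of squared uniform Laplacians of a 2D flow field.

    flow is a list of rows, each row a list of (u, v) tuples. Only
    interior pixels are evaluated, so border pixels contribute nothing.
    """
    height = len(flow)
    width = len(flow[0])
    energy = 0.0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            for c in range(2):  # c = 0 for u, c = 1 for v
                neighbours = (flow[y - 1][x][c] + flow[y + 1][x][c]
                              + flow[y][x - 1][c] + flow[y][x + 1][c])
                lap = neighbours - 4.0 * flow[y][x][c]
                energy += lap * lap
    return energy


# A constant flow field is perfectly smooth under this penalty.
constant = [[(1.0, 2.0)] * 3 for _ in range(3)]
smooth_energy = laplacian_smoothness(constant)

# A single outlier vector in the centre is penalised.
spike = [[(0.0, 0.0)] * 3 for _ in range(3)]
spike[1][1] = (1.0, 0.0)
spike_energy = laplacian_smoothness(spike)
```

In an optical flow objective, a term of this kind would typically be weighted and added to the data term, so that isolated outlier vectors are pulled towards their neighbours.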