69 research outputs found
Global alignment of deformable objects captured by a single RGB-D camera
We present a novel global registration method for deformable objects captured using a single RGB-D camera. Our algorithm allows objects to undergo large non-rigid deformations, and achieves high quality results without constraining the actor's pose or camera motion. We compute the deformations of all the scans simultaneously by optimizing a global alignment problem to avoid the well-known loop closure problem, and use an as-rigid-as-possible constraint to eliminate the shrinkage problem of the deformed model. To attack large scale problems, we design a coarse-to-fine multi-resolution scheme, which also avoids the optimization being trapped into local minima. The proposed method is evaluated on public datasets and real datasets captured by an RGB-D sensor. Experimental results demonstrate that the proposed method obtains better results than the state-of-the-art methods
A survey on human performance capture and animation
With the rapid development of computing technology, three-dimensional (3D) human body
models and their dynamic motions are widely used in the digital entertainment industry. Human perfor-
mance mainly involves human body shapes and motions. Key research problems include how to capture
and analyze static geometric appearance and dynamic movement of human bodies, and how to simulate
human body motions with physical e�ects. In this survey, according to main research directions of human body performance capture and animation, we summarize recent advances in key research topics, namely
human body surface reconstruction, motion capture and synthesis, as well as physics-based motion sim-
ulation, and further discuss future research problems and directions. We hope this will be helpful for
readers to have a comprehensive understanding of human performance capture and animatio
MonoPerfCap: Human Performance Capture from Monocular Video
We present the first marker-less approach for temporally coherent 3D
performance capture of a human with general clothing from monocular video. Our
approach reconstructs articulated human skeleton motion as well as medium-scale
non-rigid surface deformations in general scenes. Human performance capture is
a challenging problem due to the large range of articulation, potentially fast
motion, and considerable non-rigid deformations, even from multi-view data.
Reconstruction from monocular video alone is drastically more challenging,
since strong occlusions and the inherent depth ambiguity lead to a highly
ill-posed reconstruction problem. We tackle these challenges by a novel
approach that employs sparse 2D and 3D human pose detections from a
convolutional neural network using a batch-based pose estimation strategy.
Joint recovery of per-batch motion allows to resolve the ambiguities of the
monocular reconstruction problem based on a low dimensional trajectory
subspace. In addition, we propose refinement of the surface geometry based on
fully automatically extracted silhouettes to enable medium-scale non-rigid
alignment. We demonstrate state-of-the-art performance capture results that
enable exciting applications such as video editing and free viewpoint video,
previously infeasible from monocular video. Our qualitative and quantitative
evaluation demonstrates that our approach significantly outperforms previous
monocular methods in terms of accuracy, robustness and scene complexity that
can be handled.Comment: Accepted to ACM TOG 2018, to be presented on SIGGRAPH 201
LiveCap: Real-time Human Performance Capture from Monocular Video
We present the first real-time human performance capture approach that
reconstructs dense, space-time coherent deforming geometry of entire humans in
general everyday clothing from just a single RGB video. We propose a novel
two-stage analysis-by-synthesis optimization whose formulation and
implementation are designed for high performance. In the first stage, a skinned
template model is jointly fitted to background subtracted input video, 2D and
3D skeleton joint positions found using a deep neural network, and a set of
sparse facial landmark detections. In the second stage, dense non-rigid 3D
deformations of skin and even loose apparel are captured based on a novel
real-time capable algorithm for non-rigid tracking using dense photometric and
silhouette constraints. Our novel energy formulation leverages automatically
identified material regions on the template to model the differing non-rigid
deformation behavior of skin and apparel. The two resulting non-linear
optimization problems per-frame are solved with specially-tailored
data-parallel Gauss-Newton solvers. In order to achieve real-time performance
of over 25Hz, we design a pipelined parallel architecture using the CPU and two
commodity GPUs. Our method is the first real-time monocular approach for
full-body performance capture. Our method yields comparable accuracy with
off-line performance capture techniques, while being orders of magnitude
faster
A novel joint points and silhouette-based method to estimate 3D human pose and shape
This paper presents a novel method for 3D human pose and shape estimation
from images with sparse views, using joint points and silhouettes, based on a
parametric model. Firstly, the parametric model is fitted to the joint points
estimated by deep learning-based human pose estimation. Then, we extract the
correspondence between the parametric model of pose fitting and silhouettes on
2D and 3D space. A novel energy function based on the correspondence is built
and minimized to fit parametric model to the silhouettes. Our approach uses
sufficient shape information because the energy function of silhouettes is
built from both 2D and 3D space. This also means that our method only needs
images from sparse views, which balances data used and the required prior
information. Results on synthetic data and real data demonstrate the
competitive performance of our approach on pose and shape estimation of the
human body.Comment: Accepted to ICPR 2020 3DHU worksho
AFFECT-PRESERVING VISUAL PRIVACY PROTECTION
The prevalence of wireless networks and the convenience of mobile cameras enable many new video applications other than security and entertainment. From behavioral diagnosis to wellness monitoring, cameras are increasing used for observations in various educational and medical settings. Videos collected for such applications are considered protected health information under privacy laws in many countries. Visual privacy protection techniques, such as blurring or object removal, can be used to mitigate privacy concern, but they also obliterate important visual cues of affect and social behaviors that are crucial for the target applications. In this dissertation, we propose to balance the privacy protection and the utility of the data by preserving the privacy-insensitive information, such as pose and expression, which is useful in many applications involving visual understanding.
The Intellectual Merits of the dissertation include a novel framework for visual privacy protection by manipulating facial image and body shape of individuals, which: (1) is able to conceal the identity of individuals; (2) provide a way to preserve the utility of the data, such as expression and pose information; (3) balance the utility of the data and capacity of the privacy protection.
The Broader Impacts of the dissertation focus on the significance of privacy protection on visual data, and the inadequacy of current privacy enhancing technologies in preserving affect and behavioral attributes of the visual content, which are highly useful for behavior observation in educational and medical settings. This work in this dissertation represents one of the first attempts in achieving both goals simultaneously
Global 3D non-rigid registration of deformable objects using a single RGB-D camera
We present a novel global non-rigid registration method for dynamic 3D objects. Our method allows objects to undergo large non-rigid deformations, and achieves high quality results even with substantial pose change or camera motion between views. In addition, our method does not require a template prior and uses less raw data than tracking based methods since only a sparse set of scans is needed. We compute the deformations of all the scans simultaneously by optimizing a global alignment problem to avoid the well-known loop closure problem, and use an as-rigid-as-possible constraint to eliminate the shrinkage problem of the deformed shapes, especially near open boundaries of scans. To cope with large-scale problems, we design a coarse-to-fine multi-resolution scheme, which also avoids the optimization being trapped into local minima. The proposed method is evaluated on public datasets and real datasets captured by an RGB-D sensor. Experimental results demonstrate that the proposed method obtains better results than several state-of-the-art methods
- …