2,798 research outputs found
LiveCap: Real-time Human Performance Capture from Monocular Video
We present the first real-time human performance capture approach that
reconstructs dense, space-time coherent deforming geometry of entire humans in
general everyday clothing from just a single RGB video. We propose a novel
two-stage analysis-by-synthesis optimization whose formulation and
implementation are designed for high performance. In the first stage, a skinned
template model is jointly fitted to background subtracted input video, 2D and
3D skeleton joint positions found using a deep neural network, and a set of
sparse facial landmark detections. In the second stage, dense non-rigid 3D
deformations of skin and even loose apparel are captured based on a novel
real-time capable algorithm for non-rigid tracking using dense photometric and
silhouette constraints. Our novel energy formulation leverages automatically
identified material regions on the template to model the differing non-rigid
deformation behavior of skin and apparel. The two resulting non-linear
optimization problems per-frame are solved with specially-tailored
data-parallel Gauss-Newton solvers. In order to achieve real-time performance
of over 25Hz, we design a pipelined parallel architecture using the CPU and two
commodity GPUs. Our method is the first real-time monocular approach for
full-body performance capture. Our method yields comparable accuracy with
off-line performance capture techniques, while being orders of magnitude
faster
Statistical Model of Shape Moments with Active Contour Evolution for Shape Detection and Segmentation
This paper describes a novel method for shape representation and robust image segmentation. The proposed method combines two well known methodologies, namely, statistical shape models and active contours implemented in level set framework. The shape detection is achieved by maximizing a posterior function that consists of a prior shape probability model and image likelihood function conditioned on shapes. The statistical shape model is built as a result of a learning process based on nonparametric probability estimation in a PCA reduced feature space formed by the Legendre moments of training silhouette images. A greedy strategy is applied to optimize the proposed cost function by iteratively evolving an implicit active contour in the image space and subsequent constrained optimization of the evolved shape in the reduced shape feature space. Experimental results presented in the paper demonstrate that the proposed method, contrary to many other active contour segmentation methods, is highly resilient to severe random and structural noise that could be present in the data
Review of Person Re-identification Techniques
Person re-identification across different surveillance cameras with disjoint
fields of view has become one of the most interesting and challenging subjects
in the area of intelligent video surveillance. Although several methods have
been developed and proposed, certain limitations and unresolved issues remain.
In all of the existing re-identification approaches, feature vectors are
extracted from segmented still images or video frames. Different similarity or
dissimilarity measures have been applied to these vectors. Some methods have
used simple constant metrics, whereas others have utilised models to obtain
optimised metrics. Some have created models based on local colour or texture
information, and others have built models based on the gait of people. In
general, the main objective of all these approaches is to achieve a
higher-accuracy rate and lowercomputational costs. This study summarises
several developments in recent literature and discusses the various available
methods used in person re-identification. Specifically, their advantages and
disadvantages are mentioned and compared.Comment: Published 201
A survey on 2d object tracking in digital video
This paper presents object tracking methods in video.Different algorithms based on rigid, non rigid and articulated object tracking are studied. The goal of this article is to review the state-of-the-art tracking methods, classify them
into different categories, and identify new trends.It is often the case that tracking objects in consecutive frames is supported by a prediction scheme. Based on information extracted from previous frames and any high level information that can be obtained, the state (location) of the
object is predicted.An excellent framework for prediction is kalman filter, which additionally estimates prediction error.In complex scenes, instead of single hypothesis, multiple hypotheses using Particle filter can be used.Different
techniques are given for different types of constraints in video
Monocular Pose Estimation Based on Global and Local Features
The presented thesis work deals with several mathematical and practical aspects of the monocular pose estimation problem. Pose estimation means to estimate the position and orientation of a model object with respect to a camera used as a sensor element. Three main aspects of the pose estimation problem are considered. These are the model representations, correspondence search and pose computation. Free-form contours and surfaces are considered for the approaches presented in this work. The pose estimation problem and the global representation of free-form contours and surfaces are defined in the mathematical framework of the conformal geometric algebra (CGA), which allows a compact and linear modeling of the monocular pose estimation scenario. Additionally, a new local representation of these entities is presented which is also defined in CGA. Furthermore, it allows the extraction of local feature information of these models in 3D space and in the image plane. This local information is combined with the global contour information obtained from the global representations in order to improve the pose estimation algorithms. The main contribution of this work is the introduction of new variants of the iterative closest point (ICP) algorithm based on the combination of local and global features. Sets of compatible model and image features are obtained from the proposed local model representation of free-form contours. This allows to translate the correspondence search problem onto the image plane and to use the feature information to develop new correspondence search criteria. The structural ICP algorithm is defined as a variant of the classical ICP algorithm with additional model and image structural constraints. Initially, this new variant is applied to planar 3D free-form contours. Then, the feature extraction process is adapted to the case of free-form surfaces. This allows to define the correlation ICP algorithm for free-form surfaces. In this case, the minimal Euclidean distance criterion is replaced by a feature correlation measure. The addition of structural information in the search process results in better conditioned correspondences and therefore in a better computed pose. Furthermore, global information (position and orientation) is used in combination with the correlation ICP to simplify and improve the pre-alignment approaches for the monocular pose estimation. Finally, all the presented approaches are combined to handle the pose estimation of surfaces when partial occlusions are present in the image. Experiments made on synthetic and real data are presented to demonstrate the robustness and behavior of the new ICP variants in comparison with standard approaches
MonoPerfCap: Human Performance Capture from Monocular Video
We present the first marker-less approach for temporally coherent 3D
performance capture of a human with general clothing from monocular video. Our
approach reconstructs articulated human skeleton motion as well as medium-scale
non-rigid surface deformations in general scenes. Human performance capture is
a challenging problem due to the large range of articulation, potentially fast
motion, and considerable non-rigid deformations, even from multi-view data.
Reconstruction from monocular video alone is drastically more challenging,
since strong occlusions and the inherent depth ambiguity lead to a highly
ill-posed reconstruction problem. We tackle these challenges by a novel
approach that employs sparse 2D and 3D human pose detections from a
convolutional neural network using a batch-based pose estimation strategy.
Joint recovery of per-batch motion allows to resolve the ambiguities of the
monocular reconstruction problem based on a low dimensional trajectory
subspace. In addition, we propose refinement of the surface geometry based on
fully automatically extracted silhouettes to enable medium-scale non-rigid
alignment. We demonstrate state-of-the-art performance capture results that
enable exciting applications such as video editing and free viewpoint video,
previously infeasible from monocular video. Our qualitative and quantitative
evaluation demonstrates that our approach significantly outperforms previous
monocular methods in terms of accuracy, robustness and scene complexity that
can be handled.Comment: Accepted to ACM TOG 2018, to be presented on SIGGRAPH 201
- …