MonoPerfCap: Human Performance Capture from Monocular Video
We present the first marker-less approach for temporally coherent 3D
performance capture of a human with general clothing from monocular video. Our
approach reconstructs articulated human skeleton motion as well as medium-scale
non-rigid surface deformations in general scenes. Human performance capture is
a challenging problem due to the large range of articulation, potentially fast
motion, and considerable non-rigid deformations, even from multi-view data.
Reconstruction from monocular video alone is drastically more challenging,
since strong occlusions and the inherent depth ambiguity lead to a highly
ill-posed reconstruction problem. We tackle these challenges by a novel
approach that employs sparse 2D and 3D human pose detections from a
convolutional neural network using a batch-based pose estimation strategy.
Joint recovery of per-batch motion allows to resolve the ambiguities of the
monocular reconstruction problem based on a low dimensional trajectory
subspace. In addition, we propose refinement of the surface geometry based on
fully automatically extracted silhouettes to enable medium-scale non-rigid
alignment. We demonstrate state-of-the-art performance capture results that
enable exciting applications such as video editing and free viewpoint video,
previously infeasible from monocular video. Our qualitative and quantitative
evaluation demonstrates that our approach significantly outperforms previous
monocular methods in terms of accuracy, robustness, and the scene complexity that can be handled.
Comment: Accepted to ACM TOG 2018, to be presented at SIGGRAPH 201
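The batch-based recovery over a low-dimensional trajectory subspace can be illustrated with a minimal sketch. A common choice for such a subspace (an assumption here; the paper's construction may differ) is the leading vectors of the discrete cosine transform basis, onto which a noisy per-batch joint trajectory is projected by least squares:

```python
import numpy as np

def dct_basis(n_frames, n_basis):
    # First n_basis columns of the orthonormal DCT-II basis over n_frames samples.
    k = np.arange(n_frames)
    B = np.cos(np.pi * (k[:, None] + 0.5) * np.arange(n_basis)[None, :] / n_frames)
    B[:, 0] *= 1.0 / np.sqrt(2.0)
    return B * np.sqrt(2.0 / n_frames)

def project_to_subspace(traj, n_basis=5):
    # traj: (n_frames, 3) noisy 3D positions of one joint over a batch of frames.
    # Least-squares fit in the low-dimensional trajectory subspace yields a
    # temporally coherent trajectory.
    B = dct_basis(traj.shape[0], n_basis)
    coeffs, *_ = np.linalg.lstsq(B, traj, rcond=None)
    return B @ coeffs
```

Restricting the trajectory to a few smooth basis vectors is what lets joint recovery over a batch disambiguate depth where a per-frame estimate could not.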
A Bayesian approach to simultaneously recover camera pose and non-rigid shape from monocular images
This manuscript version is made available under the CC-BY-NC-ND 4.0 license: http://creativecommons.org/licenses/by-nc-nd/4.0/

In this paper we bring the tools of the Simultaneous Localization and Map Building (SLAM) problem from a rigid to a deformable domain and use them to simultaneously recover the 3D shape of non-rigid surfaces and the sequence of poses of a moving camera. Under the assumption that the surface shape may be represented as a weighted sum of deformation modes, we show that the problem of estimating the modal weights along with the camera poses can be probabilistically formulated as a maximum a posteriori estimate and solved using an iterative least-squares optimization. In addition, the probabilistic formulation we propose is very general and allows introducing different constraints without requiring any extra complexity. As a proof of concept, we show that local inextensibility constraints that prevent the surface from stretching can be easily integrated.
An extensive evaluation on synthetic and real data demonstrates that our method has several advantages over current non-rigid shape-from-motion approaches. In particular, we show that our solution is robust to large amounts of noise and outliers, and that it does not need to track points over the whole sequence nor to use an initialization close to the ground truth.
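The low-rank shape model at the heart of this formulation can be sketched as follows; the function names are illustrative, and the paper embeds the weight estimation in a MAP problem jointly with camera poses rather than this bare least-squares step:

```python
import numpy as np

def fit_modal_weights(observed, mean_shape, modes):
    # observed, mean_shape: (n_points, 3); modes: (n_modes, n_points, 3).
    # Solve min_w || mean_shape + sum_i w_i * modes[i] - observed ||^2
    # as an ordinary linear least-squares problem.
    n_modes = modes.shape[0]
    A = modes.reshape(n_modes, -1).T          # (3 * n_points, n_modes)
    b = (observed - mean_shape).ravel()
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w

def reconstruct(mean_shape, modes, w):
    # Shape as a weighted sum of deformation modes around the mean.
    return mean_shape + np.tensordot(w, modes, axes=1)
```

Because the unknowns enter linearly, extra priors or constraints (such as inextensibility) can be appended as additional rows of the same system, which is why the probabilistic formulation adds little complexity.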
Geometry-Aware Network for Non-Rigid Shape Prediction from a Single View
We propose a method for predicting the 3D shape of a deformable surface from
a single view. By contrast with previous approaches, we do not need a
pre-registered template of the surface, and our method is robust to the lack of
texture and partial occlusions. At the core of our approach is a {\it
geometry-aware} deep architecture that tackles the problem as usually done in
analytic solutions: first perform 2D detection of the mesh and then estimate a
3D shape that is geometrically consistent with the image. We train this
architecture in an end-to-end manner using a large dataset of synthetic
renderings of shapes under different levels of deformation, material
properties, textures and lighting conditions. We evaluate our approach on a
test split of this dataset and available real benchmarks, consistently
improving on state-of-the-art solutions with a significantly lower computational time.
Comment: Accepted at CVPR 201
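The geometric-consistency idea behind the second stage can be sketched with a plain pinhole camera model (an illustration, not the paper's network): 3D points obtained by back-projecting detected 2D mesh vertices with estimated depths reproject exactly onto the detections.

```python
import numpy as np

def backproject(uv, depth, K):
    # uv: (n, 2) detected pixel locations; depth: (n,); K: (3, 3) intrinsics.
    # Lift each detection along its camera ray to the estimated depth.
    uv1 = np.concatenate([uv, np.ones((uv.shape[0], 1))], axis=1)
    rays = (np.linalg.inv(K) @ uv1.T).T       # normalized camera rays
    return rays * depth[:, None]              # 3D points in the camera frame

def reproject(X, K):
    # Perspective projection of 3D points back to pixel coordinates.
    x = (K @ X.T).T
    return x[:, :2] / x[:, 2:3]
```

By construction, any depths fed to `backproject` produce a 3D shape consistent with the 2D detections, so the network only has to resolve the remaining depth ambiguity.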
General Dynamic Scene Reconstruction from Multiple View Video
This paper introduces a general approach to dynamic scene reconstruction from
multiple moving cameras without prior knowledge or limiting constraints on the
scene structure, appearance, or illumination. Existing techniques for dynamic
scene reconstruction from multiple wide-baseline camera views primarily focus
on accurate reconstruction in controlled environments, where the cameras are
fixed and calibrated and the background is known. These approaches are not robust
for general dynamic scenes captured with sparse moving cameras. Previous
approaches for outdoor dynamic scene reconstruction assume prior knowledge of
the static background appearance and structure. The primary contributions of
this paper are twofold: an automatic method for initial coarse dynamic scene
segmentation and reconstruction without prior knowledge of background
appearance or structure; and a general robust approach for joint segmentation
refinement and dense reconstruction of dynamic scenes from multiple
wide-baseline static or moving cameras. Evaluation is performed on a variety of
indoor and outdoor scenes with cluttered backgrounds and multiple dynamic
non-rigid objects such as people. Comparison with state-of-the-art approaches
demonstrates improved accuracy in both multiple view segmentation and dense
reconstruction. The proposed approach also eliminates the requirement for prior
knowledge of scene structure and appearance.
HandVoxNet: Deep Voxel-Based Network for 3D Hand Shape and Pose Estimation from a Single Depth Map
3D hand shape and pose estimation from a single depth map is a new and
challenging computer vision problem with many applications. The
state-of-the-art methods directly regress 3D hand meshes from 2D depth images
via 2D convolutional neural networks, which leads to artefacts in the
estimations due to perspective distortions in the images. In contrast, we
propose a novel architecture with 3D convolutions trained in a
weakly-supervised manner. The input to our method is a 3D voxelized depth map,
and we rely on two hand shape representations. The first one is the 3D
voxelized grid of the shape which is accurate but does not preserve the mesh
topology and the number of mesh vertices. The second representation is the 3D
hand surface which is less accurate but does not suffer from the limitations of
the first representation. We combine the advantages of these two
representations by registering the hand surface to the voxelized hand shape. In
the extensive experiments, the proposed approach improves over the state of the
art by 47.8% on the SynHand5M dataset. Moreover, our augmentation policy for
voxelized depth maps further enhances the accuracy of 3D hand pose estimation
on real data. Our method produces visually more reasonable and realistic hand
shapes on NYU and BigHand2.2M datasets compared to the existing approaches.Comment: 10 pages, 8 figures, 5 tables, CVP
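A minimal sketch of the voxelized input representation, assuming a simple pinhole depth camera (the grid resolution and normalization below are illustrative choices, not the paper's):

```python
import numpy as np

def voxelize_depth(depth, fx, fy, cx, cy, grid=32):
    # Back-project valid depth pixels to 3D points with a pinhole model,
    # then bin them into a grid x grid x grid binary occupancy volume.
    v, u = np.nonzero(depth > 0)
    z = depth[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=1)
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    idx = ((pts - lo) / np.maximum(hi - lo, 1e-9) * (grid - 1)).astype(int)
    vol = np.zeros((grid, grid, grid), dtype=np.float32)
    vol[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return vol
```

Feeding such a volume to 3D convolutions avoids the perspective distortion that 2D convolutions over the raw depth image suffer from.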
Orthogonality constrained gradient reconstruction for superconvergent linear functionals
The post-processing of the solution of variational problems discretized with Galerkin finite element methods is particularly useful for the computation of quantities of interest. Such quantities are generally expressed as linear functionals of the solution, and the error of their approximation is bounded by the error of the solution itself. Several a posteriori recovery procedures have been developed over the years to improve the accuracy of post-processed results. Nonetheless, such recovery methods usually deteriorate the convergence properties of linear functionals of the solution and, as a consequence, of the quantities of interest as well. The paper develops an enhanced gradient recovery scheme able both to preserve the good qualities of the recovered gradient and to increase the accuracy and convergence rates of linear functionals of the solution.
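The basic patch-recovery step that such schemes build on can be sketched in one dimension (a hedged illustration; the paper's orthogonality-constrained variant adds extra conditions to this fit):

```python
import numpy as np

def recover_gradient(sample_x, sample_grad, node_x, degree=2):
    # Least-squares polynomial fit of gradient samples taken at
    # superconvergent points of a patch, evaluated at the patch node.
    V = np.vander(sample_x, degree + 1)       # Vandermonde basis, highest power first
    coeffs, *_ = np.linalg.lstsq(V, sample_grad, rcond=None)
    return np.polyval(coeffs, node_x)
```

Because the fit smooths out the discontinuous element-wise gradient, the recovered value converges faster than the raw finite element gradient; the paper's contribution is to do this without spoiling the convergence of linear functionals of the solution.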
Superconvergent patch recovery with constraints for three-dimensional contact problems within the Cartesian grid Finite Element Method
"This is the peer reviewed version of the following article: Navarro-Jiménez, José M., Héctor Navarro-García, Manuel Tur, and Juan J. Ródenas. 2019. Superconvergent Patch Recovery with Constraints for Three-dimensional Contact Problems within the Cartesian Grid Finite Element Method. International Journal for Numerical Methods in Engineering 121(6): 1297-1313. Wiley. doi:10.1002/nme.6266, which has been published in final form at https://doi.org/10.1002/nme.6266. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Self-Archiving."

The superconvergent patch recovery technique with constraints (SPR-C) consists of improving the accuracy of the recovered stresses obtained with the original SPR technique by considering known information about the exact solution, such as the internal equilibrium equation, the compatibility equation, or the Neumann boundary conditions, during the recovery process. In this paper the SPR-C is extended to consider the equilibrium around the contact area when solving contact problems with the Cartesian grid Finite Element Method. In the proposed method, the Finite Element stress fields of both bodies in contact are considered during the recovery process and equilibrium is enforced by means of the continuity of tractions along the contact surface.

The authors would like to thank Generalitat Valenciana (PROMETEO/2016/007), the Spanish Ministerio de Economía, Industria y Competitividad (DPI2017-89816-R), the Spanish Ministerio de Ciencia, Innovación y Universidades (FPU17/03993), and Universitat Politècnica de València (FPI2015) for the financial support of this work.
Learning to Reconstruct People in Clothing from a Single RGB Camera
We present a learning-based model to infer the personalized 3D shape of people from a few frames (1-8) of a monocular video in which the person is moving, in less than 10 seconds and with a reconstruction accuracy of 5 mm. Our model learns to predict the parameters of a statistical body model and instance displacements that add clothing and hair to the shape. The model achieves fast and accurate predictions based on two key design choices. First, by predicting shape in a canonical T-pose space, the network learns to encode the images of the person into pose-invariant latent codes, where the information is fused. Second, based on the observation that feed-forward predictions are fast but do not always align with the input images, we predict using both bottom-up and top-down streams (one per view), allowing information to flow in both directions. Learning relies only on synthetic 3D data. Once learned, the model can take a variable number of frames as input, and is able to reconstruct shapes even from a single image with an accuracy of 6 mm. Results on three different datasets demonstrate the efficacy and accuracy of our approach.
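The shape parameterization described above (statistical body parameters plus per-vertex displacements in a canonical T-pose space) can be sketched as follows. All names are illustrative, and fusing final shapes by averaging is a simplification of the paper's fusion of pose-invariant latent codes inside the network:

```python
import numpy as np

def personalized_shape(template, shape_basis, betas, displacements):
    # template: (n_verts, 3) mean body mesh in the canonical T-pose.
    # shape_basis: (n_betas, n_verts, 3) statistical shape directions.
    # betas: (n_betas,) body shape coefficients predicted by the network.
    # displacements: (n_verts, 3) per-vertex offsets adding clothing and hair.
    return template + np.tensordot(betas, shape_basis, axes=1) + displacements

def fuse_frames(per_frame_shapes):
    # Because every frame is predicted in the same canonical T-pose space,
    # per-frame estimates can be combined by simple averaging.
    return np.mean(np.stack(per_frame_shapes, axis=0), axis=0)
```

Predicting in a pose-invariant canonical space is what makes information from a variable number of frames straightforward to fuse.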