22 research outputs found

    Template-based Monocular 3D Recovery of Elastic Shapes using Lagrangian Multipliers

    We present an efficient template-based method for the 3D recovery of elastic shapes from a fixed monocular camera. Exploiting the object's elasticity, in contrast to isometric methods that rely on inextensibility constraints, allows a large range of deformations to be handled. Our method is expressed as a saddle point problem using Lagrangian multipliers, resulting in a linear system that unifies mechanical and optical constraints and integrates Dirichlet boundary conditions, whether they are fixed or free. We show experimentally that no prior knowledge of material properties is needed, which demonstrates the generic usability of our method with elastic and inelastic objects made of different kinds of materials. Comparisons with existing techniques are conducted on synthetic and real elastic objects with strains ranging from 25% to 130%, resulting in low errors.

    Learning Dense 3D Models from Monocular Video

    Reconstructing the dense, detailed 3D shape of dynamic scenes from monocular sequences is a challenging problem in computer vision. While robust and even real-time solutions exist when the observed scene is static, current systems for dense non-rigid shape capture are typically restricted to using complex multi-camera rigs, taking advantage of the additional depth channel available in RGB-D cameras, or dealing with specific shapes such as faces or planar surfaces. In this thesis, we present two pieces of work for reconstructing dense generic shapes from monocular sequences. In the first work, we propose an unsupervised approach to the challenging problem of simultaneously segmenting the scene into its constituent objects and reconstructing a 3D model of the scene. The strength of our approach comes from its ability to deal with real-world dynamic scenes and to seamlessly handle different types of motion: rigid, articulated and non-rigid. We formulate the problem as a hierarchical graph-cuts-based segmentation in which we decompose the whole scene into background and foreground objects and model the complex motion of non-rigid or articulated objects as a set of overlapping rigid parts. To validate the capability of our approach to deal with real-world scenes, we provide 3D reconstructions of challenging videos from datasets including YouTube Objects and KITTI. In the second work, we propose a direct approach for capturing the dense, detailed 3D geometry of generic, complex non-rigid meshes using a single camera. Our method takes a single RGB video as input; it can capture the deformations of generic shapes; and the depth estimation is dense, per-pixel and direct. We first reconstruct a dense 3D template of the object's shape from a short rigid sequence, and subsequently perform online reconstruction of the non-rigid mesh as it evolves over time. In our experimental evaluation, we show a range of qualitative results on novel datasets and quantitative comparisons with stereo reconstruction.

    Resolving Ambiguities in Monocular 3D Reconstruction of Deformable Surfaces

    In this thesis, we focus on the problem of recovering the 3D shape of deformable surfaces from a single camera. This problem is known to be ill-posed because, for a given 2D input image, there exist many 3D shapes that give visually identical projections. We present three methods that make headway towards resolving these ambiguities. We believe that our work represents a significant step towards making surface reconstruction methods of practical use. First, we propose a surface reconstruction method that overcomes the limitations of state-of-the-art template-based and non-rigid structure-from-motion methods. We neither track points over many frames, nor require a sophisticated deformation model, nor depend on a reference image. In our method, we establish correspondences between pairs of frames in which the shape is different and unknown. We then estimate homographies between corresponding local planar patches in both images. These yield approximate 3D reconstructions of the points within each patch, up to a scale factor. Since we consider overlapping patches, we can constrain them to be consistent over the whole surface. Finally, a local deformation model is used to fit a triangulated mesh to the 3D point cloud, which makes the reconstruction robust to both noise and outliers in the image data. Second, we propose a novel approach to recovering the 3D shape of a deformable surface from a monocular input by taking advantage of shading information in more generic contexts than conventional Shape-from-Shading (SfS) methods. These contexts include surfaces that may be fully or partially textured and lit by arbitrarily many light sources. To this end, given a lighting model, we learn the relationship between a shading pattern and the corresponding local surface shape. At run time, we first use this knowledge to recover the shape of surface patches and then enforce spatial consistency between the patches to produce a global 3D shape. Instead of treating texture as noise, as many SfS approaches do, we exploit it as an additional source of information. We validate our approach quantitatively and qualitatively using both synthetic and real data. Third, we introduce a constrained latent variable model that inherently accounts for geometric constraints, such as inextensibility, defined on the mesh model. To this end, we learn a non-linear mapping from the latent space to the output space, which corresponds to the vertex positions of a mesh model, such that the generated outputs comply with equality and inequality constraints expressed in terms of the problem variables. Since its output is encouraged to satisfy such constraints inherently, our model removes the need for computationally expensive methods that enforce these constraints at run time. In addition, our approach is completely generic and could be used in many other contexts, such as image classification, to impose separation of the classes, and articulated tracking, to constrain the space of possible poses.

    Facial Texture Super-Resolution by Fitting 3D Face Models

    This book proposes to solve the low-resolution (LR) facial analysis problem with 3D face super-resolution (FSR). A complete processing chain is presented towards effective 3D FSR in the real world. To deal with the extreme challenges of incorporating 3D modeling under the ill-posed LR condition, a novel workflow coupling automatic localization of 2D facial feature points with 3D shape reconstruction is developed, leading to a robust pipeline for pose-invariant hallucination of the 3D facial texture.

    Deep Structured Layers for Instance-Level Optimization in 2D and 3D Vision

    The approach we present in this thesis is that of integrating optimization problems as layers in deep neural networks. Optimization-based modeling provides an additional set of tools enabling the design of powerful neural networks for a wide range of computer vision tasks. This thesis shows formulations and experiments for vision tasks ranging from image reconstruction to 3D reconstruction. We first propose an unrolled optimization method with implicit regularization properties for reconstructing images from noisy camera readings. The method resembles an unrolled majorization-minimization framework with convolutional neural networks acting as regularizers. We report state-of-the-art performance in image reconstruction on both noisy and noise-free evaluation setups across many datasets. We further focus on the task of monocular 3D reconstruction of articulated objects using video self-supervision. The proposed method uses a structured layer for accurate object deformation that controls a 3D surface by displacing a small number of learnable handles. While relying on a small set of training data per category for self-supervision, the method obtains state-of-the-art reconstruction accuracy with diverse shapes and viewpoints for multiple articulated objects. We finally address the shortcomings of the previous method that revolve around regressing the camera pose using multiple hypotheses. We propose a method that recovers a 3D shape from a 2D image by relying solely on 3D-2D correspondences regressed from a convolutional neural network. These correspondences are used in conjunction with an optimization problem to estimate the camera pose and deformation for each sample. We quantitatively show the effectiveness of the proposed method for self-supervised 3D reconstruction on multiple categories without the need for multiple hypotheses.

    Deformable shape matching

    Deformable shape matching has become an important building block in academia as well as in industry. Given two three-dimensional shapes A and B, the deformation function f aligning A with B has to be found. The function is discretized by a set of corresponding point pairs. Unfortunately, the computational cost of a brute-force search for correspondences is exponential. Additionally, to be of any practical use, the algorithm has to be able to deal with data coming directly from 3D scanners, which suffer from acquisition problems such as noise, holes, and missing topology information. This dissertation presents novel solutions to the shape matching problem. First, an algorithm that estimates correspondences using a randomized search strategy is shown. Additionally, a planning step that dramatically reduces the matching costs is incorporated. Using ideas from both of these contributions, a method for matching multiple shapes at once is shown. The method facilitates the reconstruction of shape and motion from noisy data acquired with dynamic 3D scanners. Considering shape matching from another perspective, a solution using Markov Random Fields (MRF) is shown. Formulated as an MRF, partial as well as full matches of a shape can be found. Here, belief propagation is used for inference in the MRF. Finally, an approach that significantly reduces the space-time complexity of belief propagation for a wide spectrum of computer vision tasks is presented.

    DATA-DRIVEN FACIAL IMAGE SYNTHESIS FROM POOR QUALITY LOW RESOLUTION IMAGE

    Ph.D. (Doctor of Philosophy)

    Nonrigid Surface Tracking, Analysis and Evaluation
