
    Data-Driven Shape Analysis and Processing

    Data-driven methods play an increasingly important role in discovering geometric, structural, and semantic relationships between 3D shapes in collections, and applying this analysis to support intelligent modeling, editing, and visualization of geometric data. In contrast to traditional approaches, a key feature of data-driven approaches is that they aggregate information from a collection of shapes to improve the analysis and processing of individual shapes. In addition, they are able to learn models that reason about properties and relationships of shapes without relying on hard-coded rules or explicitly programmed instructions. We provide an overview of the main concepts and components of these techniques, and discuss their application to shape classification, segmentation, matching, reconstruction, modeling and exploration, as well as scene analysis and synthesis, through reviewing the literature and relating the existing works with both qualitative and numerical comparisons. We conclude our report with ideas that can inspire future research in data-driven shape analysis and processing. Comment: 10 pages, 19 figures

    3D facial shape estimation from a single image under arbitrary pose and illumination.

    Humans have the uncanny ability to perceive the world in three dimensions (3D), otherwise known as depth perception. The amazing thing about this ability to determine distances is that it depends only on a simple two-dimensional (2D) image on the retina. It is an interesting problem to explain and mimic this phenomenon of obtaining a three-dimensional perception of a scene from a flat 2D retinal image. The main objective of this dissertation is the computational aspect of this human ability to reconstruct the world in 3D using only 2D images from the retina. Specifically, the goal of this work is to recover 3D facial shape information from a single image of unknown pose and illumination. Prior shape and texture models from real data, which are metric in nature, are incorporated into the 3D shape recovery framework. The recovered output shape is likewise metric, unlike previous shape-from-shading (SFS) approaches that only provide relative shape. This work starts with the simpler case of general illumination and fixed frontal pose. Three optimization approaches were developed to solve this 3D shape recovery problem, ranging from a brute-force iterative approach to a computationally efficient regression method (Method II-PCR), where the classical shape-from-shading equation is cast as a regression framework. Results show that the regression-like approach is faster than its iterative counterpart while achieving similar error metrics. The best of the three algorithms above, Method II-PCR, is compared to its two predecessors, namely: (a) Castelan et al. [1] and (b) Ahmed et al. [2]. Experimental results show that the proposed method (Method II-PCR) is superior in all aspects to the previous state of the art. Robust statistics were also incorporated into the shape recovery framework to deal with noise and occlusion. Using multiple-view geometry concepts [3], the fixed frontal pose was relaxed to arbitrary pose.
The best of the three algorithms above, Method II-PCR, is once again used as the primary 3D shape recovery method. Results show that the pose-invariant 3D shape recovery version (for input with pose) has error values similar to those of the frontal-pose version (for input with frontal pose), for input images of the same subject. Sensitivity experiments indicate that the proposed method is indeed invariant to pose, at least for pan angles in the range -50° to 50°. The next major part of this work is the development of 3D facial shape recovery methods given only the input 2D shape information, instead of both texture and 2D shape. The simpler case of sparse output 3D shapes was dealt with first. The proposed method, which also uses a regression-based optimization approach, was compared with state-of-the-art algorithms, showing decent performance. Five conclusions were drawn from the sparse experiments, namely that the proposed approach: (a) is competitive due to its linear and non-iterative nature, (b) does not need explicit training, as opposed to [4], (c) has results comparable to [4] at a shorter computational time, (d) is better in all aspects than Zhang and Samaras [5], and (e) shares with [4] and [5] the limitation that the input 2D feature points must be manually annotated. The proposed method was then extended to output dense 3D shapes simply by replacing the sparse model with its dense equivalent in the regression framework inside the 3D face recovery approach. The numerical values of the mean height and surface orientation error indicate that even if shading information is unavailable, a decent dense 3D reconstruction is still possible.
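
For orientation, the "classical shape-from-shading equation" referred to in this abstract is commonly written, under a Lambertian reflectance assumption, as below. This is the standard textbook form and only a sketch: the symbols (image intensity I, albedo ρ, unit surface normal n, light direction l, height map z with gradients p and q) are assumptions introduced here, and the dissertation's exact formulation and its regression recast are not given in the abstract.

```latex
I(x, y) = \rho(x, y)\, \mathbf{n}(x, y) \cdot \mathbf{l},
\qquad
\mathbf{n}(x, y) = \frac{(-p,\, -q,\, 1)^{\top}}{\sqrt{p^{2} + q^{2} + 1}},
\quad
p = \frac{\partial z}{\partial x},\;
q = \frac{\partial z}{\partial y}
```

Because the intensity constrains only the surface gradients p and q, the height z is determined at best up to an additive offset, which is consistent with the abstract's remark that earlier SFS approaches provide only relative shape.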

    Linguistically-driven framework for computationally efficient and scalable sign recognition

    We introduce a new general framework for sign recognition from monocular video using limited quantities of annotated data. The novelty of the hybrid framework we describe here is that we exploit state-of-the-art learning methods while also incorporating features based on what we know about the linguistic composition of lexical signs. In particular, we analyze hand shape, orientation, location, and motion trajectories, and then use CRFs to combine this linguistically significant information for purposes of sign recognition. Our robust modeling and recognition of these sub-components of sign production allow an efficient parameterization of the sign recognition problem as compared with purely data-driven methods. This parameterization enables a scalable and extendable time-series learning approach that advances the state of the art in sign recognition, as shown by the results reported here for recognition of isolated, citation-form, lexical signs from American Sign Language (ASL).

    3D Shape Estimation from 2D Landmarks: A Convex Relaxation Approach

    We investigate the problem of estimating the 3D shape of an object, given a set of 2D landmarks in a single image. To alleviate the reconstruction ambiguity, a widely used approach is to confine the unknown 3D shape within a shape space built upon existing shapes. While this approach has proven to be successful in various applications, a challenging issue remains: the joint estimation of shape parameters and camera-pose parameters requires solving a nonconvex optimization problem. The existing methods often adopt an alternating minimization scheme to locally update the parameters, and consequently the solution is sensitive to initialization. In this paper, we propose a convex formulation to address this problem and develop an efficient algorithm to solve the proposed convex program. We demonstrate the exact recovery property of the proposed method, its merits compared to alternative methods, and its applicability in human pose and car shape estimation. Comment: In Proceedings of CVPR 201
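
The shape-space idea in this abstract can be illustrated with a minimal sketch. It is not the paper's convex relaxation: it assumes the camera is known and fixed (an orthographic projection that simply drops z), so only the shape coefficients remain unknown and the problem reduces to linear least squares. The function names, the restriction to two basis shapes, and the toy data are all hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def project(shape):
    """Assumed camera: orthographic projection that drops the z coordinate."""
    return [(x, y) for x, y, _ in shape]


def combine(bases, coeffs):
    """Linear shape space: the shape is a weighted sum of the basis shapes."""
    n_points = len(bases[0])
    return [
        tuple(sum(c * b[i][k] for c, b in zip(coeffs, bases)) for k in range(3))
        for i in range(n_points)
    ]


def fit_two_basis_coeffs(landmarks_2d, bases):
    """Least-squares fit of two shape coefficients to 2D landmarks.

    Solves the 2x2 normal equations in closed form (Cramer's rule).
    With the camera fixed, this step is convex; the nonconvexity discussed
    in the abstract arises only when pose is estimated jointly.
    """
    a1 = [v for pt in project(bases[0]) for v in pt]
    a2 = [v for pt in project(bases[1]) for v in pt]
    w = [v for pt in landmarks_2d for v in pt]
    g11, g12, g22 = dot(a1, a1), dot(a1, a2), dot(a2, a2)
    r1, r2 = dot(a1, w), dot(a2, w)
    det = g11 * g22 - g12 * g12
    if abs(det) < 1e-12:
        raise ValueError("projected basis shapes are linearly dependent")
    c1 = (r1 * g22 - g12 * r2) / det
    c2 = (g11 * r2 - g12 * r1) / det
    return [c1, c2]


if __name__ == "__main__":
    # Toy basis shapes with three landmarks each (hypothetical data).
    bases = [
        [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0), (0.0, 1.0, 2.0)],
        [(1.0, 0.0, 0.0), (0.0, 1.0, 1.0), (0.0, 0.0, 3.0)],
    ]
    observed = project(combine(bases, [0.5, 2.0]))
    coeffs = fit_two_basis_coeffs(observed, bases)
    # The fitted coefficients give a full 3D shape, including the depth
    # values that the 2D landmarks alone do not carry.
    print(coeffs, combine(bases, coeffs))
```

Once pose is also unknown, the coefficients and the camera parameters multiply each other in the projection, which is where the alternating schemes mentioned in the abstract (and their sensitivity to initialization) come in.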

    Scalable Dense Non-rigid Structure-from-Motion: A Grassmannian Perspective

    This paper addresses the task of dense non-rigid structure-from-motion (NRSfM) using multiple images. State-of-the-art methods for this problem are often hindered by poor scalability, expensive computations, and noisy measurements. Further, recent NRSfM methods usually either assume a small number of sparse feature points or ignore local non-linearities of shape deformations, and thus cannot reliably model complex non-rigid deformations. To address these issues, we propose a new approach for dense NRSfM by modeling the problem on a Grassmann manifold. Specifically, we assume the complex non-rigid deformations lie on a union of local linear subspaces both spatially and temporally. This naturally allows for a compact representation of the complex non-rigid deformation over frames. We provide experimental results on several synthetic and real benchmark datasets. The results clearly demonstrate that our method, apart from being scalable and more accurate than state-of-the-art methods, is also more robust to noise and generalizes to highly non-linear deformations. Comment: 10 pages, 7 figures, 4 tables. Accepted for publication in Conference on Computer Vision and Pattern Recognition (CVPR), 2018, typos fixed and acknowledgement added

    Learning quadrangulated patches for 3D shape parameterization and completion

    We propose a novel 3D shape parameterization by surface patches that are oriented by 3D mesh quadrangulation of the shape. By encoding 3D surface detail on local patches, we learn a patch dictionary that identifies principal surface features of the shape. Unlike previous methods, we are able to encode surface patches of variable size as determined by the user. We propose novel methods for dictionary learning and patch reconstruction based on the query of a noisy input patch with holes. We evaluate the patch dictionary on various applications in 3D shape inpainting, denoising and compression. Our method is able to predict missing vertices and inpaint moderately sized holes. We demonstrate a complete pipeline for reconstructing the 3D mesh from the patch encoding. We validate our shape parameterization and reconstruction methods on both synthetic shapes and real-world scans. We show that our patch dictionary performs successful shape completion of complicated surface textures. Comment: To be presented at International Conference on 3D Vision 2017, 201