
    What Does 2D Geometric Information Really Tell Us About 3D Face Shape?

    A face image contains geometric cues in the form of configurational information (semantically meaningful landmark points and contours). In this thesis, we explore to what degree such 2D geometric information allows us to estimate 3D face shape. First, we focus on the problem of fitting a 3D morphable model to single face images using only sparse geometric features. We propose a novel approach that explicitly computes hard correspondences, which allow us to treat the model edge vertices as known 2D positions for which optimal pose or shape estimates can be computed linearly. Moreover, we show how to formulate this shape-from-landmarks problem as a separable nonlinear least squares optimisation. Second, we show how a statistical model can be used to spatially transform input data as a module within a convolutional neural network. This extends the original spatial transformer network in that we are able to interpret and normalise 3D pose changes and self-occlusions. We show that the localiser can be trained using only simple geometric loss functions on a relatively small dataset, yet is able to perform robust normalisation on highly uncontrolled images. We also consider an extension in which the model itself is learnt. The final contribution of this thesis lies in exploring the limits of 2D geometric features and characterising the resulting ambiguities. 2D geometric information provides only a partial constraint on 3D face shape; in other words, face landmarks or occluding contours are an ambiguous shape cue. Two faces with different 3D shape can give rise to the same 2D geometry, particularly as a result of perspective transformation when camera distance varies. We derive methods to compute these ambiguity subspaces, demonstrate that they contain significant shape variability, and show that these ambiguities occur in real-world datasets.
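    The fitting approach described above exploits the fact that, for a fixed camera pose, the 3DMM shape coefficients appear linearly in the projected landmark positions, so they can be eliminated in closed form and only the pose requires nonlinear optimisation. The sketch below illustrates that separable (variable projection) structure for a scaled orthographic camera; the array shapes, function names, and regularisation weight are illustrative assumptions, not the thesis's exact formulation.

        # Hypothetical sketch of the separable nonlinear least squares idea for
        # fitting a 3D morphable model (3DMM) to sparse 2D landmarks under a
        # scaled orthographic camera. All names and shapes are assumptions.
        import numpy as np
        from scipy.optimize import least_squares
        from scipy.spatial.transform import Rotation

        def solve_shape_given_pose(mean, basis, lmk_idx, lmk_2d, s, R, t, reg=1e-3):
            """For a fixed pose (s, R, t) the shape coefficients enter the
            projection linearly, so they have a closed-form regularised solution."""
            K = basis.shape[1]
            P = s * R[:2, :]                             # 2x3 scaled orthographic camera
            mu = mean.reshape(-1, 3)[lmk_idx]            # (L, 3) landmark vertices of the mean
            B = basis.reshape(-1, 3, K)[lmk_idx]         # (L, 3, K) landmark rows of the basis
            # Stack the linear system A @ alpha = b over all landmarks.
            A = np.concatenate([P @ B[i] for i in range(len(lmk_idx))], axis=0)  # (2L, K)
            b = (lmk_2d - (mu @ P.T + t)).reshape(-1)                            # (2L,)
            return np.linalg.solve(A.T @ A + reg * np.eye(K), A.T @ b)

        def pose_residuals(pose, mean, basis, lmk_idx, lmk_2d):
            """Residuals as a function of the nonlinear pose parameters only;
            the optimal shape is eliminated inside (variable projection)."""
            s, rx, ry, rz, tx, ty = pose
            R = Rotation.from_rotvec([rx, ry, rz]).as_matrix()
            t = np.array([tx, ty])
            alpha = solve_shape_given_pose(mean, basis, lmk_idx, lmk_2d, s, R, t)
            verts = (mean + basis @ alpha).reshape(-1, 3)[lmk_idx]
            proj = s * verts @ R[:2, :].T + t
            return (proj - lmk_2d).ravel()

        def fit_landmarks(mean, basis, lmk_idx, lmk_2d):
            # Optimise only over the six pose parameters; shape falls out linearly.
            pose0 = np.array([1.0, 0, 0, 0, 0, 0])
            res = least_squares(pose_residuals, pose0, args=(mean, basis, lmk_idx, lmk_2d))
            return res.x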

    What Does 2D Geometric Information Really Tell Us About 3D Face Shape?

    A face image contains geometric cues in the form of configurational information and contours that can be used to estimate 3D face shape. While it is clear that 3D reconstruction from 2D points is highly ambiguous if no further constraints are enforced, one might expect that the face-space constraint solves this problem. We show that this is not the case and that geometric information is an ambiguous cue. There are two sources of this ambiguity. The first is that, within the space of 3D face shapes, there are flexibility modes that remain when some parts of the face are fixed. The second occurs only under perspective projection and is a result of perspective transformation as camera distance varies: two different faces, when viewed at different distances, can give rise to the same 2D geometry. To demonstrate these ambiguities, we develop new algorithms for fitting a 3D morphable model to 2D landmarks or contours under either orthographic or perspective projection, and show how to compute flexibility modes for both cases. We show that both fitting problems can be posed as separable nonlinear least squares problems and solved efficiently. We demonstrate both quantitatively and qualitatively that the ambiguity is present not only in reconstructions from geometric information alone but also in reconstructions from a state-of-the-art CNN-based method.
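    The flexibility modes mentioned above can be pictured as directions in the shape coefficient space that leave the constrained 2D landmark projections unchanged while still carrying substantial prior shape variance. Below is a minimal sketch of that idea under a scaled orthographic camera and a diagonal Gaussian prior on the coefficients; both the setup and the names are assumptions rather than the paper's exact derivation.

        # Hypothetical sketch of computing flexibility modes: directions in the
        # 3DMM coefficient space that keep the projected landmarks fixed while
        # carrying as much prior shape variance as possible. Names are assumptions.
        import numpy as np
        from scipy.linalg import null_space

        def flexibility_modes(basis, sigma2, lmk_idx, s, R, n_modes=5):
            """basis: (3N, K) shape basis, sigma2: (K,) prior coefficient variances,
            lmk_idx: indices of the fixed landmark vertices, (s, R): camera."""
            K = basis.shape[1]
            P = s * R[:2, :]                             # 2x3 camera matrix
            B = basis.reshape(-1, 3, K)[lmk_idx]         # (L, 3, K) landmark rows of the basis
            C = np.concatenate([P @ B[i] for i in range(len(lmk_idx))], axis=0)  # (2L, K)
            N = null_space(C)        # coefficient directions that leave the landmarks fixed
            # Rank those directions by the prior shape variance they carry.
            evals, evecs = np.linalg.eigh(N.T @ np.diag(sigma2) @ N)
            order = np.argsort(evals)[::-1][:n_modes]
            return N @ evecs[:, order], evals[order]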

    Digital Image Access & Retrieval

    The 33rd Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections, in three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation, with the bulk of the conference focusing on indexing and retrieval.

    End-to-end Recovery of Human Shape and Pose

    We describe Human Mesh Recovery (HMR), an end-to-end framework for reconstructing a full 3D mesh of a human body from a single RGB image. In contrast to most current methods that compute 2D or 3D joint locations, we produce a richer and more useful mesh representation that is parameterized by shape and 3D joint angles. The main objective is to minimize the reprojection loss of keypoints, which allows our model to be trained using in-the-wild images that have only ground-truth 2D annotations. However, the reprojection loss alone leaves the model highly underconstrained. In this work we address this problem by introducing an adversary trained to tell whether a human body parameter is real or not, using a large database of 3D human meshes. We show that HMR can be trained with and without using any paired 2D-to-3D supervision. We do not rely on intermediate 2D keypoint detections and infer 3D pose and shape parameters directly from image pixels. Our model runs in real time given a bounding box containing the person. We demonstrate our approach on various in-the-wild images, outperform previous optimization-based methods that output 3D meshes, and show competitive results on tasks such as 3D joint location estimation and part segmentation. (CVPR 2018; project page with code: https://akanazawa.github.io/hmr)
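    The keypoint reprojection loss described above penalises the 2D distance between annotated keypoints and the projection of the 3D joints regressed from the predicted mesh, using a weak-perspective camera predicted alongside the body parameters. A minimal PyTorch-style sketch follows; the tensor shapes and names are assumptions, and the adversarial prior on body parameters is omitted.

        # Hypothetical sketch of a keypoint reprojection loss with a predicted
        # weak-perspective camera [s, tx, ty]; tensor shapes are assumptions and
        # the adversarial prior on the body parameters is not shown.
        import torch

        def reprojection_loss(joints_3d, cam, keypoints_2d, visibility):
            """joints_3d: (B, J, 3) joints regressed from the predicted mesh,
            cam: (B, 3), keypoints_2d: (B, J, 2), visibility: (B, J) in {0, 1}."""
            s = cam[:, 0:1].unsqueeze(-1)              # (B, 1, 1) scale
            t = cam[:, 1:3].unsqueeze(1)               # (B, 1, 2) image translation
            projected = s * joints_3d[:, :, :2] + t    # orthographic projection of the joints
            err = (projected - keypoints_2d).abs().sum(dim=-1)     # per-joint L1 error
            # Only annotated keypoints contribute, so in-the-wild images with
            # partial 2D labels can still supervise the network.
            return (visibility * err).sum() / visibility.sum().clamp(min=1)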

    Looking at the Lanham Act: Images in Trademark and Advertising Law

    Words are the prototypical regulatory subjects for trademark and advertising law, despite our increasingly audiovisual economy. This word-focused baseline means that the Lanham Act often misconceives its object, resulting in confusion and incoherence. This Article explores some of the ways courts have attempted to fit images into a word-centric model while not fully recognizing the particular ways in which images make meaning in trademark and other forms of advertising. While problems interpreting images are likely to persist, this Article suggests some ways in which courts could pay closer attention to the special features of images as compared to words.