What Does 2D Geometric Information Really Tell Us About 3D Face Shape?
A face image contains geometric cues in the form of configurational information (semantically meaningful landmark points and contours). In this thesis, we explore to what degree such 2D geometric information allows us to estimate 3D face shape. First, we focus on the problem of fitting a 3D morphable model to single face images using only sparse geometric features. We propose a novel approach that explicitly computes hard correspondences, which allow us to treat the model edge vertices as known 2D positions for which optimal pose or shape estimates can be computed linearly. Moreover, we show how to formulate this shape-from-landmarks problem as a separable nonlinear least squares optimisation. Second, we show how a statistical model can be used to spatially transform input data as a module within a convolutional neural network. This extends the original spatial transformer network in that we are able to interpret and normalise 3D pose changes and self-occlusions. We show that the localiser can be trained using only simple geometric loss functions on a relatively small dataset, yet is able to perform robust normalisation on highly uncontrolled images. We also consider an extension in which the model itself is learnt. The final contribution of this thesis lies in exploring the limits of 2D geometric features and characterising the resulting ambiguities. 2D geometric information provides only a partial constraint on 3D face shape; in other words, face landmarks or occluding contours are an ambiguous shape cue. Two faces with different 3D shapes can give rise to the same 2D geometry, particularly as a result of perspective transformation when camera distance varies. We derive methods to compute these ambiguity subspaces, demonstrate that they contain significant shape variability, and show that these ambiguities occur in real-world datasets.
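The separable structure mentioned in the abstract can be illustrated with a minimal sketch: once pose is fixed, the shape coefficients of a linear morphable model follow from a regularised linear least squares solve on the projected landmarks. All names, the scaled orthographic camera model, and the Tikhonov prior below are illustrative assumptions, not the thesis's exact formulation.

```python
import numpy as np

def fit_shape_given_pose(mu, B, R, s, t, landmarks_2d, lam=1e-3):
    """Linear subproblem of a separable 3DMM fit (hedged sketch).

    mu: (3N,) mean shape, B: (3N, K) shape basis, R: (3, 3) rotation,
    s: scalar scale, t: (2,) image translation, landmarks_2d: (N, 2)
    observed 2D landmark positions. Returns shape coefficients (K,).
    """
    N = landmarks_2d.shape[0]
    P = s * R[:2, :]                                  # 2x3 scaled orthographic camera
    B3 = B.reshape(N, 3, -1)                          # per-vertex basis blocks
    mu3 = mu.reshape(N, 3)
    # Project basis and mean into the image plane
    A = np.einsum('ij,njk->nik', P, B3).reshape(2 * N, -1)   # (2N, K)
    b = (landmarks_2d - (mu3 @ P.T + t)).reshape(2 * N)
    # Tikhonov-regularised normal equations (stands in for a statistical prior)
    alpha = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ b)
    return alpha
```

In a full separable scheme, pose (R, s, t) would be estimated by a nonlinear outer loop while the shape coefficients are eliminated by this closed-form inner solve.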
What Does 2D Geometric Information Really Tell Us About 3D Face Shape?
A face image contains geometric cues in the form of configurational
information and contours that can be used to estimate 3D face shape. While it
is clear that 3D reconstruction from 2D points is highly ambiguous if no
further constraints are enforced, one might expect that the face-space
constraint solves this problem. We show that this is not the case and that
geometric information is an ambiguous cue. There are two sources for this
ambiguity. The first is that, within the space of 3D face shapes, there are
flexibility modes that remain when some parts of the face are fixed. The second
occurs only under perspective projection and is a result of perspective
transformation as camera distance varies. Two different faces, when viewed at
different distances, can give rise to the same 2D geometry. To demonstrate
these ambiguities, we develop new algorithms for fitting a 3D morphable model
to 2D landmarks or contours under either orthographic or perspective projection
and show how to compute flexibility modes for both cases. We show that both
fitting problems can be posed as a separable nonlinear least squares problem
and solved efficiently. We demonstrate both quantitatively and qualitatively
that the ambiguity is present not only in reconstructions from geometric
information alone but also in reconstructions from a state-of-the-art
CNN-based method.
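The flexibility modes described above can be sketched as a null-space computation: under orthographic projection, parameter directions that leave the observed 2D landmark positions unchanged are the null space of the projected shape basis. The names and the simple SVD rank test below are assumptions for illustration.

```python
import numpy as np

def flexibility_modes(B, P, landmark_idx, tol=1e-10):
    """Hedged sketch: directions in shape-parameter space that leave the
    2D positions of the given landmark vertices unchanged.

    B: (3N, K) shape basis, P: (2, 3) orthographic projection,
    landmark_idx: indices of constrained vertices.
    Returns a (K, K - rank) orthonormal null-space basis.
    """
    N = B.shape[0] // 3
    B3 = B.reshape(N, 3, -1)[landmark_idx]                        # (L, 3, K)
    A = np.einsum('ij,ljk->lik', P, B3).reshape(-1, B.shape[1])   # (2L, K)
    _, S, Vt = np.linalg.svd(A)
    rank = int((S > tol * S[0]).sum())
    return Vt[rank:].T   # rows of Vt past the rank span the null space
```

Moving the shape parameters along any returned column changes the 3D face while the constrained 2D landmarks stay fixed, which is exactly the ambiguity the abstract describes.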
Digital Image Access & Retrieval
The 33rd Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation, with the bulk of the conference focusing on indexing and retrieval.
End-to-end Recovery of Human Shape and Pose
We describe Human Mesh Recovery (HMR), an end-to-end framework for
reconstructing a full 3D mesh of a human body from a single RGB image. In
contrast to most current methods that compute 2D or 3D joint locations, we
produce a richer and more useful mesh representation that is parameterized by
shape and 3D joint angles. The main objective is to minimize the reprojection
loss of keypoints, which allows our model to be trained using in-the-wild
images that only have ground-truth 2D annotations. However, the reprojection
loss alone leaves the model highly under-constrained. In this work we address this
problem by introducing an adversary trained to tell whether a human body
parameter is real or not using a large database of 3D human meshes. We show
that HMR can be trained with and without using any paired 2D-to-3D supervision.
We do not rely on intermediate 2D keypoint detections and infer 3D pose and
shape parameters directly from image pixels. Our model runs in real-time given
a bounding box containing the person. We demonstrate our approach on various
images in-the-wild and out-perform previous optimization based methods that
output 3D meshes and show competitive results on tasks such as 3D joint
location estimation and part segmentation. Comment: CVPR 2018; project page with code: https://akanazawa.github.io/hmr
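The keypoint reprojection objective at the core of this framework can be sketched as follows. This is a minimal weak-perspective variant with assumed names, not the authors' exact implementation; HMR additionally couples it with the adversarial prior described above.

```python
import numpy as np

def reprojection_loss(joints_3d, keypoints_2d, cam_s, cam_t, visibility):
    """Hedged sketch of a keypoint reprojection loss.

    joints_3d: (J, 3) predicted 3D joints, keypoints_2d: (J, 2) annotated
    2D keypoints, cam_s: scalar weak-perspective scale, cam_t: (2,) image
    translation, visibility: (J,) 0/1 mask for annotated joints.
    Returns the mean L1 error over visible keypoints.
    """
    proj = cam_s * joints_3d[:, :2] + cam_t          # weak-perspective projection
    residual = (proj - keypoints_2d) * visibility[:, None]   # mask invisible joints
    return np.abs(residual).sum() / max(visibility.sum(), 1)
```

Because only the projected 2D positions are penalised, depth and limb orientation remain under-constrained, which is why the adversarial shape-and-pose prior is needed.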
Looking at the Lanham Act: Images in Trademark and Advertising Law
Words are the prototypical regulatory subjects for trademark and advertising law, despite our increasingly audiovisual economy. This word-focused baseline means that the Lanham Act often misconceives its object, resulting in confusion and incoherence. This Article explores some of the ways courts have attempted to fit images into a word-centric model, while not fully recognizing the particular ways in which images make meaning in trademark and other forms of advertising. While problems interpreting images are likely to persist, this Article suggests some ways in which courts could pay closer attention to the special features of images as compared to words.