70 research outputs found

    Multidimensional morphable models : a framework for representing and matching object classes

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1997. Includes bibliographical references (p. 129-133). By Michel Jeffrey Jones. Ph.D.

    Light Field Morphable Models

    Statistical shape and texture appearance models are powerful image representations, but previously had been restricted to 2D or simple 3D shapes. In this paper we present a novel 3D morphable model based on image-based rendering techniques, which can represent complex lighting conditions, structures, and surfaces. We describe how to construct a manifold of the multi-view appearance of an object class using light fields and show how to match a 2D image of an object to a point on this manifold. In turn we use the reconstructed light field to render novel views of the object. Our technique overcomes the limitations of polygon-based appearance models and uses light fields that are acquired in real time.

    Trainable videorealistic speech animation

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002. Includes bibliographical references (p. 53-58). I describe how to create with machine learning techniques a generative, videorealistic, speech animation module. A human subject is first recorded using a videocamera as he/she utters a pre-determined speech corpus. After processing the corpus automatically, a visual speech module is learned from the data that is capable of synthesizing the human subject's mouth uttering entirely novel utterances that were not recorded in the original video. The synthesized utterance is re-composited onto a background sequence which contains natural head and eye movement. The final output is videorealistic in the sense that it looks like a video camera recording of the subject. At run time, the input to the system can be either real audio sequences or synthetic audio produced by a text-to-speech system, as long as they have been phonetically aligned. The two key contributions of this work are (1) a variant of the multidimensional morphable model (MMM) [4] [26] [25] to synthesize new, previously unseen mouth configurations from a small set of mouth image prototypes, and (2) a trajectory synthesis technique based on regularization, which is automatically trained from the recorded video corpus, and which is capable of synthesizing trajectories in MMM space corresponding to any desired utterance. Results are presented on a series of numerical and psychophysical experiments designed to evaluate the synthetic animations. By Tony Farid Ezzat. Ph.D.

    Learning-Based Approach to Estimation of Morphable Model Parameters

    We describe the key role played by partial evaluation in the Supercomputing Toolkit, a parallel computing system for scientific applications that effectively exploits the vast amount of parallelism exposed by partial evaluation. The Supercomputing Toolkit parallel processor and its associated partial evaluation-based compiler have been used extensively by scientists at MIT, and have made possible recent results in astrophysics showing that the motion of the planets in our solar system is chaotically unstable.

    Exploring Vision-Based Interfaces: How to Use Your Head in Dual Pointing Tasks

    The utility of vision-based face tracking for dual pointing tasks is evaluated. We first describe a 3-D face tracking technique based on real-time parametric motion-stereo, which is non-invasive, robust, and self-initialized. The tracker provides a real-time estimate of a "frontal face ray" whose intersection with the display surface plane is used as a second stream of input for scrolling or pointing, in parallel with hand input. We evaluated the performance of combined head/hand input on a box selection and coloring task: users selected boxes with one pointer and colors with a second pointer, or performed both tasks with a single pointer. We found that performance with head and one hand was intermediate between single hand performance and dual hand performance. Our results are consistent with previously reported dual hand conflict in symmetric pointing tasks, and suggest that a head-based input stream should be used for asymmetric control.
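
    The abstract describes turning a face ray into a pointer position by intersecting it with the display plane. A minimal geometric sketch of that one step follows. It is not the paper's tracker: the function name, the tuple-based vectors, and the choice to ignore hits behind the head are all assumptions made for illustration.

```python
def ray_plane_intersection(origin, direction, plane_point, plane_normal, eps=1e-9):
    """Return the 3-D point where a ray meets a plane, or None if there is no hit.

    All arguments are 3-tuples of floats. The names are hypothetical;
    the abstract does not specify any data layout.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    denom = dot(direction, plane_normal)
    if abs(denom) < eps:
        # Ray is (nearly) parallel to the display plane.
        return None

    offset = tuple(p - o for p, o in zip(plane_point, origin))
    t = dot(offset, plane_normal) / denom
    if t < 0:
        # The plane lies behind the ray origin (assumed: not a valid pointer hit).
        return None

    return tuple(o + t * d for o, d in zip(origin, direction))


# Example: head one unit in front of a display lying in the z = 0 plane.
hit = ray_plane_intersection(
    origin=(0.0, 0.0, 1.0),
    direction=(0.5, 0.0, -1.0),
    plane_point=(0.0, 0.0, 0.0),
    plane_normal=(0.0, 0.0, 1.0),
)
```

    In a real interface the returned point would still need to be mapped from world coordinates to screen pixels, which depends on a display calibration the abstract does not describe.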

    An efficient stochastic approach to groupwise non-rigid image registration

    The groupwise approach to non-rigid image registration, solving the dense correspondence problem, has recently been shown to be a useful tool in many applications, including medical imaging, automatic construction of statistical models of appearance and analysis of facial dynamics. Such an approach overcomes limitations of traditional pairwise methods but at a cost of having to search for the solution (optimal registration) in a space of much higher dimensionality which grows rapidly with the number of examples (images) being registered. Techniques to overcome this dimensionality problem have not been addressed sufficiently in the groupwise registration literature. In this paper, we propose a novel, fast and reliable, fully unsupervised stochastic algorithm to search for optimal groupwise dense correspondence in large sets of unmarked images. The efficiency of our approach stems from novel dimensionality reduction techniques specific to the problem of groupwise image registration and from comparative insensitivity of the adopted optimisation scheme (Simultaneous Perturbation Stochastic Approximation (SPSA)) to the high dimensionality of the search space. Additionally, our algorithm is formulated in a way readily suited to implementation on graphics processing units (GPU). In evaluation of our method we show a high robustness and success rate, fast convergence on various types of test data, including facial images featuring large degrees of both inter- and intra-person variation, and show considerable improvement in terms of accuracy of solution and speed compared to traditional methods.
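
    The abstract attributes part of the method's efficiency to SPSA being comparatively insensitive to the dimensionality of the search space. The sketch below shows why: a generic SPSA loop estimates the whole gradient from two loss evaluations per iteration, regardless of the number of parameters. It is a textbook-style illustration, not the registration algorithm from the paper; the function name, the gain constants, and the toy quadratic loss are assumptions.

```python
import random


def spsa_minimize(loss, theta, iterations=500, a=0.1, c=0.1, A=10.0,
                  alpha=0.602, gamma=0.101, seed=0):
    """Minimise `loss` with basic SPSA, starting from the list `theta`.

    The gain schedule a_k = a / (k + 1 + A)**alpha, c_k = c / (k + 1)**gamma
    uses commonly quoted exponents; the constants are illustrative assumptions.
    """
    rng = random.Random(seed)
    theta = list(theta)
    for k in range(iterations):
        a_k = a / (k + 1 + A) ** alpha
        c_k = c / (k + 1) ** gamma
        # One random +/-1 perturbation shared by all coordinates.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        # Two loss evaluations, independent of len(theta).
        diff = loss(plus) - loss(minus)
        theta = [t - a_k * diff / (2.0 * c_k * d) for t, d in zip(theta, delta)]
    return theta


# Toy problem (hypothetical): recover a 5-dimensional target vector.
target = [1.0, -2.0, 0.5, 3.0, -1.0]


def quadratic_loss(x):
    return sum((xi - ti) ** 2 for xi, ti in zip(x, target))


start = [0.0] * len(target)
result = spsa_minimize(quadratic_loss, start)
```

    A finite-difference gradient would need two evaluations per parameter; SPSA trades that cost for a noisier estimate, which is the property the abstract relies on for large registration problems.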