Search CORE

509 research outputs found

Recommended from our members

Face Transfer with Multilinear Models

Author: Brand Matthew
Pfister Hanspeter
Popovic Jovan
Vlasic Daniel
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 16/06/2010
Field of study

Face Transfer is a method for mapping videorecorded performances of one individual to facial animations of another. It extracts visemes (speech-related mouth articulations), expressions, and three-dimensional (3D) pose from monocular video or film footage. These parameters are then used to generate and drive a detailed 3D textured face mesh for a target identity, which can be seamlessly rendered back into target footage. The underlying face model automatically adjusts for how the target performs facial expressions and visemes. The performance data can be easily edited to change the visemes, expressions, pose, or even the identity of the target---the attributes are separably controllable. This supports a wide variety of video rewrite and puppetry applications.Face Transfer is based on a multilinear model of 3D face meshes that separably parameterizes the space of geometric variations due to different attributes (e.g., identity, expression, and viseme). Separability means that each of these attributes can be independently varied. A multilinear model can be estimated from a Cartesian product of examples (identities x expressions x visemes) with techniques from statistical analysis, but only after careful preprocessing of the geometric data set to secure one-to-one correspondence, to minimize cross-coupling artifacts, and to fill in any missing examples. Face Transfer offers new solutions to these problems and links the estimated model with a face-tracking algorithm to extract pose, expression, and viseme parameters.Engineering and Applied Science

Harvard University - DASH

A multilinear tongue model derived from speech related MRI data of the human vocal tract

Author: Alexander Hewer
Allen
Ananthakrishnan
Badin
Badin
Badin
Baer
Beautemps
Bijar
Blandin
Blanz
Bolkart
Botsch
Brunner
Buchaillard
Buchaillard
Burdumy
De Silva
Demolin
Dryden
Elie
Engwall
Engwall
Engwall
Eryildirim
Fang
Foldvik
Fu
Fuchs
Geng
Harandi
Harandi
Harshman
Harshman
Hewer
Hewer
Honda
Hoole
Hoole
Ingmar Steiner
International Phonetic Association
Jackson
Johnson
Kaburagi
Kiers
Kim
Korin Richmond
Kröger
Ladefoged
Ladefoged
Le Maguer
Lee
Li
Lingala
Lingala
Liu
McGurk
Mermelstein
Narayanan
Narayanan
Narayanan
Niebergall
Otsu
Peng
Raeesy
Richmond
Rodrigues
Rosset
Rudy
Scott
Serrurier
Shadle
Stefanie Wuhrer
Steiner
Stone
Stone
Stone
Styner
Tiede
Toutios
Tucker
Valdés Vargas
Valdés Vargas
Weickert
Weirich
Weirich
Woo
Woo
Wu
Yunusova
Zheng
Publication venue: 'Elsevier BV'
Publication date: 21/02/2018
Field of study

We present a multilinear statistical model of the human tongue that captures anatomical and tongue pose related shape variations separately. The model is derived from 3D magnetic resonance imaging data of 11 speakers sustaining speech related vocal tract configurations. The extraction is performed by using a minimally supervised method that uses as basis an image segmentation approach and a template fitting technique. Furthermore, it uses image denoising to deal with possibly corrupt data, palate surface information reconstruction to handle palatal tongue contacts, and a bootstrap strategy to refine the obtained shapes. Our evaluation concludes that limiting the degrees of freedom for the anatomical and speech related variations to 5 and 4, respectively, produces a model that can reliably register unknown data while avoiding overfitting effects. Furthermore, we show that it can be used to generate a plausible tongue animation by tracking sparse motion capture data

arXiv.org e-Print Archive

Crossref

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Edinburgh Research Explorer

Tensor Decompositions for Signal Processing Applications From Two-way to Multiway Component Analysis

Author: Caiafa C.
Cichocki A.
De Lathauwer L.
Mandic D.
Phan A-H.
Zhao Q.
Zhou G.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 17/03/2014
Field of study

The widespread use of multi-sensor technology and the emergence of big datasets has highlighted the limitations of standard flat-view matrix models and the necessity to move towards more versatile data analysis tools. We show that higher-order tensors (i.e., multiway arrays) enable such a fundamental paradigm shift towards models that are essentially polynomial and whose uniqueness, unlike the matrix methods, is guaranteed under verymild and natural conditions. Benefiting fromthe power ofmultilinear algebra as theirmathematical backbone, data analysis techniques using tensor decompositions are shown to have great flexibility in the choice of constraints that match data properties, and to find more general latent components in the data than matrix-based methods. A comprehensive introduction to tensor decompositions is provided from a signal processing perspective, starting from the algebraic foundations, via basic Canonical Polyadic and Tucker models, through to advanced cause-effect and multi-view data analysis schemes. We show that tensor decompositions enable natural generalizations of some commonly used signal processing paradigms, such as canonical correlation and subspace techniques, signal separation, linear regression, feature extraction and classification. We also cover computational aspects, and point out how ideas from compressed sensing and scientific computing may be used for addressing the otherwise unmanageable storage and manipulation problems associated with big datasets. The concepts are supported by illustrative real world case studies illuminating the benefits of the tensor framework, as efficient and promising tools for modern signal processing, data analysis and machine learning applications; these benefits also extend to vector/matrix data through tensorization. Keywords: ICA, NMF, CPD, Tucker decomposition, HOSVD, tensor networks, Tensor Train

arXiv.org e-Print Archive

CiteSeerX

Bayesian Robust Tensor Factorization for Incomplete Multiway Data

Author: Amari Shun-ichi
Cichocki Andrzej
Zhang Liqing
Zhao Qibin
Zhou Guoxu
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 16/04/2015
Field of study

We propose a generative model for robust tensor factorization in the presence of both missing data and outliers. The objective is to explicitly infer the underlying low-CP-rank tensor capturing the global information and a sparse tensor capturing the local information (also considered as outliers), thus providing the robust predictive distribution over missing entries. The low-CP-rank tensor is modeled by multilinear interactions between multiple latent factors on which the column sparsity is enforced by a hierarchical prior, while the sparse tensor is modeled by a hierarchical view of Student-

t

distribution that associates an individual hyperparameter with each element independently. For model learning, we develop an efficient closed-form variational inference under a fully Bayesian treatment, which can effectively prevent the overfitting problem and scales linearly with data size. In contrast to existing related works, our method can perform model selection automatically and implicitly without need of tuning parameters. More specifically, it can discover the groundtruth of CP rank and automatically adapt the sparsity inducing priors to various types of outliers. In addition, the tradeoff between the low-rank approximation and the sparse representation can be optimized in the sense of maximum model evidence. The extensive experiments and comparisons with many state-of-the-art algorithms on both synthetic and real-world datasets demonstrate the superiorities of our method from several perspectives.Comment: in IEEE Transactions on Neural Networks and Learning Systems, 201

arXiv.org e-Print Archive

CiteSeerX

Registration and statistical analysis of the tongue shape during speech production

Author: Hewer Alexander
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 01/01/2019
Field of study

This thesis analyzes the human tongue shape during speech production. First, a semi-supervised approach is derived for estimating the tongue shape from volumetric magnetic resonance imaging data of the human vocal tract. Results of this extraction are used to derive parametric tongue models. Next, a framework is presented for registering sparse motion capture data of the tongue by means of such a model. This method allows to generate full three-dimensional animations of the tongue. Finally, a multimodal and statistical text-to-speech system is developed that is able to synthesize audio and synchronized tongue motion from text.Diese Dissertation beschäftigt sich mit der Analyse der menschlichen Zungenform während der Sprachproduktion. Zunächst wird ein semi-überwachtes Verfahren vorgestellt, mit dessen Hilfe sich Zungenformen von volumetrischen Magnetresonanztomographie- Aufnahmen des menschlichen Vokaltrakts schätzen lassen. Die Ergebnisse dieses Extraktionsverfahrens werden genutzt, um ein parametrisches Zungenmodell zu konstruieren. Danach wird eine Methode hergeleitet, die ein solches Modell nutzt, um spärliche Bewegungsaufnahmen der Zunge zu registrieren. Dieser Ansatz erlaubt es, dreidimensionale Animationen der Zunge zu erstellen. Zuletzt wird ein multimodales und statistisches Text-to-Speech-System entwickelt, das in der Lage ist, Audio und die dazu synchrone Zungenbewegung zu synthetisieren.German Research Foundatio

Universaar

Acronym