Videorealistic facial animation for speech-based interfaces

Author: Pueblo Stephen J. (Stephen Jerell)
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2009

Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Includes bibliographical references (p. 79-81). This thesis explores the use of computer-generated, videorealistic facial animation (avatars) in speech-based interfaces to understand whether the use of such animations enhances the end user's experience. Research in spoken dialog systems is a robust area that has now permeated everyday life, most notably with spoken telephone dialog systems. Over the past decade, research with videorealistic animations, both photorealistic and non-photorealistic, has reached the point where there is little discernible difference between the mouth movements of videorealistic animations and the mouth movements of actual humans. Because the differences between the two are so minute, videorealistic speech animations are an ideal candidate for use in dialog systems. This thesis presents two videorealistic facial animation systems: a web-based system and a real-time system. By Stephen J. Pueblo. M.Eng.

DSpace@MIT

IMAGINE Final Report

Author: Arana C; Dattani I; Pick R; Recio I; Schmidt P
Publication venue: s.n.
Publication date: 01/09/2003

Southampton (e-Prints Soton)

The Microsoft 2017 Conversational Speech Recognition System

Author: Alleva F.; Droppo J.; Huang X.; Stolcke A.; Wu L.; Xiong W.
Publication date: 24/08/2017

We describe the 2017 version of Microsoft's conversational speech recognition system, in which we update our 2016 system with recent developments in neural-network-based acoustic and language modeling to further advance the state of the art on the Switchboard speech recognition task. The system adds a CNN-BLSTM acoustic model to the set of model architectures we combined previously, and includes character-based and dialog session aware LSTM language models in rescoring. For system combination we adopt a two-stage approach, whereby subsets of acoustic models are first combined at the senone/frame level, followed by a word-level voting via confusion networks. We also added a confusion network rescoring step after system combination. The resulting system yields a 5.1% word error rate on the 2000 Switchboard evaluation set.

arXiv.org e-Print Archive

Crossref

Statistical assessment of speech system performance

Author: Moshier Stephen L.

Methods for normalizing the performance test results of speech recognition systems are presented. Technological accomplishments in speech recognition systems, as well as planned research activities, are described.

NASA Technical Reports Server

Vocal Access to a Newspaper Archive: Design Issues and Preliminary Investigation

Author: Crestani Fabio
Publication date: 10/12/1998

This paper presents the design and the current prototype implementation of an interactive vocal Information Retrieval system that can be used to access articles in a large newspaper archive using a telephone. The results of a preliminary investigation into the feasibility of such a system are also presented.

arXiv.org e-Print Archive

CiteSeerX

Crossref

University of Strathclyde Institutional Repository

The Production of Speech Corpora

Author: Baumann Angela; Draxler Christoph; Ellbogen Tania; Schiel Florian; Steffen Alexander
Publication date: 21/03/2012

Open Access LMU