Reconstructing intelligible audio speech from visual speech features

Le Cornu, Thomas; Milner, Ben

Authors: Thomas Le Cornu, Ben Milner
Publication date: 1 January 2015
Abstract

This work describes an investigation into the feasibility of producing intelligible audio speech from only visual speech features. The proposed method aims to estimate a spectral envelope from visual features which is then combined with an artificial excitation signal and used within a model of speech production to reconstruct an audio signal. Different combinations of audio and visual features are considered, along with both a statistical method of estimation and a deep neural network. The intelligibility of the reconstructed audio speech is measured by human listeners, and then compared to the intelligibility of the video signal only and when combined with the reconstructed audio.
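
The reconstruction step described in the abstract (a spectral envelope combined with an artificial excitation inside a speech production model) can be illustrated with a generic source-filter sketch. The abstract does not say which envelope representation, excitation, or production model the authors use, so everything below is an assumption made for illustration: the envelope is represented as an all-pole filter, the excitation is a simple impulse train, and the function names (`impulse_train`, `resonator_coeffs`, `all_pole_filter`) are hypothetical. A fixed two-pole resonator stands in for the envelope that the paper estimates from visual features.

```python
import math


def impulse_train(n_samples, sample_rate, f0):
    """Artificial voiced excitation: one unit impulse per pitch period (assumed form)."""
    period = int(round(sample_rate / f0))
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]


def resonator_coeffs(freq_hz, bandwidth_hz, sample_rate):
    """Denominator coefficients [a1, a2] of a two-pole resonator.

    Placeholder for an envelope that would, in the paper's setting,
    be estimated from visual features.
    """
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius, < 1 so stable
    theta = 2.0 * math.pi * freq_hz / sample_rate        # pole angle
    return [-2.0 * r * math.cos(theta), r * r]


def all_pole_filter(excitation, a):
    """Filter the excitation through 1 / A(z), A(z) = 1 + a[0] z^-1 + a[1] z^-2 + ...

    Difference equation: y[n] = x[n] - sum_k a[k] * y[n - 1 - k].
    """
    y = []
    for n, x in enumerate(excitation):
        acc = x
        for k, ak in enumerate(a):
            if n - 1 - k >= 0:
                acc -= ak * y[n - 1 - k]
        y.append(acc)
    return y


sample_rate = 8000                                       # Hz, assumed
excitation = impulse_train(400, sample_rate, f0=100.0)   # 50 ms at 100 Hz pitch
envelope = resonator_coeffs(500.0, 100.0, sample_rate)   # single resonance near 500 Hz
audio = all_pole_filter(excitation, envelope)
```

In a full system the envelope coefficients would change frame by frame as new estimates arrive, rather than staying fixed as they do in this sketch.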

Available Versions

University of East Anglia digital repository

oai:ueaeprints.uea.ac.uk:56718

Last time updated on 28/06/2016