A Novel Approach to Unsupervised Grapheme-to-Phoneme Conversion

Jerome Bellegarda

Abstract

Automatic, data-driven grapheme-to-phoneme conversion is a challenging but often necessary task. The top-down strategy implicitly adopted by traditional inductive learning techniques tends to dismiss relevant contexts when they have been seen too infrequently in the training data. This paper instead proposes a bottom-up approach which, by design, exhibits better generalization properties. For each out-of-vocabulary word, a neighborhood of locally relevant pronunciations is constructed through latent semantic analysis of the appropriate graphemic form. Phoneme transcription then proceeds via locally optimal sequence alignment and maximum likelihood position scoring. This method was successfully applied to the speech synthesis of proper names of widely diverse origins.
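
The neighborhood step mentioned in the abstract can be illustrated with a rough sketch. The code below is not the paper's method: it swaps latent semantic analysis for plain cosine similarity over letter trigram counts, with no dimensionality reduction, simply to show what "finding orthographically close words and collecting their pronunciations" might look like. The function names, the toy lexicon, and its phoneme strings are all hypothetical.

```python
from collections import Counter
from math import sqrt


def letter_ngrams(word, n=3):
    """Count letter n-grams of a word padded with '#' boundary markers."""
    padded = "#" + word.lower() + "#"
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def neighborhood(word, lexicon, k=2):
    """Return the k lexicon entries closest to `word`, with pronunciations."""
    target = letter_ngrams(word)
    ranked = sorted(
        lexicon,
        key=lambda w: cosine(target, letter_ngrams(w)),
        reverse=True,
    )
    return [(w, lexicon[w]) for w in ranked[:k]]


# Hypothetical toy lexicon; the phoneme strings are illustrative only.
LEXICON = {
    "kruger": "k r uw g er",
    "krueger": "k r uw g er",
    "smith": "s m ih th",
    "jones": "jh ow n z",
}

# Pronunciations of the two in-vocabulary words closest to an unseen spelling.
neighbors = neighborhood("kruge", LEXICON, k=2)
```

In the approach the abstract describes, the pronunciations gathered this way would then be aligned and scored position by position to assemble a transcription; that stage is not attempted here.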

Available Versions

CiteSeerX

oai:CiteSeerX.psu:10.1.1.6.160...

Last time updated on 22/10/2014