MONOLINGUAL AND CROSSLINGUAL COMPARISON OF TANDEM FEATURES DERIVED FROM ARTICULATORY AND PHONE MLPS

Arthur Kantor; Chris Bartels; Joe Frankel; Karen Livescu; Mathew Magimai-doss; Simon King; Özgür Çetin

Abstract

In recent years, features derived from the posteriors of a multilayer perceptron (MLP), known as tandem features, have proven to be very effective for automatic speech recognition. Most tandem features to date have relied on MLPs trained for phone classification. We recently showed on a relatively small data set that MLPs trained for articulatory feature classification can be equally effective. In this paper, we provide a similar comparison using MLPs trained on a much larger data set: 2000 hours of English conversational telephone speech. We also explore how portable phone- and articulatory feature-based tandem features are in an entirely different language, Mandarin, without any retraining. We find that while the phone-based features perform slightly better in the matched-language condition, they perform significantly better in the cross-language condition. Yet, in the cross-language condition, neither approach is as effective as the tandem features extracted from an MLP trained on a relatively small amount of in-domain data. Beyond feature concatenation, we also explore novel observation modeling schemes that allow for greater flexibility in combining the tandem and standard features at hidden Markov model (HMM) outputs.

Index Terms: Speech recognition, feedforward neural networks, hidden Markov models
