Size matters: An empirical study of neural network training for large vocabulary continuous speech recognition

Ellis, Daniel P. W.; Morgan, Nelson

Authors: Daniel P. W. Ellis, Nelson Morgan
Publication date: 1 January 1999
Publisher: Columbia University Libraries/Information Services
Abstract

We have trained and tested a number of large neural networks for emission probability estimation in large vocabulary continuous speech recognition. The task under test is DARPA Broadcast News. Our goal was to determine the relationship between training time, word error rate, size of the training set, and size of the neural network. In all cases, the network architecture was quite simple: a single large hidden layer, an input window of feature vectors from 9 frames around the current time, and a single output for each of 54 phonetic categories. Thus far, simultaneous increases in the size of the training set and the neural network improve performance; in other words, more data helps, as does the training of more parameters. We continue to be surprised that such a simple system works as well as it does for complex tasks. Given a limitation in training time, however, there appears to be an optimal ratio of training patterns to parameters of around 25:1 in these circumstances. Additionally, doubling the training data and system size appears to provide diminishing returns in error rate reduction for the largest systems.
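
To make the 25:1 figure concrete, the following is a minimal sketch of how the parameter count of such a single-hidden-layer network, and the matching number of training patterns, could be estimated. Only the 9-frame window, the 54 outputs, and the 25:1 ratio come from the abstract; the per-frame feature dimension, the hidden layer size, the inclusion of bias terms, and the function names are assumptions chosen for illustration.

```python
def mlp_param_count(features_per_frame, hidden_units,
                    context_frames=9, outputs=54):
    """Count weights and biases of a single-hidden-layer network.

    context_frames=9 and outputs=54 follow the abstract; counting
    bias terms as parameters is an assumption of this sketch.
    """
    inputs = context_frames * features_per_frame
    input_to_hidden = inputs * hidden_units + hidden_units
    hidden_to_output = hidden_units * outputs + outputs
    return input_to_hidden + hidden_to_output


def patterns_for_ratio(param_count, ratio=25):
    """Training patterns implied by a patterns-to-parameters ratio."""
    return param_count * ratio


if __name__ == "__main__":
    # Hypothetical sizes: 13 features per frame, 1000 hidden units.
    params = mlp_param_count(features_per_frame=13, hidden_units=1000)
    print(params)                      # 172054
    print(patterns_for_ratio(params))  # 4301350
```

Under these assumed sizes, the network has roughly 172 thousand parameters, so a 25:1 ratio would correspond to about 4.3 million training patterns (frames).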
