Small-footprint highway deep neural networks for speech recognition
State-of-the-art speech recognition systems typically employ neural network
acoustic models. However, compared to Gaussian mixture models, deep neural
network (DNN) based acoustic models often have many more model parameters,
making it challenging for them to be deployed on resource-constrained
platforms, such as mobile devices. In this paper, we study the application of
the recently proposed highway deep neural network (HDNN) for training
small-footprint acoustic models. HDNNs are depth-gated feedforward neural
networks that include two types of gate functions to facilitate the
information flow through different layers. Our study demonstrates that HDNNs
are more compact than regular DNNs for acoustic modeling, i.e., they can
achieve comparable recognition accuracy with many fewer model parameters.
Furthermore, HDNNs are more controllable than DNNs: the gate functions of an
HDNN can control the behavior of the whole network using a very small number of
model parameters. Finally, we show that HDNNs are more adaptable than DNNs. For
example, simply updating the gate functions using adaptation data can result in
considerable gains in accuracy. We demonstrate these aspects by experiments
using the publicly available AMI corpus, which has around 80 hours of training
data.
Comment: 9 pages, 6 figures. Accepted to IEEE/ACM Transactions on Audio,
Speech and Language Processing, 2017. arXiv admin note: text overlap with
arXiv:1608.00892, arXiv:1607.0196
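The gating mechanism summarized in the abstract, with two gate functions regulating information flow through a layer, can be sketched in a few lines. This is a minimal NumPy illustration, not the paper's exact configuration: the layer width, initialization, and the tanh/sigmoid choices are assumptions for demonstration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def highway_layer(x, W_h, b_h, W_t, b_t, W_c, b_c):
    """One highway layer with two gate functions: a transform gate T
    scales the hidden activation H(x), and a carry gate C scales the
    untransformed input, letting information skip the nonlinearity."""
    h = np.tanh(x @ W_h + b_h)     # candidate hidden activation H(x)
    t = sigmoid(x @ W_t + b_t)     # transform gate T(x) in (0, 1)
    c = sigmoid(x @ W_c + b_c)     # carry gate C(x) in (0, 1)
    return t * h + c * x

# toy usage on a single 4-dimensional input vector
rng = np.random.default_rng(0)
d = 4
x = rng.standard_normal(d)
W = lambda: rng.standard_normal((d, d))
y = highway_layer(x, W(), np.zeros(d), W(), np.zeros(d), W(), np.zeros(d))
```

Because the gates have only small weight matrices of their own, adapting or controlling just `W_t`, `b_t`, `W_c`, `b_c` touches a tiny fraction of the network's parameters, which is the property the abstract exploits for adaptation.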
Small-footprint Deep Neural Networks with Highway Connections for Speech Recognition
For speech recognition, deep neural networks (DNNs) have significantly
improved recognition accuracy on most benchmark datasets and application
domains. However, compared to conventional Gaussian mixture models,
DNN-based acoustic models usually have a much larger number of model
parameters, making them challenging to deploy on resource-constrained
platforms, e.g., mobile devices. In this paper, we study the application of the recently
proposed highway network to train small-footprint DNNs, which are {\it thinner}
and {\it deeper} and have a significantly smaller number of model parameters
than conventional DNNs. We investigated this approach on the AMI meeting
speech transcription corpus, which has around 70 hours of audio data. The
highway neural networks consistently outperformed their plain DNN counterparts,
and the number of model parameters could be reduced significantly without
sacrificing recognition accuracy.
Comment: 5 pages, 3 figures, fixed typo, accepted by Interspeech 201
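The "thinner and deeper" trade-off comes down to parameter counting: a fully connected layer of width d costs roughly d² weights, so halving the width quarters the per-layer cost even when more layers are stacked. The arithmetic can be sketched as follows; the layer dimensions here are hypothetical round numbers, not the configurations evaluated in the paper.

```python
def dnn_params(input_dim, hidden_dim, num_layers, output_dim):
    """Rough parameter count (weights + biases) for a plain fully
    connected DNN with num_layers hidden layers, ignoring any gating."""
    n = input_dim * hidden_dim + hidden_dim                # input layer
    n += (num_layers - 1) * (hidden_dim**2 + hidden_dim)   # hidden layers
    n += hidden_dim * output_dim + output_dim              # output layer
    return n

# hypothetical sizes: a wide 6-layer x 2048-unit DNN versus a
# thinner, deeper 10-layer x 512-unit network
wide = dnn_params(360, 2048, 6, 4000)   # ~30M parameters
thin = dnn_params(360, 512, 10, 4000)   # ~4.6M parameters
```

Even with four extra layers, the thin network here has over six times fewer parameters than the wide one, since the quadratic hidden-to-hidden term dominates; the highway gates make such thin, deep stacks trainable.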
- …