2 research outputs found
Transfer Learning for Speech Recognition on a Budget
End-to-end training of automated speech recognition (ASR) systems requires
massive data and compute resources. We explore transfer learning based on model
adaptation as an approach for training ASR models under constrained GPU memory,
throughput and training data. We conduct several systematic experiments
adapting a Wav2Letter convolutional neural network originally trained for
English ASR to the German language. We show that this technique allows faster
training on consumer-grade resources while requiring less training data in
order to achieve the same accuracy, thereby lowering the cost of training ASR
models in other languages. Model introspection revealed that small adaptations
to the network's weights were sufficient for good performance, especially for
inner layers.Comment: Accepted for 2nd ACL Workshop on Representation Learning for NL