Continual Learning, also known as Lifelong Learning, aims to learn
continually from new data as it becomes available. While prior research on
continual learning in automatic speech recognition has focused on adapting
models across multiple distinct speech recognition tasks, in this paper we
propose an experimental setting for \textit{online continual learning} for
automatic speech recognition on a single task. Focusing specifically on the
case where additional training data for the same task becomes available
incrementally over time, we demonstrate the effectiveness of performing
incremental model updates to end-to-end speech recognition models with an
online Gradient Episodic Memory (GEM) method. Moreover, we show that with
online continual learning and a selective sampling strategy, we can maintain
accuracy comparable to retraining a model from scratch at a significantly
lower computational cost. We also verify our method with self-supervised
learning (SSL) features.

Comment: Accepted at InterSpeech 202
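For context, the GEM update referenced above can be sketched as follows. This is the standard formulation from the continual-learning literature, not a detail given in this abstract: given the gradient $g$ on the current batch and gradients $g_k$ computed on examples stored in an episodic memory, GEM replaces $g$ with the closest gradient $\tilde{g}$ whose inner product with every memory gradient is non-negative, so that the update does not increase the loss on remembered data:

\begin{equation}
\min_{\tilde{g}} \; \frac{1}{2}\,\|g - \tilde{g}\|_2^2
\quad \text{s.t.} \quad
\langle \tilde{g}, g_k \rangle \ge 0 \;\; \forall k .
\end{equation}

When all constraints are already satisfied, $\tilde{g} = g$ and the step reduces to ordinary SGD; otherwise the projection is solved as a small quadratic program over the memory gradients.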