
Acoustic model adaptation via linear spline interpolation for robust speech recognition

By Michael L. Seltzer, Alex Acero and Kaustubh Kalgaonkar

Abstract

We recently proposed Linear Spline Interpolation (LSI), a new algorithm for adapting acoustic models to noisy environments. In this method, the nonlinear relationship between clean and noisy speech features is modeled using linear spline regression. Linear spline parameters that minimize the error between the predicted noisy features and the actual noisy features are learned from training data. A variance associated with each spline segment captures the uncertainty in the assumed model. In this work, we extend the LSI algorithm in two ways. First, the adaptation scheme is extended to compensate for the presence of linear channel distortion. Second, we show how the noise and channel parameters can be updated during decoding in an unsupervised manner within the LSI framework. Using LSI, we obtain an average relative improvement in word error rate of 10.8% over VTS adaptation on the Aurora 2 task, with improvements of 15-18% at SNRs between 10 and 15 dB.

Index Terms: robust speech recognition, model adaptation
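The linear spline regression step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes fixed knot locations, a hinge-function basis solved by ordinary least squares, and a residual variance computed per segment. The function names `fit_linear_spline` and `predict_linear_spline`, the knot positions, and the softplus-shaped synthetic data in the demo are all hypothetical choices made for illustration.

```python
import numpy as np


def _hinge_basis(u, knots):
    # Columns: intercept, linear term, and one hinge max(0, u - k) per knot.
    # Using hinges keeps the fitted function continuous at every knot.
    cols = [np.ones_like(u), u] + [np.maximum(0.0, u - k) for k in knots]
    return np.column_stack(cols)


def fit_linear_spline(u, v, knots):
    """Least-squares fit of a continuous piecewise-linear map u -> v.

    Returns the basis coefficients and the residual variance of each of the
    len(knots) + 1 segments (NaN for a segment that contains no samples).
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    knots = np.asarray(knots, dtype=float)
    basis = _hinge_basis(u, knots)
    coef, *_ = np.linalg.lstsq(basis, v, rcond=None)
    resid = v - basis @ coef
    # Segment index of each sample: 0 left of the first knot, and so on.
    seg = np.searchsorted(knots, u)
    var = np.array([
        resid[seg == s].var() if np.any(seg == s) else np.nan
        for s in range(len(knots) + 1)
    ])
    return coef, var


def predict_linear_spline(u, coef, knots):
    """Evaluate the fitted spline at the points u."""
    u = np.asarray(u, dtype=float)
    knots = np.asarray(knots, dtype=float)
    return _hinge_basis(u, knots) @ coef


if __name__ == "__main__":
    # Assumed toy data: a smooth nonlinearity plus noise, standing in for the
    # clean-to-noisy feature relationship. Not taken from the paper.
    rng = np.random.default_rng(0)
    u = np.linspace(-10.0, 10.0, 400)
    v = np.log1p(np.exp(u)) + 0.1 * rng.standard_normal(u.size)
    knots = [-4.0, -2.0, 0.0, 2.0, 4.0]
    coef, var = fit_linear_spline(u, v, knots)
    print("coefficients:", coef)
    print("per-segment variance:", var)
```

In this sketch the per-segment variance plays the role the abstract assigns to the variance attached to each spline segment: segments where the piecewise-linear approximation fits the data poorly end up with a larger value.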

Year: 2010
OAI identifier: oai:CiteSeerX.psu:10.1.1.187.8208
Provided by: CiteSeerX