Channel selection measures for multi-microphone speech recognition

Abstract

Automatic speech recognition in a room with distant microphones is strongly affected by noise and reverberation. In scenarios where the speech signal is captured by several arbitrarily located microphones the degree of distortion differs from one channel to another. In this work we deal with measures extracted from a given distorted signal that either estimate its quality or measure how well it fits the acoustic models of the recognition system. We then apply them to solve the problem of selecting the signal (i.e. the channel) that presumably leads to the lowest recognition error rate. New channel selection techniques are presented, and compared experimentally in reverberant environments with other approaches reported in the literature. Significant improvements in recognition rate are observed for most of the measures. A new measure based on the variance of the speech intensity envelope shows a good trade-off between recognition accuracy, latency and computational cost. Also, the combination of measures allows a further improvement in recognition rate.Peer ReviewedPostprint (published version

Similar works

Full text

thumbnail-image

UPCommons. Portal del coneixement obert de la UPC

redirect
Last time updated on 16/06/2016

Having an issue?

Is data on this page outdated, violates copyrights or anything else? Report the problem now and we will take corresponding actions after reviewing your request.