Automatic speech recognition systems based on deep learning are mainly
trained under empirical risk minimization (ERM). Since ERM utilizes the
averaged performance on the data samples regardless of a group such as healthy
or dysarthric speakers, ASR systems are unaware of the performance disparities
across the groups. This results in biased ASR systems whose performance
differences among groups are severe. In this study, we aim to improve the ASR
system in terms of group robustness for dysarthric speakers. To achieve our
goal, we present a novel approach, sample reweighting with sample affinity test
(Re-SAT). Re-SAT systematically measures the debiasing helpfulness of the given
data sample and then mitigates the bias by debiasing helpfulness-based sample
reweighting. Experimental results demonstrate that Re-SAT contributes to
improved ASR performance on dysarthric speech without performance degradation
on healthy speech.Comment: Accepted by Interspeech 202