Subjective intelligibility of speech sounds enhanced by ideal ratio mask
  via crowdsourced remote experiments with effective data screening

Arai, Kenichi; Araki, Shoko; Irino, Toshio; Kinoshita, Keisuke; Nakatani, Tomohiro; Ogawa, Atsunori; Yamamoto, Ayako

Publication date: 30 March 2022

Abstract

It is essential to perform speech intelligibility (SI) experiments with human listeners to evaluate the effectiveness of objective intelligibility measures. Recently, crowdsourced remote testing has become popular for collecting a large amount and variety of data at relatively low cost and in a short time. However, careful data screening is essential for obtaining reliable SI data. We compared the results of laboratory and crowdsourced remote experiments to establish an effective data screening technique. We evaluated the SI of noisy speech sounds enhanced by a single-channel ideal ratio mask (IRM) and multi-channel mask-based beamformers. The results demonstrated that these enhancement methods improved the SI scores. In particular, the IRM-enhanced sounds were much more intelligible than the unprocessed and other enhanced sounds, indicating that IRM enhancement may give the upper limit of speech enhancement performance. Moreover, tone pip tests, in which participants were asked to report the number of audible tone pips, reduced the variability of the crowdsourced remote results so that they became similar to the laboratory results. Tone pip tests could be useful for future crowdsourced experiments because of their simplicity and effectiveness for data screening.

Comment: This paper was submitted to Interspeech 2022 (http://www.interspeech2022.org

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2203.16760

Last updated on 24/04/2022