Search CORE

897 research outputs found

Application of Just-Noticeable Difference in Quality as Environment Suitability Test for Crowdsourcing Speech Quality Assessment Task

Author: Möller Sebastian
Naderi Babak
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 11/04/2020
Field of study

Crowdsourcing micro-task platforms facilitate subjective media quality assessment by providing access to a highly scale-able, geographically distributed and demographically diverse pool of crowd workers. Those workers participate in the experiment remotely from their own working environment, using their own hardware. In the case of speech quality assessment, preliminary work showed that environmental noise at the listener's side and the listening device (loudspeaker or headphone) significantly affect perceived quality, and consequently the reliability and validity of subjective ratings. As a consequence, ITU-T Rec. P.808 specifies requirements for the listening environment of crowd workers when assessing speech quality. In this paper, we propose a new Just Noticeable Difference of Quality (JNDQ) test as a remote screening method for assessing the suitability of the work environment for participating in speech quality assessment tasks. In a laboratory experiment, participants performed this JNDQ test with different listening devices in different listening environments, including a silent room according to ITU-T Rec. P.800 and a simulated background noise scenario. Results show a significant impact of the environment and the listening device on the JNDQ threshold. Thus, the combination of listening device and background noise needs to be screened in a crowdsourcing speech quality test. We propose a minimum threshold of our JNDQ test as an easily applicable screening method for this purpose.Comment: This paper has been accepted for publication in the 2020 Twelfth International Conference on Quality of Multimedia Experience (QoMEX

arXiv.org e-Print Archive

A plea for more interactions between psycholinguistics and natural language processing research

Author: Brysbaert Marc
Keuleers Emmanuel
Mandera Pawel
Publication venue
Publication date: 01/01/2014
Field of study

A new development in psycholinguistics is the use of regression analyses on tens of thousands of words, known as the megastudy approach. This development has led to the collection of processing times and subjective ratings (of age of acquisition, concreteness, valence, and arousal) for most of the existing words in English and Dutch. In addition, a crowdsourcing study in the Dutch language has resulted in information about how well 52,000 lemmas are known. This information is likely to be of interest to NLP researchers and computational linguists. At the same time, large-scale measures of word characteristics developed in the latter traditions are likely to be pivotal in bringing the megastudy approach to the next level

Audio Quality Assessment of Vinyl Music Collections Using Self-Supervised Learning

Author: Benetos E
Hines A
ICASSP 2023 - 2023 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP)
Ragano A
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 04/06/2023
Field of study

Subjective intelligibility of speech sounds enhanced by ideal ratio mask via crowdsourced remote experiments with effective data screening

Author: Arai Kenichi
Araki Shoko
Irino Toshio
Kinoshita Keisuke
Nakatani Tomohiro
Ogawa Atsunori
Yamamoto Ayako
Publication venue
Publication date: 30/03/2022
Field of study

It is essential to perform speech intelligibility (SI) experiments with human listeners to evaluate the effectiveness of objective intelligibility measures. Recently crowdsourced remote testing has become popular to collect a massive amount and variety of data with relatively small cost and in short time. However, careful data screening is essential for attaining reliable SI data. We compared the results of laboratory and crowdsourced remote experiments to establish an effective data screening technique. We evaluated the SI of noisy speech sounds enhanced by a single-channel ideal ratio mask (IRM) and multi-channel mask-based beamformers. The results demonstrated that the SI scores were improved by these enhancement methods. In particular, the IRM-enhanced sounds were much better than the unprocessed and other enhanced sounds, indicating IRM enhancement may give the upper limit of speech enhancement performance. Moreover, tone pip tests, for which participants were asked to report the number of audible tone pips, reduced the variability of crowdsourced remote results so that the laboratory results became similar. Tone pip tests could be useful for future crowdsourced experiments because of their simplicity and effectiveness for data screening.Comment: This paper was submitted to Interspeech 2022 (http://www.interspeech2022.org

arXiv.org e-Print Archive