Application of Just-Noticeable Difference in Quality as Environment Suitability Test for Crowdsourcing Speech Quality Assessment Task
Crowdsourcing micro-task platforms facilitate subjective media quality
assessment by providing access to a highly scalable, geographically
distributed and demographically diverse pool of crowd workers. Those workers
participate in the experiment remotely from their own working environment,
using their own hardware. In the case of speech quality assessment, preliminary
work showed that environmental noise at the listener's side and the listening
device (loudspeaker or headphone) significantly affect perceived quality, and
consequently the reliability and validity of subjective ratings. As a
consequence, ITU-T Rec. P.808 specifies requirements for the listening
environment of crowd workers when assessing speech quality. In this paper, we
propose a new Just Noticeable Difference of Quality (JNDQ) test as a remote
screening method for assessing the suitability of the work environment for
participating in speech quality assessment tasks. In a laboratory experiment,
participants performed this JNDQ test with different listening devices in
different listening environments, including a silent room according to ITU-T
Rec. P.800 and a simulated background noise scenario. Results show a
significant impact of the environment and the listening device on the JNDQ
threshold. Thus, the combination of listening device and background noise needs
to be screened in a crowdsourcing speech quality test. We propose a minimum
threshold of our JNDQ test as an easily applicable screening method for this
purpose.

Comment: This paper has been accepted for publication in the 2020 Twelfth International Conference on Quality of Multimedia Experience (QoMEX).
How reliable are online speech intelligibility studies with known listener cohorts?
Although the use of nontraditional settings for speech perception experiments is growing, there have been few controlled comparisons of online and laboratory modalities in the context of speech intelligibility. The current study compares outcomes from three web-based replications of recent laboratory studies involving distorted, masked, filtered, and enhanced speech, amounting to 40 separate conditions. Rather than relying on unrestricted crowdsourcing, this study made use of participants from the population that would normally volunteer to take part physically in laboratory experiments. In sentence transcription tasks, the web cohort produced intelligibility scores 3–6 percentage points lower than their laboratory counterparts, and test modality interacted with experimental condition. These disparities and interactions largely disappeared after the exclusion of those web listeners who self-reported the use of low-quality headphones, and the remaining listener cohort was also able to replicate key outcomes of each of the three laboratory studies. The laboratory and web modalities produced similar measures of experimental efficiency based on listener variability, response errors, and outlier counts. These findings suggest that the combination of known listener cohorts and moderate headphone quality provides a feasible alternative to traditional laboratory intelligibility studies.

Funding: Basque Government Consolidados programme under Grant No. IT311-1.