4 research outputs found

    A study on the relationship between the intelligibility and quality of algorithmically-modified speech for normal hearing listeners

    Get PDF
    This study investigates the relationship between the intelligibility and quality of modified speech in noise and in quiet. Speech signals were processed by seven algorithms designed to increase speech intelligibility in noise without altering speech intensity. In three noise maskers, including both stationary and fluctuating noise at two signal-to-noise ratios (SNR), listeners identified keywords from unmodified or modified sentences. The intelligibility performance of each type of speech was measured as the listeners’ word recognition rate in each condition, while the quality was rated as a mean opinion score. In quiet, only the perceptual quality of each type of speech was assessed. The results suggest that when listening in noise, modification performance on improving intelligibility is more important than its potential negative impact on speech quality. However, when listening in quiet or at SNRs in which intelligibility is no longer an issue to listeners, the impact to speech quality due to modification becomes a concern

    The listening talker: A review of human and algorithmic context-induced modifications of speech

    Get PDF
    International audienceSpeech output technology is finding widespread application, including in scenarios where intelligibility might be compromised - at least for some listeners - by adverse conditions. Unlike most current algorithms, talkers continually adapt their speech patterns as a response to the immediate context of spoken communication, where the type of interlocutor and the environment are the dominant situational factors influencing speech production. Observations of talker behaviour can motivate the design of more robust speech output algorithms. Starting with a listener-oriented categorisation of possible goals for speech modification, this review article summarises the extensive set of behavioural findings related to human speech modification, identifies which factors appear to be beneficial, and goes on to examine previous computational attempts to improve intelligibility in noise. The review concludes by tabulating 46 speech modifications, many of which have yet to be perceptually or algorithmically evaluated. Consequently, the review provides a roadmap for future work in improving the robustness of speech output

    Information-preserving temporal reallocation of speech in the presence of fluctuating maskers

    No full text
    How can speech be retimed so as to maximise its intelligibility in the face of competing speech? We present a general strategy which modifies local speech rate to minimise overlap with a known fluctuating masker. Continuous time-scale factors are derived in an optimisation procedure which seeks to minimise overall energetic masking of the speech by the masker while additionally unmasking those speech regions potentially most important for speech recognition. Intelligibility increases are evaluated with both objective and subjective measures and show significant gains over an unmodified baseline, with larger benefits at lower signal-to-noise ratios. The retiming approach does not lead to benefits for speech mixed with stationary maskers, suggesting that the gains observed for the fluctuating masker are not simply due to durational expansion. Index Terms: speech intelligibility, temporal modification, energetic and informal maskin
    corecore