10 research outputs found

    The listening talker: A review of human and algorithmic context-induced modifications of speech

    Speech output technology is finding widespread application, including in scenarios where intelligibility might be compromised - at least for some listeners - by adverse conditions. Unlike most current algorithms, talkers continually adapt their speech patterns as a response to the immediate context of spoken communication, where the type of interlocutor and the environment are the dominant situational factors influencing speech production. Observations of talker behaviour can motivate the design of more robust speech output algorithms. Starting with a listener-oriented categorisation of possible goals for speech modification, this review article summarises the extensive set of behavioural findings related to human speech modification, identifies which factors appear to be beneficial, and goes on to examine previous computational attempts to improve intelligibility in noise. The review concludes by tabulating 46 speech modifications, many of which have yet to be perceptually or algorithmically evaluated. Consequently, the review provides a roadmap for future work in improving the robustness of speech output.

    Speech produced in noise: Relationship between listening difficulty and acoustic and durational parameters.

    Conversational speech produced in noise can be characterised by increases in intelligibility relative to such speech produced in quiet. Listening difficulty (LD) is a metric that can be used to evaluate speech transmission performance more sensitively than intelligibility scores in situations in which performance is likely to be high. The objectives of the present study were to evaluate the LD of speech produced in different noise and style conditions, to evaluate the spectral and durational speech modifications associated with these conditions, and to determine whether any of the spectral and durational parameters predicted LD. Nineteen subjects were instructed to speak at normal and loud volumes in the presence of background noise at 40.5 dB(A) and babble noise at 61 dB(A). The speech signals were amplitude-normalised, combined with pink noise to obtain a signal-to-noise ratio of -6 dB, and presented to twenty raters who judged their LD. Vowel duration, fundamental frequency and the proportion of the spectral energy in high vs low frequencies increased with the noise level within both styles. LD was lowest when the speech was produced in the presence of high level noise and at a loud volume, indicating improved intelligibility. Spectrum balance was observed to predict LD.

    Speech-Based Distance Cueing in Virtual Auditory Displays
