177 research outputs found

    Speech intelligibility in multilingual spaces

    This thesis examines speech intelligibility and multilingual communication in terms of acoustic and perceptual factors. More specifically, the work focused on the impact of room acoustic conditions on the speech intelligibility of four languages representative of a wide range of linguistic properties (English, Polish, Arabic and Mandarin). Firstly, diagnostic rhyme tests (DRT), phonemically balanced (PB) word lists and phonemically balanced sentence lists were compared under four room acoustic conditions defined by their speech transmission index (STI = 0.2, 0.4, 0.6 and 0.8). The results indicated a statistically significant difference between the word intelligibility scores of the languages under all room acoustic conditions apart from the STI = 0.8 condition. English was the most intelligible language under all conditions, and differences with other languages were larger when conditions were poor (maximum difference of 29% at STI = 0.2, 33% at STI = 0.4 and 14% at STI = 0.6). Results also showed that Arabic and Polish were particularly sensitive to background noise, and that Mandarin was significantly more intelligible than those languages at STI = 0.4. Consonant-to-vowel ratios and the languages’ distinctive features and acoustical properties explained some of the scores obtained. Sentence intelligibility scores confirmed variations between languages, but these variations were statistically significant only at the STI = 0.4 condition (sentence tests being less sensitive to very good and very poor room acoustic conditions). Additionally, perceived speech intelligibility and the soundscape perception associated with these languages were analysed in three multilingual environments: an airport check-in area, a hospital reception area, and a café.
Semantic differential analysis showed that the perceived speech intelligibility of each language varies with the type of environment, as well as with the type of background noise, the reverberation time, and the signal-to-noise ratio. Variations between the perceived speech intelligibility of the four languages were only marginally significant (p = 0.051), unlike the objective intelligibility results. The perceived speech intelligibility of English appeared to be negatively affected mainly by the information content and distracting sounds present in the background noise. Lastly, the study investigated several standards and design guidelines and showed how adjustments could be made to recommended STI values in order to achieve consistent speech intelligibility ratings across languages.

    EVALUATION OF THE SIGNAL-TO-NOISE RATIO REQUIRED TO ACHIEVE THE SAME PERFORMANCE IN ENGLISH AND MANDARIN CHINESE

    Difficulty communicating in noise is a common complaint for people with hearing loss. When communicating in noise, speakers increase the intensity of their voice and alter the stress patterns of their speech, both to monitor their own voice and to be heard by others. Speech that increases in intensity for the purpose of self-monitoring and being understood in noise is called Lombard speech. Few studies have assessed communication performance with Lombard speech in noise, which closely reflects real-life communication situations. In addition, the characteristics of Lombard speech may differ among languages with different characteristics and identifying features, so the few results available for English listeners may not apply to listeners of other languages. This study evaluated the performance of English-speaking and Mandarin Chinese-speaking individuals listening to English and Mandarin Chinese speech in corresponding babble noise. Speech materials were the IEEE sentences in English and their Mandarin Chinese translations, controlled for phonological, grammatical, and contextual predictability. The sentences and 4-talker babble were recorded in a conversational manner and at a Lombard speech level produced while listening to 80 dB SPL of noise. The performance of 18 native English speakers and 18 native Mandarin Chinese speakers was evaluated. The SNR-50, the signal-to-noise ratio required to produce 50% performance, was the same for conversational and Lombard English, indicating that there is no particular benefit in producing Lombard speech to be understood. The reason to produce Lombard speech in English is to improve the signal-to-noise ratio in order to facilitate improved communication. The results for the Mandarin Chinese listeners revealed a benefit when producing Lombard speech, with the SNR-50 for Mandarin Chinese significantly different between conversational and Lombard speech.
In noisy situations where increasing vocal intensity is expected, Mandarin Chinese listeners appear to benefit from features preserved or enhanced through Lombard speech that English listeners do not access.
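As a rough illustration of the SNR-50 measure used above (not the study's analysis code), the 50%-correct point can be estimated by interpolating a listener's proportion-correct scores across the tested SNRs. The `snr_50` helper and all score values below are hypothetical, and linear interpolation stands in for a fitted psychometric function.

```python
# Illustrative sketch: estimating SNR-50 by linear interpolation between the
# two tested SNRs that bracket 50% correct. All data below are made up.

def snr_50(snrs, scores):
    """Interpolate the SNR at which the psychometric function crosses 50%.

    snrs   -- tested signal-to-noise ratios in dB, ascending
    scores -- proportion of keywords correct at each SNR (0.0-1.0)
    """
    for (s0, p0), (s1, p1) in zip(zip(snrs, scores), zip(snrs[1:], scores[1:])):
        if p0 <= 0.5 <= p1:  # the 50% point lies between these two SNRs
            return s0 + (0.5 - p0) * (s1 - s0) / (p1 - p0)
    raise ValueError("scores never cross 50% within the tested range")

# Hypothetical listener data: conversational vs Lombard speech
conversational = snr_50([-9, -6, -3, 0, 3], [0.10, 0.30, 0.55, 0.80, 0.95])
lombard        = snr_50([-9, -6, -3, 0, 3], [0.20, 0.45, 0.70, 0.90, 0.98])
benefit = conversational - lombard  # positive => Lombard speech helps
```

A lower SNR-50 for Lombard speech than for conversational speech (a positive `benefit` in dB) is the kind of effect the study reports for Mandarin Chinese listeners but not for English listeners.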

    Development of Bisyllabic Speech Audiometry Word Lists for Adult Malay Speakers

    Standardised speech audiometry material is essential in assessing hearing for speech; however, material in the Malay language, particularly for the speech reception threshold test, is limited and not thoroughly validated. This thesis examines the development of standardised, phonemically balanced bisyllabic Malay speech reception threshold (SRT) test word lists for Malay-speaking adults. The effect of having a mixture of familiar and nonsense words on speech recognition is also explored. The development of the word lists involved selecting and compiling the words using a content analysis research method, testing for homogeneity and consistency and validating the acoustic content (both using correlational research methods), and assessing clinical validity using a concurrent validity method. The familiar words were selected from a corpus of familiar words extracted from daily newspapers, while the nonsense words were formed based on the linguistic properties of Malay. The preliminary set consisted of fifteen lists with 10 familiar words and 5 nonsense words in each. A Friedman test showed no statistically significant difference in correct scores across the word lists (Χ² = 19.584, p > 0.05), indicating consistent speech discrimination. A homogeneity test across all lists using Cronbach’s alpha gave a value of 0.78, indicating strong agreement and good homogeneity among the lists. When five lists with inter-item correlation ≤0.8 were excluded from the homogeneity analysis, the alpha value for the remaining 10 lists increased to 0.88. Consistency analysis of the acoustic content using repeated measures ANOVA showed no significant difference between the lists and the long-term average speech spectrum (LTASS) (F = 1.229, p > 0.05). All 15 lists were then tested for clinical validity.
Two versions of list content were assessed: an all-words version (AWL) containing all 15 words in each list, and a meaningful-words-only version (MWL) containing the 10 meaningful words in each list. Correlation analyses between the half peak level (HPL) of the speech recognition curve and pure tone (PT) thresholds showed that, considering both normal-hearing and hearing-impaired listeners, the HPL correlated best with the PT average of 250, 500, 1000, 2000 and 4000 Hz for both AWL (r = 0.67 to 0.95) and MWL (r = 0.65 to 0.95). A comparison between the HPL and the PT average of 250, 500, 1000, 2000 and 4000 Hz showed mean differences of 4 dB (SD = 3) and 3 dB (SD = 4), with ranges of tolerance (95% confidence) of ±7 dB and ±8 dB for AWL and MWL respectively. Sensitivity, specificity, and positive and negative predictive values, at a tolerance level of ±10 dB, were mostly >0.90 for both normal-hearing and hearing-impaired listeners using either version. It was concluded that the addition of nonsense words does not significantly affect the SRT. The correlation between the SRT obtained using the bisyllabic Malay word lists and the PT thresholds suggested that the word lists are robust enough to be used in assessing speech hearing clinically. In conclusion, the current study developed and produced standardised, phonemically balanced bisyllabic Malay speech audiometry (BMSA) word lists for assessing speech reception threshold and discrimination in adult Malay speakers.
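The homogeneity analysis above rests on Cronbach's alpha. As a minimal sketch (not the thesis code), alpha can be computed from a lists-by-listeners score matrix; the `cronbach_alpha` helper and the scores below are invented for illustration.

```python
# Illustrative sketch: Cronbach's alpha across word lists, treating each list
# as an "item" and each listener's score on it as an observation.

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)  # sample variance

def cronbach_alpha(scores_by_list):
    """scores_by_list: one row per word list, one column per listener."""
    k = len(scores_by_list)
    item_vars = sum(variance(row) for row in scores_by_list)
    totals = [sum(col) for col in zip(*scores_by_list)]  # per-listener totals
    return k / (k - 1) * (1 - item_vars / variance(totals))

# Hypothetical percent-correct scores: 3 lists x 5 listeners
scores = [
    [88, 92, 75, 80, 95],
    [85, 90, 78, 82, 93],
    [90, 94, 72, 79, 96],
]
alpha = cronbach_alpha(scores)  # values near 1 indicate homogeneous lists
```

Dropping a poorly correlating list and recomputing alpha on the remaining rows mirrors the step in the study where excluding five lists raised alpha from 0.78 to 0.88.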

    A Sound Approach to Language Matters: In Honor of Ocke-Schwen Bohn

    The contributions in this Festschrift were written by Ocke’s current and former PhD students, colleagues and research collaborators. The Festschrift is divided into six sections, moving from the smallest building blocks of language through gradually expanding objects of linguistic inquiry to the highest levels of description - all of which have formed a part of Ocke’s career, in connection with his teaching and/or his academic productions: “Segments”, “Perception of Accent”, “Between Sounds and Graphemes”, “Prosody”, “Morphology and Syntax” and “Second Language Acquisition”. Each one of these illustrates a sound approach to language matters.

    Tone production ability in Cantonese-speaking hearing-impaired children with cochlear implants or hearing aids

    A dissertation submitted in partial fulfilment of the requirements for the Bachelor of Science (Speech and Hearing Sciences), The University of Hong Kong, June 30, 2007. Thesis (B.Sc.), University of Hong Kong, 2007. Also available in print.

    Improving hearing ability in challenging conditions

    Although speech recognition using hearing aids and cochlear implants has improved significantly in recent years, most people with hearing impairment still have difficulty understanding speech in noisy environments. Improving the ability of the brain to learn how to make full use of prosthetic devices is as important as developments in the technology. Auditory perceptual training helps people to be more sensitive to target sounds. Therefore, auditory training programmes have the potential to optimise the performance of hearing-impaired users and help them get more benefit from their prosthetic devices. A better understanding of how and when auditory perceptual training generalises in normal-hearing people may help in devising better training for people with hearing impairment. However, researchers in the literature have mainly focused on changing the target stimuli, using amplitude-modulated sounds or speech stimuli. Fewer researchers have explored the auditory learning and generalisation effects of changing the background noise. It is not clear whether training generalises to other types of noise, in particular real-world environmental noises. A novel element of this study is that it focuses on training people to pick up the target stimuli by changing the background noise. This project was divided into four stages. The first stage of this work looked at basic detection thresholds for amplitude modulation (AM) in sound stimuli, and found that AM-detection training did not generalise to AM-rate discrimination, regardless of the modulation depth. For the second stage, two nonsense-syllable (vowel-consonant-vowel, VCV) training studies were carried out to explore auditory perceptual learning patterns with nonsense syllables across fixed and random background noise.
This stage was motivated by visual research showing that people can improve their detection performance by learning to ignore constant visual noise, and that this skill transfers to new, random visual noise. Results showed that learning with random noise produced better identification performance than with fixed noise. There was no generalisation from fixed-noise training to random-noise environments. These results were in contrast to the visual learning studies. Following the second stage, a short single-session VCV study was conducted to investigate whether nonsense-syllable adaptation to fixed noise was different to random noise. Results showed that listeners’ VCV identification was similar for fixed and random babble noise conditions. This was different from stage two, which showed better nonsense recognition with random-noise training than with fixed-noise training. It is suggested that test method differences (multi-session vs single-session) led to the performance differences between fixed and random noise conditions. The final stage of this work explored whether any learning effect from training with speech in random babble noise generalised to other environmental noises, such as car and rain noise. Results demonstrated that speech-in-babble-noise training generalised to car and rain noise conditions, and that part of the learning effect from speech in babble noise was sustained after several weeks. This project investigated the auditory perceptual learning performance of normal-hearing people using AM stimuli, nonsense speech, and speech with various types of background noise (babble, car, rain). The learning outcomes provide important evidence for the use of background noises (fixed noise, random noise, and real-world environmental noises) in auditory perceptual training programmes, which can help to build clinical guidelines for training people with hearing impairment to improve their hearing in challenging conditions.

    The listening talker: A review of human and algorithmic context-induced modifications of speech

    Speech output technology is finding widespread application, including in scenarios where intelligibility might be compromised - at least for some listeners - by adverse conditions. Unlike most current algorithms, talkers continually adapt their speech patterns in response to the immediate context of spoken communication, where the type of interlocutor and the environment are the dominant situational factors influencing speech production. Observations of talker behaviour can motivate the design of more robust speech output algorithms. Starting with a listener-oriented categorisation of possible goals for speech modification, this review article summarises the extensive set of behavioural findings related to human speech modification, identifies which factors appear to be beneficial, and goes on to examine previous computational attempts to improve intelligibility in noise. The review concludes by tabulating 46 speech modifications, many of which have yet to be perceptually or algorithmically evaluated. Consequently, the review provides a roadmap for future work on improving the robustness of speech output.

    Designs of Speech Audiometric Tests in Vietnamese – The Issues of Normative Values, Dialectal Effects, and Tonal Patterns

    Dialectal variations and linguistic factors are considered to be primary causes of misdiagnosis during audiological assessments of speech performance. For new speech audiometry materials, evaluating the effects of the listener’s dialect and of linguistic factors on speech recognition thresholds (SRTs) and supra-threshold phoneme recognition scores (PRSs) is expected to yield a valid and reliable audiometric measurement for clients. This thesis assessed the SRTs of native and non-native listeners of Southern Vietnamese with regard to dialectal effects; the effect of the tonal patterns of syllables on the speech perception of older adults; and the correlations between SRTs and duo-tone thresholds, and between SRTs and PRSs. To attain these objectives, two types of speech audiometry materials were designed: the Adaptive Auditory Speech Test (AAST) and NAMES (a nonsense syllable test). Data for AAST were collected from 435 normal-hearing listeners aged between four and 85 years, while data for NAMES were gathered from 186 normal-hearing listeners aged between 15 and 85 years. The findings showed that AAST and NAMES are valid speech audiometric tests for quantifying the speech recognition of listeners aged between four and 85 (AAST) and between 15 and 85 (NAMES). The age-related normative values of AAST in Vietnamese are similar to those in German, Ghanaian, and Polish. The findings of the dialectal study indicate that dialectal variation has an impact on speech recognition; however, the extent of the effect depends on the speech materials used for the measurement: dialectal differences had larger effects in open speech tests with meaningful words than in closed speech tests. The findings on tonal pattern effects suggest that the tonal patterns of syllables have a minor influence on the speech perception of older adults, especially those above 75. Finally, the SRTs could be predicted using duo-tone thresholds.
In contrast, the PRSs could not be predicted from either speech thresholds or duo-tone thresholds based on the correlations. The two new speech audiometric tests provide reliable outcomes with the same properties in normal-hearing listeners as the AAST and nonsense syllable tests in other languages. These two speech audiometric tests complement each other in evaluating hearing loss or language impairment. It is anticipated that these speech tests will serve as an effective clinical tool for speech audiometric testing in Vietnam.
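The claim that SRTs can be predicted from duo-tone thresholds amounts to a simple regression. The sketch below is not the thesis analysis; all threshold values are made up. It fits a least-squares line and reports the Pearson correlation that would underpin such a prediction.

```python
# Illustrative sketch: least-squares regression predicting SRT from a
# duo-tone average threshold, with Pearson r for the strength of the link.

def linear_fit(x, y):
    """Return slope, intercept of the least-squares line y = a*x + b."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    a = sxy / sxx
    return a, my - a * mx

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical data: duo-tone averages (dB HL) and measured SRTs (dB HL)
duo_tone = [5, 10, 15, 25, 40, 55]
srt      = [8, 12, 18, 27, 43, 58]
slope, intercept = linear_fit(duo_tone, srt)
predicted_srt = slope * 30 + intercept  # predicted SRT for a 30 dB HL average
```

A high correlation, as in this toy data, would justify predicting SRT from duo-tone thresholds; the thesis reports that no comparable relationship held for the PRSs.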

    Non-native listeners' recognition of high-variability speech using PRESTO

    BACKGROUND: Natural variability in speech is a significant challenge to robust spoken word recognition. In everyday listening environments, listeners must quickly adapt and adjust to multiple sources of variability in both the signal and the listening environment. High-variability speech may be particularly difficult to understand for non-native listeners, who have less experience with the second language (L2) phonological system and less detailed knowledge of the sociolinguistic variation of the L2. PURPOSE: The purpose of this study was to investigate the effects of high-variability sentences on non-native speech recognition and to explore the underlying sources of individual differences in the speech recognition abilities of non-native listeners. RESEARCH DESIGN: Participants completed two sentence recognition tasks involving high-variability and low-variability sentences. They also completed a battery of behavioral tasks and self-report questionnaires designed to assess their indexical processing skills, vocabulary knowledge, and several core neurocognitive abilities. STUDY SAMPLE: Native speakers of Mandarin (n = 25) living in the United States, recruited from the Indiana University community, participated in the current study. A native comparison group consisted of scores obtained from native speakers of English (n = 21) in the Indiana University community, taken from an earlier study. DATA COLLECTION AND ANALYSIS: Speech recognition in high-variability listening conditions was assessed with a sentence recognition task using sentences from PRESTO (Perceptually Robust English Sentence Test Open-Set) mixed in 6-talker multitalker babble. Speech recognition in low-variability listening conditions was assessed using sentences from HINT (Hearing In Noise Test) mixed in 6-talker multitalker babble. Indexical processing skills were measured using a talker discrimination task, a gender discrimination task, and a forced-choice regional dialect categorization task.
Vocabulary knowledge was assessed with the WordFam word familiarity test, and executive functioning was assessed with the BRIEF-A (Behavioral Rating Inventory of Executive Function - Adult Version) self-report questionnaire. Scores from the non-native listeners on the behavioral tasks and self-report questionnaires were compared with scores obtained from native listeners tested in a previous study and were examined for individual differences. RESULTS: Non-native keyword recognition scores were significantly lower on PRESTO sentences than on HINT sentences. Non-native listeners' keyword recognition scores were also lower than native listeners' scores on both sentence recognition tasks. Differences in performance on the sentence recognition tasks between non-native and native listeners were larger on PRESTO than on HINT, although group differences varied by signal-to-noise ratio. The non-native and native groups also differed in the ability to categorize talkers by region of origin and in vocabulary knowledge. Individual non-native word recognition accuracy on PRESTO sentences in multitalker babble at more favorable signal-to-noise ratios was found to be related to several BRIEF-A subscales and composite scores. However, non-native performance on PRESTO was not related to regional dialect categorization, talker and gender discrimination, or vocabulary knowledge. CONCLUSIONS: High-variability sentences in multitalker babble were particularly challenging for non-native listeners. Difficulty under high-variability testing conditions was related to a lack of experience with the L2, especially L2 sociolinguistic information, compared with native listeners. Individual differences among the non-native listeners were related to weaknesses in core neurocognitive abilities affecting behavioral control in everyday life.