177 research outputs found

    Speech intelligibility in multilingual spaces

    This thesis examines speech intelligibility and multilingual communication in terms of acoustic and perceptual factors. More specifically, the work focused on the impact of room acoustic conditions on the speech intelligibility of four languages representative of a wide range of linguistic properties (English, Polish, Arabic and Mandarin). Firstly, diagnostic rhyme tests (DRT), phonemically balanced (PB) word lists and phonemically balanced sentence lists were compared under four room acoustic conditions defined by their speech transmission index (STI = 0.2, 0.4, 0.6 and 0.8). The results indicated a statistically significant difference between the word intelligibility scores of the languages under all room acoustic conditions apart from the STI = 0.8 condition. English was the most intelligible language under all conditions, and differences with other languages were larger when conditions were poor (maximum difference of 29% at STI = 0.2, 33% at STI = 0.4 and 14% at STI = 0.6). Results also showed that Arabic and Polish were particularly sensitive to background noise, and that Mandarin was significantly more intelligible than those languages at STI = 0.4. Consonant-to-vowel ratios and the languages’ distinctive features and acoustical properties explained some of the scores obtained. Sentence intelligibility scores confirmed variations between languages, but these variations were statistically significant only at the STI = 0.4 condition (sentence tests being less sensitive to very good and very poor room acoustic conditions). Additionally, perceived speech intelligibility and the soundscape perception associated with these languages were analysed in three multilingual environments: an airport check-in area, a hospital reception area, and a café.
Semantic differential analysis showed that the perceived speech intelligibility of each language varies with the type of environment, as well as with the type of background noise, the reverberation time, and the signal-to-noise ratio. Variations between the perceived speech intelligibility of the four languages were only marginally significant (p = 0.051), unlike the objective intelligibility results. The perceived speech intelligibility of English appeared to be negatively affected mainly by the information content and distracting sounds present in the background noise. Lastly, the study investigated several standards and design guidelines and showed how adjustments could be made to recommended STI values in order to achieve consistent speech intelligibility ratings across languages.

    EVALUATION OF THE SIGNAL-TO-NOISE RATIO REQUIRED TO ACHIEVE THE SAME PERFORMANCE IN ENGLISH AND MANDARIN CHINESE

    Difficulty communicating in noise is a common complaint for people with hearing loss. When communicating in noise, speakers increase the intensity of their voice and alter the stress patterns of their speech, both to monitor their own voice and to be heard by others. Speech that increases in intensity for the purpose of self-monitoring and being understood in noise is called Lombard speech. Few studies have assessed communication performance with Lombard speech in noise, which closely reflects real-life communication situations. In addition, the characteristics of Lombard speech may differ among languages with different characteristics and identifying features, so the few results available for English listeners may not apply to listeners of other languages. This study evaluated the performance of English-speaking and Mandarin Chinese-speaking individuals listening to English and Mandarin Chinese speech in corresponding babble noise. Speech materials were the IEEE sentences in English and their Mandarin Chinese translations, controlled for phonological, grammatical, and contextual predictability. The sentences and 4-talker babble were recorded in a conversational manner and at a Lombard speech level produced while listening to 80 dB SPL of noise. The performance of 18 native English speakers and 18 native Mandarin Chinese speakers was evaluated. The SNR-50, the signal-to-noise ratio required to produce 50% performance, was the same for conversational and Lombard English, indicating that there is no particular benefit in producing Lombard speech to be understood. The reason to produce Lombard speech in English is to improve the signal-to-noise ratio in order to facilitate improved communication. The results for the Mandarin Chinese listeners revealed a benefit when producing Lombard speech, with the SNR-50 for Mandarin Chinese significantly different between conversational and Lombard speech.
In noisy situations where increasing vocal intensity is expected, Mandarin Chinese listeners appear to benefit from features preserved or enhanced through Lombard speech that English listeners do not access.
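As a rough illustration of the SNR-50 measure used above (not the study's analysis code), the 50%-correct point can be estimated by interpolating a listener's proportion-correct scores across the tested SNRs. The `snr_50` helper and all score values below are hypothetical, and linear interpolation stands in for a fitted psychometric function.

```python
# Illustrative sketch: estimating SNR-50 by linear interpolation between the
# two tested SNRs that bracket 50% correct. All data below are made up.

def snr_50(snrs, scores):
    """Interpolate the SNR at which the psychometric function crosses 50%.

    snrs   -- tested signal-to-noise ratios in dB, ascending
    scores -- proportion of keywords correct at each SNR (0.0-1.0)
    """
    for (s0, p0), (s1, p1) in zip(zip(snrs, scores), zip(snrs[1:], scores[1:])):
        if p0 <= 0.5 <= p1:  # the 50% point lies between these two SNRs
            return s0 + (0.5 - p0) * (s1 - s0) / (p1 - p0)
    raise ValueError("scores never cross 50% within the tested range")

# Hypothetical listener data: conversational vs Lombard speech
conversational = snr_50([-9, -6, -3, 0, 3], [0.10, 0.30, 0.55, 0.80, 0.95])
lombard        = snr_50([-9, -6, -3, 0, 3], [0.20, 0.45, 0.70, 0.90, 0.98])
benefit = conversational - lombard  # positive => Lombard speech helps
```

A lower SNR-50 for Lombard speech than for conversational speech (a positive `benefit` in dB) is the kind of effect the study reports for Mandarin Chinese listeners but not for English listeners.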

    Development of Bisyllabic Speech Audiometry Word Lists for Adult Malay Speakers

    Standardised speech audiometry material is essential in assessing hearing for speech; however, material in the Malay language, particularly for the speech reception threshold test, is limited and not thoroughly validated. This thesis examines the development of standardised, phonemically balanced bisyllabic Malay speech reception threshold (SRT) test word lists for Malay-speaking adults. The effect of having a mixture of familiar and nonsense words on speech recognition is also explored. The development of the word lists involved selecting and compiling the words using a content analysis research method, testing for homogeneity and consistency and validating the acoustic content (both using correlational research methods), and assessing clinical validity using a concurrent validity method. The familiar words were selected from a corpus of familiar words extracted from daily newspapers, while the nonsense words were formed based on the linguistic properties of Malay. The preliminary set consisted of fifteen lists with 10 familiar words and 5 nonsense words in each. A Friedman test showed no statistically significant difference in correct scores across the word lists (Χ² = 19.584, p > 0.05), indicating consistent speech discrimination. A homogeneity test across all lists using Cronbach’s alpha gave a value of 0.78, indicating strong agreement and good homogeneity among the lists. When five lists with inter-item correlation ≤0.8 were excluded from the homogeneity analysis, the alpha value for the remaining 10 lists increased to 0.88. Consistency analysis of the acoustic content using repeated measures ANOVA showed no significant difference between the lists and the long-term average speech spectrum (LTASS) (F = 1.229, p > 0.05). All 15 lists were then tested for clinical validity.
Two versions of list content were assessed: an all-words version (AWL) containing all 15 words in each list, and a meaningful-words-only version (MWL) containing the 10 meaningful words in each list. Correlation analyses between the half peak level (HPL) of the speech recognition curve and pure tone (PT) thresholds showed that, considering both normal-hearing and hearing-impaired listeners, the HPL correlated best with the PT average of 250, 500, 1000, 2000 and 4000 Hz for both AWL (r = 0.67 to 0.95) and MWL (r = 0.65 to 0.95). A comparison between the HPL and the PT average of 250, 500, 1000, 2000 and 4000 Hz showed mean differences of 4 dB (SD = 3) and 3 dB (SD = 4), with ranges of tolerance (95% confidence) of ±7 dB and ±8 dB for AWL and MWL respectively. Sensitivity, specificity, and positive and negative predictive values, at a tolerance level of ±10 dB, were mostly >0.90 for both normal-hearing and hearing-impaired listeners using either version. It was concluded that the addition of nonsense words does not significantly affect the SRT. The correlation between the SRT obtained using the bisyllabic Malay word lists and the PT thresholds suggested that the word lists are robust enough to be used in assessing speech hearing clinically. In conclusion, the current study developed and produced standardised, phonemically balanced bisyllabic Malay speech audiometry (BMSA) word lists for assessing speech reception threshold and discrimination in adult Malay speakers.
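The homogeneity analysis above rests on Cronbach's alpha. As a minimal sketch (not the thesis code), alpha can be computed from a lists-by-listeners score matrix; the `cronbach_alpha` helper and the scores below are invented for illustration.

```python
# Illustrative sketch: Cronbach's alpha across word lists, treating each list
# as an "item" and each listener's score on it as an observation.

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)  # sample variance

def cronbach_alpha(scores_by_list):
    """scores_by_list: one row per word list, one column per listener."""
    k = len(scores_by_list)
    item_vars = sum(variance(row) for row in scores_by_list)
    totals = [sum(col) for col in zip(*scores_by_list)]  # per-listener totals
    return k / (k - 1) * (1 - item_vars / variance(totals))

# Hypothetical percent-correct scores: 3 lists x 5 listeners
scores = [
    [88, 92, 75, 80, 95],
    [85, 90, 78, 82, 93],
    [90, 94, 72, 79, 96],
]
alpha = cronbach_alpha(scores)  # values near 1 indicate homogeneous lists
```

Dropping a poorly correlating list and recomputing alpha on the remaining rows mirrors the step in the study where excluding five lists raised alpha from 0.78 to 0.88.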

    A Sound Approach to Language Matters: In Honor of Ocke-Schwen Bohn

    The contributions in this Festschrift were written by Ocke’s current and former PhD students, colleagues and research collaborators. The Festschrift is divided into six sections, moving from the smallest building blocks of language through gradually expanding objects of linguistic inquiry to the highest levels of description - all of which have formed a part of Ocke’s career, in connection with his teaching and/or his academic productions: “Segments”, “Perception of Accent”, “Between Sounds and Graphemes”, “Prosody”, “Morphology and Syntax” and “Second Language Acquisition”. Each one of these illustrates a sound approach to language matters.

    Tone production ability in Cantonese-speaking hearing-impaired children with cochlear implants or hearing aids

    A dissertation submitted in partial fulfilment of the requirements for the Bachelor of Science (Speech and Hearing Sciences), The University of Hong Kong, June 30, 2007. Thesis (B.Sc.), University of Hong Kong, 2007. Also available in print.

    Improving hearing ability in challenging conditions

    Although speech recognition using hearing aids and cochlear implants has improved significantly in recent years, most people with hearing impairment still have difficulty understanding speech in noisy environments. Improving the ability of the brain to learn how to make full use of prosthetic devices is as important as developments in the technology. Auditory perceptual training helps people to be more sensitive to target sounds. Therefore, auditory training programmes have the potential to optimise the performance of hearing-impaired users and help them get more benefit from their prosthetic devices. A better understanding of how and when auditory perceptual training generalises in normal-hearing people may help in devising better training for people with hearing impairment. However, researchers in the literature have mainly focused on changing the target stimuli, using amplitude-modulated sounds or speech stimuli. Fewer researchers have explored the auditory learning and generalisation effects of changing the background noise. It is not clear whether training generalises to other types of noise, in particular real-world environmental noises. A novel element of this study is that it focuses on training people to pick up the target stimuli by changing the background noise. This project was divided into four stages. The first stage of this work looked at basic detection thresholds for amplitude modulation (AM) in sound stimuli, and found that AM-detection training did not generalise to AM-rate discrimination, regardless of the modulation depth. For the second stage, two nonsense-syllable (vowel-consonant-vowel, VCV) training studies were carried out to explore auditory perceptual learning patterns with nonsense syllables across fixed and random background noise.
This stage was motivated by visual research showing that people can improve their detection performance by learning to ignore constant visual noise, and that this skill transfers to new, random visual noise. Results showed that learning with random noise produced better identification performance than with fixed noise. There was no generalisation from fixed-noise training to random-noise environments. These results were in contrast to the visual learning studies. Following the second stage, a short single-session VCV study was conducted to investigate whether nonsense-syllable adaptation to fixed noise was different to random noise. Results showed that listeners’ VCV identification was similar for fixed and random babble noise conditions. This was different from stage two, which showed better nonsense recognition with random-noise training than with fixed-noise training. It is suggested that test method differences (multi-session vs single-session) led to the performance differences between fixed and random noise conditions. The final stage of this work explored whether any learning effect from training with speech in random babble noise generalised to other environmental noises, such as car and rain noise. Results demonstrated that speech-in-babble-noise training generalised to car and rain noise conditions, and that part of the learning effect from speech in babble noise was sustained after several weeks. This project investigated the auditory perceptual learning performance of normal-hearing people using AM stimuli, nonsense speech, and speech with various types of background noise (babble, car, rain). The learning outcomes provide important evidence for the use of background noises (fixed noise, random noise, and real-world environmental noises) in auditory perceptual training programmes, which can help to build clinical guidelines for training people with hearing impairment to improve their hearing in challenging conditions.

    The listening talker: A review of human and algorithmic context-induced modifications of speech

    Speech output technology is finding widespread application, including in scenarios where intelligibility might be compromised - at least for some listeners - by adverse conditions. Unlike most current algorithms, talkers continually adapt their speech patterns in response to the immediate context of spoken communication, where the type of interlocutor and the environment are the dominant situational factors influencing speech production. Observations of talker behaviour can motivate the design of more robust speech output algorithms. Starting with a listener-oriented categorisation of possible goals for speech modification, this review article summarises the extensive set of behavioural findings related to human speech modification, identifies which factors appear to be beneficial, and goes on to examine previous computational attempts to improve intelligibility in noise. The review concludes by tabulating 46 speech modifications, many of which have yet to be perceptually or algorithmically evaluated. Consequently, the review provides a roadmap for future work on improving the robustness of speech output.

    Designs of Speech Audiometric Tests in Vietnamese – The Issues of Normative Values, Dialectal Effects, and Tonal Patterns

    Dialectal variations and linguistic factors are considered to be primary causes of misdiagnosis during audiological assessments of speech performance. For new speech audiometry materials, evaluating the effects of the listener’s dialect and of linguistic factors on speech recognition thresholds (SRTs) and supra-threshold phoneme recognition scores (PRSs) is expected to yield a valid and reliable audiometric measurement for clients. This thesis assessed the SRTs of native and non-native listeners of Southern Vietnamese with regard to dialectal effects; the effect of the tonal patterns of syllables on the speech perception of older adults; and the correlations between SRTs and duo-tone thresholds, and between SRTs and PRSs. To attain these objectives, two types of speech audiometry materials were designed: the Adaptive Auditory Speech Test (AAST) and NAMES (a nonsense syllable test). Data for AAST were collected from 435 normal-hearing listeners aged between four and 85 years, while data for NAMES were gathered from 186 normal-hearing listeners aged between 15 and 85 years. The findings showed that AAST and NAMES are valid speech audiometric tests for quantifying the speech recognition of listeners aged between four and 85 (AAST) and between 15 and 85 (NAMES). The age-related normative values of AAST in Vietnamese are similar to those in German, Ghanaian, and Polish. The findings of the dialectal study indicate that dialectal variation has an impact on speech recognition; however, the extent of the effect depends on the speech materials used for the measurement: dialectal differences had larger effects in open speech tests with meaningful words than in closed speech tests. The findings on tonal pattern effects suggest that the tonal patterns of syllables have a minor influence on the speech perception of older adults, especially those above 75. Finally, the SRTs could be predicted using duo-tone thresholds.
In contrast, the PRSs could not be predicted from either speech thresholds or duo-tone thresholds based on the correlations. The two new speech audiometric tests provide reliable outcomes with the same properties in normal-hearing listeners as the AAST and nonsense syllable tests in other languages. These two speech audiometric tests complement each other in evaluating hearing loss or language impairment. It is anticipated that these speech tests will serve as an effective clinical tool for speech audiometric testing in Vietnam.
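The claim that SRTs can be predicted from duo-tone thresholds amounts to a simple regression. The sketch below is not the thesis analysis; all threshold values are made up. It fits a least-squares line and reports the Pearson correlation that would underpin such a prediction.

```python
# Illustrative sketch: least-squares regression predicting SRT from a
# duo-tone average threshold, with Pearson r for the strength of the link.

def linear_fit(x, y):
    """Return slope, intercept of the least-squares line y = a*x + b."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    a = sxy / sxx
    return a, my - a * mx

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical data: duo-tone averages (dB HL) and measured SRTs (dB HL)
duo_tone = [5, 10, 15, 25, 40, 55]
srt      = [8, 12, 18, 27, 43, 58]
slope, intercept = linear_fit(duo_tone, srt)
predicted_srt = slope * 30 + intercept  # predicted SRT for a 30 dB HL average
```

A high correlation, as in this toy data, would justify predicting SRT from duo-tone thresholds; the thesis reports that no comparable relationship held for the PRSs.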

    Non-native listeners' recognition of high-variability speech using PRESTO

    BACKGROUND: Natural variability in speech is a significant challenge to robust spoken word recognition. In everyday listening environments, listeners must quickly adapt and adjust to multiple sources of variability in both the signal and the listening environment. High-variability speech may be particularly difficult to understand for non-native listeners, who have less experience with the second language (L2) phonological system and less detailed knowledge of the sociolinguistic variation of the L2. PURPOSE: The purpose of this study was to investigate the effects of high-variability sentences on non-native speech recognition and to explore the underlying sources of individual differences in the speech recognition abilities of non-native listeners. RESEARCH DESIGN: Participants completed two sentence recognition tasks involving high-variability and low-variability sentences. They also completed a battery of behavioral tasks and self-report questionnaires designed to assess their indexical processing skills, vocabulary knowledge, and several core neurocognitive abilities. STUDY SAMPLE: Native speakers of Mandarin (n = 25) living in the United States, recruited from the Indiana University community, participated in the current study. A native comparison group consisted of scores obtained from native speakers of English (n = 21) in the Indiana University community, taken from an earlier study. DATA COLLECTION AND ANALYSIS: Speech recognition in high-variability listening conditions was assessed with a sentence recognition task using sentences from PRESTO (Perceptually Robust English Sentence Test Open-Set) mixed in 6-talker multitalker babble. Speech recognition in low-variability listening conditions was assessed using sentences from HINT (Hearing In Noise Test) mixed in 6-talker multitalker babble. Indexical processing skills were measured using a talker discrimination task, a gender discrimination task, and a forced-choice regional dialect categorization task.
Vocabulary knowledge was assessed with the WordFam word familiarity test, and executive functioning was assessed with the BRIEF-A (Behavioral Rating Inventory of Executive Function - Adult Version) self-report questionnaire. Scores from the non-native listeners on the behavioral tasks and self-report questionnaires were compared with scores obtained from native listeners tested in a previous study and were examined for individual differences. RESULTS: Non-native keyword recognition scores were significantly lower on PRESTO sentences than on HINT sentences. Non-native listeners' keyword recognition scores were also lower than native listeners' scores on both sentence recognition tasks. Differences in performance on the sentence recognition tasks between non-native and native listeners were larger on PRESTO than on HINT, although group differences varied by signal-to-noise ratio. The non-native and native groups also differed in the ability to categorize talkers by region of origin and in vocabulary knowledge. Individual non-native word recognition accuracy on PRESTO sentences in multitalker babble at more favorable signal-to-noise ratios was found to be related to several BRIEF-A subscales and composite scores. However, non-native performance on PRESTO was not related to regional dialect categorization, talker and gender discrimination, or vocabulary knowledge. CONCLUSIONS: High-variability sentences in multitalker babble were particularly challenging for non-native listeners. Difficulty under high-variability testing conditions was related to a lack of experience with the L2, especially L2 sociolinguistic information, compared with native listeners. Individual differences among the non-native listeners were related to weaknesses in core neurocognitive abilities affecting behavioral control in everyday life.