
    Meta-Analysis on the Identification of Linguistic and Emotional Prosody in Cochlear Implant Users and Vocoder Simulations

    Objectives: This study quantitatively assesses how cochlear implants (CIs) and vocoder simulations of CIs influence the identification of linguistic and emotional prosody in nontonal languages. By means of meta-analysis, it was explored how accurately CI users and normal-hearing (NH) listeners of vocoder simulations (henceforth: simulation listeners) identify prosody compared with NH listeners of unprocessed speech (henceforth: NH listeners), whether this effect of electric hearing differs between CI users and simulation listeners, and whether the effect of electric hearing is influenced by the type of prosody that listeners identify or by the availability of specific cues in the speech signal. Design: Records were found by searching the PubMed Central, Web of Science, Scopus, Science Direct, and PsycINFO databases (January 2018) using the search terms “cochlear implant prosody” and “vocoder prosody.” Records (published in English) were included that reported results of experimental studies comparing CI users’ and/or simulation listeners’ identification of linguistic and/or emotional prosody in nontonal languages to that of NH listeners (all ages included). Studies that met the inclusion criteria were subjected to a multilevel random-effects meta-analysis. Results: Sixty-four studies reported in 28 records were included in the meta-analysis. The analysis indicated that CI users and simulation listeners were less accurate in correctly identifying linguistic and emotional prosody compared with NH listeners, that the identification of emotional prosody was more strongly compromised by the electric hearing speech signal than linguistic prosody was, and that the low quality of transmission of fundamental frequency (f0) through the electric hearing speech signal was the main cause of compromised prosody identification in CI users and simulation listeners. 
Moreover, results indicated that the accuracy with which CI users and simulation listeners identified linguistic and emotional prosody was comparable, suggesting that vocoder simulations with carefully selected parameters can provide a good estimate of how prosody may be identified by CI users. Conclusions: The meta-analysis revealed a robust negative effect of electric hearing, where CIs and vocoder simulations had a similar negative influence on the identification of linguistic and emotional prosody, which seemed mainly due to inadequate transmission of f0 cues through the degraded electric hearing speech signal of CIs and vocoder simulations.
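The pooling step of a random-effects meta-analysis can be sketched as follows. This is a minimal single-level DerSimonian-Laird estimator, not the multilevel model the study actually fitted, and the function name and inputs are illustrative:

```python
import math

def dersimonian_laird(effects, variances):
    """Pool per-study effect sizes under a random-effects model:
    estimate between-study variance (tau^2), then combine studies
    with inverse-variance weights that include tau^2."""
    w = [1.0 / v for v in variances]                 # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sw
    # Cochran's Q heterogeneity statistic
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))
    k = len(effects)
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)               # between-study variance
    w_re = [1.0 / (v + tau2) for v in variances]     # random-effects weights
    pooled = sum(wi * y for wi, y in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return pooled, se, tau2
```

When all studies agree, tau^2 collapses to zero and the estimate reduces to the inverse-variance fixed-effect mean; heterogeneous studies inflate tau^2 and widen the pooled standard error.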

    Noise reduction algorithms and performance metrics for improving speech reception in noise by cochlear-implant users

    Thesis (Ph.D.) -- Harvard University--MIT Division of Health Sciences and Technology, 2005. Includes bibliographical references (p. 229-233). This thesis addresses the design and evaluation of algorithms to improve speech reception for cochlear-implant (CI) users in adverse listening environments. We develop and assess performance metrics for use in the algorithm design process; such metrics make algorithm evaluation efficient, consistent, and subject independent. One promising performance metric is the Speech Transmission Index (STI), which is well correlated with speech reception by normal-hearing listeners for additive noise and reverberation. We expect the STI will effectively predict speech reception by CI users since typical CI sound-processing strategies, like the STI, rely on the envelope signals in frequency bands spanning the speech spectrum. However, STI-based metrics have proven unsatisfactory for assessing the effects of nonlinear operations on the intelligibility of processed speech. In this work we consider modifications to the STI that account for nonlinear operations commonly found in CI sound-processing and noise reduction algorithms. We consider a number of existing speech-based STI metrics and propose novel metrics applicable to nonlinear operations. A preliminary evaluation results in the selection of three candidate metrics for extensive evaluation. In four central experiments, we consider the effects of acoustic degradation, N-of-M processing, spectral subtraction, and binaural noise reduction on the intelligibility of CI-processed speech. We assess the ability of the candidate metrics to predict speech reception scores. Subjects include CI users as well as normal-hearing subjects listening to a noise-vocoder simulation of CI sound-processing.
Our results show that: 1) both spectral subtraction and binaural noise reduction improve the intelligibility of CI-processed speech and 2) of the candidate metrics, one method (the normalized correlation metric) consistently predicts the major trends in speech reception scores for all four experiments. By Raymond Lee Goldsworthy, Ph.D.
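The core of an envelope-based intelligibility metric is a correlation between the envelopes of clean and processed speech. This is a simplified, single-band stand-in for the thesis's normalized correlation metric, which operates on band-filtered envelopes across the speech spectrum; names and inputs are illustrative:

```python
import math

def envelope_correlation(clean_env, proc_env):
    """Pearson correlation between the clean-speech envelope and the
    processed-speech envelope in one frequency band. Values near 1
    suggest the processing preserved the envelope; lower values
    predict reduced intelligibility."""
    n = len(clean_env)
    mc = sum(clean_env) / n
    mp = sum(proc_env) / n
    num = sum((c - mc) * (p - mp) for c, p in zip(clean_env, proc_env))
    den = math.sqrt(sum((c - mc) ** 2 for c in clean_env)
                    * sum((p - mp) ** 2 for p in proc_env))
    return num / den if den else 0.0
```

A full metric would average such correlations over bands, which matters for CI processing because typical CI strategies transmit exactly these band envelopes.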

    Decoding auditory attention and neural language processing in adverse conditions and different listener groups

    This thesis investigated subjective, behavioural and neurophysiological (EEG) measures of speech processing in various adverse conditions and with different listener groups. In particular, this thesis focused on different neural processing stages and their relationship with auditory attention, effort, and measures of speech intelligibility. Study 1 set the groundwork by establishing a toolbox of various neural measures to investigate online speech processing, from the frequency following response (FFR) and cortical measures of speech processing, to the N400, a measure of lexico-semantic processing. Results showed that peripheral processing is heavily influenced by stimulus characteristics such as degradation, whereas central processing units are more closely linked to higher-order phenomena such as speech intelligibility. In Study 2, a similar experimental paradigm was used to investigate differences in neural processing between a hearing-impaired and a normal-hearing group. Subjects were presented with short stories in different levels of multi-talker babble noise, and with different settings on their hearing aids. Findings indicate that, particularly at lower noise levels, the hearing-impaired group showed much higher cortical entrainment than the normal-hearing group, despite similar levels of speech recognition. Intersubject correlation, another global neural measure of auditory attention, however, was similarly affected by noise levels in both the hearing-impaired and the normal-hearing group. This finding indicates extra processing in the hearing-impaired group only on the level of the auditory cortex. Study 3, in contrast to Studies 1 and 2 (which both investigated the effects of bottom-up factors on neural processing), examined the links between entrainment and top-down factors, specifically motivation, as well as reasons for the higher entrainment found in hearing-impaired subjects in Study 2.
Results indicated that, while behaviourally there was no difference between incentive and non-incentive conditions, neurophysiological measures of attention such as intersubject correlation were affected by the presence of an incentive to perform better. Moreover, using a specific degradation type resulted in subjects’ increased cortical entrainment under degraded conditions. These findings support the hypothesis that top-down factors such as motivation influence neurophysiological measures, and that higher entrainment to degraded speech might be triggered specifically by the reduced availability of spectral detail contained in speech.
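Intersubject correlation, used in Studies 2 and 3 as a global measure of auditory attention, can be sketched with a common leave-one-out scheme: each subject's response time course is correlated with the average of all other subjects'. This is an assumed simplification of the thesis's actual pipeline, with illustrative names:

```python
import math

def _pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x)
                    * sum((b - my) ** 2 for b in y))
    return num / den if den else 0.0

def intersubject_correlation(timecourses):
    """Leave-one-out ISC: correlate each subject's EEG time course
    with the mean time course of the remaining subjects, then
    average over subjects."""
    isc = []
    for i, tc in enumerate(timecourses):
        others = [t for j, t in enumerate(timecourses) if j != i]
        mean_other = [sum(vals) / len(vals) for vals in zip(*others)]
        isc.append(_pearson(tc, mean_other))
    return sum(isc) / len(isc)
```

High ISC indicates that listeners' neural responses are driven by the shared stimulus rather than idiosyncratic activity, which is why it can index attention independently of entrainment.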

    Audiovisual speech perception in cochlear implant patients

    Hearing with a cochlear implant (CI) is very different compared to a normal-hearing (NH) experience, as the CI can only provide limited auditory input. Nevertheless, the central auditory system is capable of learning how to interpret such limited auditory input such that it can extract meaningful information within a few months after implant switch-on. The capacity of the auditory cortex to adapt to new auditory stimuli is an example of intra-modal plasticity — changes within a sensory cortical region as a result of altered statistics of the respective sensory input. However, hearing deprivation before implantation and restoration of hearing capacities after implantation can also induce cross-modal plasticity — changes within a sensory cortical region as a result of altered statistics of a different sensory input. Thereby, a preserved cortical region can, for example, support a deprived cortical region, as in the case of CI users, who have been shown to exhibit cross-modal visual-cortex activation for purely auditory stimuli. Before implantation, during the period of hearing deprivation, CI users typically rely on additional visual cues like lip-movements for understanding speech. Therefore, it has been suggested that CI users show a pronounced binding of the auditory and visual systems, which may allow them to integrate auditory and visual speech information more efficiently. The projects included in this thesis investigate auditory, and particularly audiovisual, speech processing in CI users. Four event-related potential (ERP) studies approach the matter from different perspectives, each with a distinct focus. The first project investigates how audiovisually presented syllables are processed by CI users with bilateral hearing loss compared to NH controls. Previous ERP studies employing non-linguistic stimuli and studies using different neuroimaging techniques found distinct audiovisual interactions in CI users.
However, the precise timecourse of cross-modal visual-cortex recruitment and enhanced audiovisual interaction for speech-related stimuli is unknown. With our ERP study we fill this gap, and we present differences in the timecourse of audiovisual interactions as well as in cortical source configurations between CI users and NH controls. The second study focuses on auditory processing in single-sided deaf (SSD) CI users. SSD CI patients experience a maximally asymmetric hearing condition, as they have a CI on one ear and a contralateral NH ear. Despite the intact ear, several behavioural studies have demonstrated a variety of beneficial effects of restoring binaural hearing, but there are only a few ERP studies that investigate auditory processing in SSD CI users. Our study investigates whether the side of implantation affects auditory processing and whether auditory processing via the NH ear of SSD CI users works similarly to that in NH controls. Given the distinct hearing conditions of SSD CI users, the question arises whether there are any quantifiable differences between CI users with unilateral hearing loss and those with bilateral hearing loss. In general, ERP studies on SSD CI users are rather scarce, and there is no study on audiovisual processing in particular. Furthermore, there are no reports on lip-reading abilities of SSD CI users. To this end, in the third project we extend the first study by including SSD CI users as a third experimental group. The study discusses both differences and similarities between CI users with bilateral hearing loss and CI users with unilateral hearing loss as well as NH controls and provides — for the first time — insights into audiovisual interactions in SSD CI users. The fourth project investigates the influence of background noise on audiovisual interactions in CI users and whether a noise-reduction algorithm can modulate these interactions.
It is known that, in environments with competing background noise, listeners generally rely more strongly on visual cues for understanding speech and that such situations are particularly difficult for CI users. As shown in previous auditory behavioural studies, the recently introduced noise-reduction algorithm "ForwardFocus" can be a useful aid in such cases. However, whether employing the algorithm is beneficial in audiovisual conditions as well, and whether using the algorithm has a measurable effect on cortical processing, had not been investigated. In this ERP study, we address these questions with an auditory and audiovisual syllable discrimination task. Taken together, the projects included in this thesis contribute to a better understanding of auditory and especially audiovisual speech processing in CI users, revealing distinct processing strategies employed to overcome the limited input provided by a CI. The results have clinical implications, as they suggest that clinical hearing assessments, which are currently purely auditory, should be extended to audiovisual assessments. Furthermore, they imply that rehabilitation including audiovisual training methods may be beneficial for all CI user groups for quickly achieving the most effective CI implantation outcome.

    The use of acoustic cues in phonetic perception: Effects of spectral degradation, limited bandwidth and background noise

    Hearing impairment, cochlear implantation, background noise and other auditory degradations result in the loss or distortion of sound information thought to be critical to speech perception. In many cases, listeners can still identify speech sounds despite degradations, but understanding of how this is accomplished is incomplete. Experiments presented here tested the hypothesis that listeners would utilize acoustic-phonetic cues differently if one or more cues were degraded by hearing impairment or simulated hearing impairment. Results supported this hypothesis for various listening conditions that are directly relevant for clinical populations. Analysis included mixed-effects logistic modeling of contributions of individual acoustic cues for various contrasts. Listeners with cochlear implants (CIs) or normal-hearing (NH) listeners in CI simulations showed increased use of acoustic cues in the temporal domain and decreased use of cues in the spectral domain for the tense/lax vowel contrast and the word-final fricative voicing contrast. For the word-initial stop voicing contrast, NH listeners made less use of voice-onset time and greater use of voice pitch in conditions that simulated high-frequency hearing impairment and/or masking noise; influence of these cues was further modulated by consonant place of articulation. A pair of experiments measured phonetic context effects for the "s/sh" contrast, replicating previously observed effects for NH listeners and generalizing them to CI listeners as well, despite known deficiencies in spectral resolution for CI listeners. For NH listeners in CI simulations, these context effects were absent or negligible. Audio-visual delivery of this experiment revealed enhanced influence of visual lip-rounding cues for CI listeners and NH listeners in CI simulations. Additionally, CI listeners demonstrated that visual cues to gender influence phonetic perception in a manner consistent with gender-related voice acoustics. 
All of these results suggest that listeners are able to accommodate challenging listening situations by capitalizing on the natural (multimodal) covariance in speech signals. Additionally, these results imply that there are potential differences in speech perception by NH listeners and listeners with hearing impairment that would be overlooked by traditional word recognition or consonant confusion matrix analysis.
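The cue-weighting analyses above fitted mixed-effects logistic models relating acoustic cues to listeners' categorical responses. A minimal fixed-effects-only sketch, fit by stochastic gradient descent on invented cue values (the thesis's actual models include per-listener random effects and were fit by maximum likelihood):

```python
import math

def fit_cue_weights(cues, responses, lr=0.5, epochs=500):
    """Logistic regression by stochastic gradient descent: returns one
    weight per acoustic cue plus an intercept. Larger |weight| means
    the cue contributes more to the phonetic decision."""
    w = [0.0] * len(cues[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(cues, responses):
            z = b + sum(wi * xi for wi, xi in zip(w, x))
            z = max(-30.0, min(30.0, z))       # clamp to avoid overflow
            p = 1.0 / (1.0 + math.exp(-z))     # predicted P(response = 1)
            err = y - p
            b += lr * err
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
    return w, b
```

Comparing fitted weights across listener groups (e.g. a spectral cue vs. a temporal cue) is what reveals the trade-offs reported above, such as CI listeners down-weighting spectral cues and up-weighting temporal ones.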

    Studies on auditory processing of spatial sound and speech by neuromagnetic measurements and computational modeling

    This thesis addresses the auditory processing of spatial sound and speech. The thesis consists of two research branches: one, magnetoencephalographic (MEG) brain measurements on spatial localization and speech perception, and two, construction of computational auditory scene analysis models, which exploit spatial cues and other cues that are robust in reverberant environments. In the MEG research branch, we have addressed the processing of the spatial stimuli in the auditory cortex through studies concentrating to the following issues: processing of sound source location with realistic spatial stimuli, spatial processing of speech vs. non-speech stimuli, and finally processing of range of spatial location cues in the auditory cortex. Our main findings are as follows: Both auditory cortices respond more vigorously to contralaterally presented sound, whereby responses exhibit systematic tuning to the sound source direction. Responses and response dynamics are generally larger in the right hemisphere, which indicates right hemispheric specialization in the spatial processing. These observations hold over the range of speech and non-speech stimuli. The responses to speech sounds are decreased markedly if the natural periodic speech excitation is changed to random noise sequence. Moreover, the activation strength of the right auditory cortex seems to reflect processing of spatial cues, so that the dynamical differences are larger and the angular organization is more orderly for realistic spatial stimuli compared to impoverished spatial stimuli (e.g. isolated interaural time and level difference cues). In the auditory modeling part, we constructed models for the recognition of speech in the presence of interference. Firstly, we constructed a system using binaural cues in order to segregate target speech from spatially separated interference, and showed that the system outperforms a conventional approach at low signal-to-noise ratios. 
Secondly, we constructed a single-channel system that is robust in room reverberation, using strong speech modulations as robust cues, and showed that it outperforms a baseline approach in the most reverberant test conditions. In this case, the baseline approach was specifically optimized for recognition of speech in reverberation. In summary, this thesis addresses the auditory processing of spatial sound and speech in both brain measurement and auditory modeling. The studies aim to clarify cortical processes of sound localization, and to construct computational auditory models for sound segregation exploiting spatial cues and strong speech modulations as robust cues in reverberation.
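One of the binaural cues exploited by such segregation systems, the interaural time difference (ITD), can be estimated from the lag that maximizes the cross-correlation of the two ear signals. A toy sketch with invented signals and parameters, not the thesis's model:

```python
def estimate_itd(left, right, fs, max_lag=20):
    """Estimate the interaural time difference in seconds as the lag
    (in samples) maximizing the cross-correlation of the ear signals;
    a positive result means the left ear leads."""
    best_lag, best_val = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # correlate left, shifted by `lag` samples, against right
        val = sum(left[i - lag] * right[i]
                  for i in range(len(right)) if 0 <= i - lag < len(left))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag / fs
```

In a full model this estimate is computed per frequency band, and time-frequency regions whose ITD matches the target direction are retained while the rest are suppressed.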

    Recognition and cortical haemodynamics of vocal emotions-an fNIRS perspective

    Normal-hearing (NH) listeners rely heavily on variations in the fundamental frequency (F0) of speech to identify vocal emotions. Without reliable F0 cues, as is the case for cochlear implant users, listeners’ ability to extract emotional meaning from speech is reduced. This thesis describes the development of an objective measure of vocal emotion recognition. The program of three experiments investigates: 1) NH listeners’ abilities to use F0, intensity, and speech-rate cues to recognise emotions; 2) cortical activity associated with individual vocal emotions assessed using functional near-infrared spectroscopy (fNIRS); 3) cortical activity evoked by vocal emotions in natural speech and in speech with uninformative F0, using fNIRS.
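The F0 variations that carry vocal emotion can be tracked with a simple autocorrelation pitch estimator: within a voiced frame, the autocorrelation peaks at the lag corresponding to one glottal period. An illustrative sketch, not the thesis's analysis pipeline; the pitch-range parameters are assumptions:

```python
import math

def estimate_f0(signal, fs, fmin=80.0, fmax=400.0):
    """Estimate the fundamental frequency (F0) of a voiced frame by
    picking the autocorrelation peak within the expected pitch range."""
    lag_min = int(fs / fmax)          # shortest period considered
    lag_max = int(fs / fmin)          # longest period considered
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        val = sum(signal[i] * signal[i + lag]
                  for i in range(len(signal) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return fs / best_lag
```

Tracking this estimate frame by frame yields the F0 contour whose range and variability distinguish, for example, happy from sad speech.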

    A Psychophysical Approach to Paralinguistic and Extralinguistic Auditory Perception in Postlingually Deaf Cochlear Implant Users

    Cochlear implants have been shown to restore excellent speech recognition in quiet. However, postlingually deafened adults experience a persistent handicap following cochlear implantation, which might be related to their perception abilities in auditory domains other than speech. In this work, we conducted several psychophysical experiments that aimed at assessing the characteristics of their auditory categorization and their abilities for prosody and music perception. Normal-hearing subjects were also tested as a control group. We found a strong and durable deficit in paralinguistic and extralinguistic auditory perception, underpinned by two plausible mechanisms. First, the acoustic signal processing through the implant leads to an important spectral impoverishment and limits access to fundamental frequency information; hence, some cochlear implant recipients with substantial low-frequency residual hearing may achieve near-normal performance. Second, brain plasticity following auditory deprivation facilitates compensatory strategies for speech recognition but might also negatively influence outcomes for paralinguistic and extralinguistic information.
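The spectral impoverishment described above can be illustrated with one channel of a noise vocoder: the slow amplitude envelope survives, while the temporal fine structure carrying F0 is replaced by noise. A toy broadband sketch with invented parameters (real CI processors and vocoder simulations use many band-pass channels):

```python
import random

def vocode_channel(signal, win=32, seed=0):
    """One channel of a noise vocoder: keep the slow amplitude envelope
    (full-wave rectification + moving average) but replace the fine
    structure, including f0, with white noise."""
    rect = [abs(s) for s in signal]                      # rectify
    env = []
    for i in range(len(rect)):
        lo = max(0, i - win + 1)
        env.append(sum(rect[lo:i + 1]) / (i + 1 - lo))   # moving average
    rng = random.Random(seed)                            # reproducible noise
    return [e * rng.uniform(-1.0, 1.0) for e in env]
```

Because only the envelope is preserved, loudness and rhythm cues survive this processing while pitch-based prosodic and musical cues are largely lost, which is the deficit pattern reported above.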