
    Assessing the effect of noise-reduction to the intelligibility of low-pass filtered speech

    Given that most hearing-impaired listeners have low-frequency residual hearing, the present work assessed whether commonly used single-channel noise-reduction (NR) algorithms improve the intelligibility of low-pass filtered speech, which simulates listening with the low-frequency residual hearing of hearing-impaired patients. In addition, this study was performed with Mandarin speech, in which (low-frequency dominated) vowels contribute substantially to intelligibility. Mandarin sentences were corrupted by steady-state speech-shaped noise and processed by four types of single-channel NR algorithm (subspace, statistical-modeling, spectral-subtractive, and Wiener-filtering). The processed sentences were played to normal-hearing listeners for recognition. Experimental results showed that existing single-channel NR algorithms were unable to improve the intelligibility of low-pass filtered Mandarin sentences. Of the four types examined, Wiener filtering had the least negative influence on the intelligibility of low-pass filtered speech.
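The Wiener-filtering family mentioned in this abstract works by applying a per-bin spectral gain derived from estimated signal and noise power. The sketch below is a minimal illustration of that idea only, not the specific algorithm evaluated in the study; the function names, the maximum-likelihood a priori SNR approximation, and the spectral floor value are assumptions.

```python
import numpy as np

def wiener_gain(noisy_power, noise_power, floor=1e-3):
    """Per-bin Wiener gain G = snr_prio / (1 + snr_prio), with the a priori
    SNR approximated by the (floored) a posteriori SNR minus one."""
    snr_post = noisy_power / np.maximum(noise_power, 1e-12)
    snr_prio = np.maximum(snr_post - 1.0, 0.0)   # ML approximation
    gain = snr_prio / (1.0 + snr_prio)
    return np.maximum(gain, floor)               # spectral floor limits musical noise

def denoise_frame(noisy_spectrum, noise_power):
    """Apply the Wiener gain to one complex STFT frame."""
    gain = wiener_gain(np.abs(noisy_spectrum) ** 2, noise_power)
    return gain * noisy_spectrum
```

Because the gain is bounded between the floor and 1, the filter can only attenuate bins, which is consistent with the finding that such algorithms reduce noise without necessarily improving intelligibility.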

    Effects of noise suppression and envelope dynamic range compression on the intelligibility of vocoded sentences for a tonal language

    Vocoder simulation studies have suggested that the carrier signal type employed affects the intelligibility of vocoded speech. The present work further assessed how carrier signal type interacts with additional signal processing, namely, single-channel noise suppression and envelope dynamic range compression, in determining the intelligibility of vocoder simulations. In Experiment 1, Mandarin sentences that had been corrupted by speech spectrum-shaped noise (SSN) or two-talker babble (2TB) were processed by one of four single-channel noise-suppression algorithms before undergoing tone-vocoded (TV) or noise-vocoded (NV) processing. In Experiment 2, the dynamic ranges of multiband envelope waveforms were compressed by scaling the mean-removed envelope waveforms with a compression factor before TV or NV processing. TV Mandarin sentences yielded higher intelligibility scores with normal-hearing (NH) listeners than did NV sentences. The intelligibility advantage of noise-suppressed vocoded speech depended on the masker type (SSN vs. 2TB). NV speech was more negatively influenced by envelope dynamic range compression than was TV speech. These findings suggest an interaction between the carrier signal type employed in the vocoding process and envelope distortion caused by signal processing.
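The envelope manipulation described for Experiment 2 — scaling the mean-removed envelope by a compression factor — is concrete enough to sketch. The function names below are hypothetical; the operation itself follows the abstract's description, with the assumption that the band mean is restored after scaling.

```python
import numpy as np

def compress_envelope(envelope, c):
    """Compress an envelope's dynamic range by scaling the mean-removed
    waveform with factor 0 < c <= 1, then restoring the mean.
    c = 1 leaves the envelope unchanged; smaller c flattens it."""
    m = np.mean(envelope)
    return m + c * (envelope - m)

def compress_multiband(envelopes, c):
    """Apply the same compression factor to every analysis band."""
    return np.stack([compress_envelope(band, c) for band in envelopes])
```

Because the mean is preserved, the overall band level stays the same while peak-to-trough modulation depth shrinks, which is the envelope distortion the experiment varies.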

    Coding Strategies for Cochlear Implants Under Adverse Environments

    Cochlear implants are electronic prosthetic devices that restore partial hearing in patients with severe to profound hearing loss. Although most coding strategies have significantly improved the perception of speech in quiet listening conditions, limitations remain under adverse environments such as background noise, reverberation, and band-limited channels. We propose strategies that improve the intelligibility of speech transmitted over telephone networks, reverberated speech, and speech in the presence of background noise. For telephone-processed speech, we examine the effects of adding low-frequency and high-frequency information to the band-limited telephone speech. Four listening conditions were designed to simulate the receiving frequency characteristics of telephone handsets. Results indicated improvement in cochlear implant and bimodal listening when telephone speech was augmented with high-frequency information; this study therefore supports the design of algorithms that extend the bandwidth toward higher frequencies. The results also indicated added benefit from hearing aids for bimodal listeners in all four listening conditions. Speech understanding in acoustically reverberant environments is always a difficult task for hearing-impaired listeners. Reverberated sound consists of the direct sound, early reflections, and late reflections; late reflections are known to be detrimental to speech intelligibility. In this study, we propose a reverberation suppression strategy based on spectral subtraction (SS) to suppress the reverberant energy contributed by late reflections. Results from listening tests for two reverberant conditions (RT60 = 0.3 s and 1.0 s) indicated significant improvement when stimuli were processed with the SS strategy. The proposed strategy operates with little to no prior information on the signal and room characteristics and can therefore potentially be implemented in real-time CI speech processors. For speech in background noise, we propose a mechanism underlying the contribution of harmonics to the benefit of electroacoustic stimulation in cochlear implants. The proposed strategy is based on harmonic modeling and uses a synthesis-driven approach to synthesize the harmonics in voiced segments of speech. Based on objective measures, results indicated improvement in speech quality. This study warrants further work on algorithms that regenerate harmonics of voiced segments in the presence of noise.
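Spectral subtraction for late reverberation is commonly framed as estimating late-reflection energy in each time-frequency bin as a delayed, attenuated copy of earlier frame power and subtracting it. The sketch below shows only this generic scheme, not the dissertation's actual strategy; the delay, scale, and floor values are placeholders.

```python
import numpy as np

def suppress_late_reverb(power_frames, delay=7, scale=0.3, floor=0.05):
    """Spectral-subtraction sketch: model late-reflection power in each
    time-frequency bin as a scaled copy of the power `delay` frames
    earlier, subtract it, and floor the result to avoid negative power.

    power_frames: array of shape (n_frames, n_bins)."""
    power = np.asarray(power_frames, dtype=float)
    late = np.zeros_like(power)
    late[delay:] = scale * power[:-delay]        # delayed, attenuated energy
    cleaned = power - late
    return np.maximum(cleaned, floor * power)    # spectral floor
```

Like the strategy described above, this needs no knowledge of the room impulse response, only the observed short-time power, which is what makes a real-time implementation plausible.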

    Speech understanding and listening effort in noise with a new speech processing algorithm

    This study examined the effect of a new speech processing strategy (SpeechZone2) in a commercially available hearing aid on speech understanding in noise and self-reported listening effort. Seven adult, experienced hearing aid users (2 males, 5 females; mean age = 64.6 years) with mild to severe, sloping sensorineural hearing loss participated. Binaural Unitron Flex receiver-in-the-ear hearing aids with closed domes provided the manufacturer-prescribed amplification for each participant. The hearing aids were programmed with two separate memories: 1) omnidirectional microphone without SpeechZone2 processing, and 2) adaptive directionality with SpeechZone2 processing. Participants were seated in the center of a fixed five-loudspeaker array. HINT scores (dB SNR required for 50% speech understanding) with the speech source at 0°, 90°, 180°, and 270° azimuth were measured for each program while uncorrelated speech babble was presented simultaneously from four speakers. Participants also completed a short listening-effort questionnaire after each condition. Results showed that the new speech processing algorithm (adaptive directionality with SpeechZone2) did not improve speech understanding in noise compared with the omnidirectional condition (F(1,6) = 1.723; p = 0.237). Pairwise comparison with Bonferroni corrections (α = 0.0125) indicated a significant improvement only when speech was presented from 270° azimuth (p = 0.002). The ANOVA also revealed a significant effect of speech source location (F(3,18) = 5.62; p = 0.02). Regardless of directionality and speech processing, participants performed better when speech was presented from the sides (90° and 270°). A Wilcoxon signed-rank test showed no significant difference in self-reported listening-effort scores (Z = 0.637, p = 0.39). Large intersubject variability was observed in this small sample.
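The HINT score reported above (dB SNR for 50% understanding) is typically obtained with an adaptive 1-down/1-up track: SNR drops after a correct sentence and rises after an error, and the threshold is taken as a mean of presentation SNRs. The sketch below shows that tracking rule in simplified form; the clinical HINT additionally uses larger initial steps and specific averaging rules, so the function and parameters here are assumptions, not the test's exact protocol.

```python
def hint_srt(responses, start_snr=0.0, step=2.0):
    """Simplified 1-down/1-up adaptive track for a HINT-style speech
    reception threshold: lower the SNR after a correct response, raise it
    after an incorrect one, and report the mean of all presentation SNRs
    (including the SNR that would have been presented next)."""
    snr = start_snr
    track = []
    for correct in responses:
        track.append(snr)
        snr += -step if correct else step
    track.append(snr)  # SNR that the next sentence would have used
    return sum(track) / len(track)
```

Because the track converges on the level where correct and incorrect responses balance, the resulting mean estimates the 50%-intelligibility SNR.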

    Components of Auditory Closure

    Auditory closure (AC) is an aspect of auditory processing that is crucial for understanding speech in background noise. It is a set of abilities that allows listeners to understand speech in the absence of important information, both spectral and temporal. AC is evaluated using monaural low-redundancy speech tasks: low-pass filtered words (LPFW), time-compressed words (TCW), and words-in-noise (WiN). Although not previously used for this purpose, phonemic restoration with words (PhRW) has also been proposed as a measure of AC. In the present study, these four AC tasks were used to evaluate AC skills in 50 adult females with normal hearing. Pairwise correlations revealed no significant relationships among LPFW, TCW, and WiN. These three tasks were therefore considered independent components of AC, representing the AC abilities of spectral reconstruction, temporal resolution, and auditory induction, respectively. Multiple linear regression with LPFW, TCW, and WiN as predictors revealed that PhRW is accomplished using temporal resolution. The findings of this study show that no single AC task is representative of the entire process and that further research is warranted to more completely define the skills that make AC possible.

    Contributions of local speech encoding and functional connectivity to audio-visual speech perception

    Seeing a speaker’s face enhances speech intelligibility in adverse environments. We investigated the underlying network mechanisms by quantifying local speech representations and directed connectivity in MEG data obtained while human participants listened to speech of varying acoustic SNR and visual context. During high acoustic SNR, speech encoding by temporally entrained brain activity was strong in temporal and inferior frontal cortex, while during low SNR strong entrainment emerged in premotor and superior frontal cortex. These changes in local encoding were accompanied by changes in directed connectivity along the ventral stream and the auditory-premotor axis. Importantly, the behavioral benefit arising from seeing the speaker’s face was not predicted by changes in local encoding but rather by enhanced functional connectivity between temporal and inferior frontal cortex. Our results demonstrate a role of auditory-frontal interactions in visual speech representations and suggest that functional connectivity along the ventral pathway facilitates speech comprehension in multisensory environments.
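"Temporally entrained brain activity" of the kind described here is often quantified as coherence between the speech envelope and a neural signal in the delta-theta band. The sketch below uses that common proxy; the function name, frequency band, and windowing choices are assumptions, and the study's actual encoding measure may differ.

```python
import numpy as np
from scipy.signal import coherence

def entrainment_index(speech_env, neural_signal, fs, band=(1.0, 8.0)):
    """Mean magnitude-squared coherence between the speech envelope and
    one MEG sensor/source time series within a delta-theta band, as a
    simple proxy for speech-to-brain entrainment."""
    f, cxy = coherence(speech_env, neural_signal, fs=fs, nperseg=int(2 * fs))
    mask = (f >= band[0]) & (f <= band[1])
    return float(np.mean(cxy[mask]))
```

A value near 1 indicates that the neural signal tracks the envelope's slow fluctuations almost perfectly; values near 0 indicate no consistent phase/amplitude relationship in that band.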

    Modeling speech intelligibility based on the signal-to-noise envelope power ratio
