159 research outputs found

    Non-Intrusive Speech Intelligibility Prediction


    The directional effect of target position on spatial selective auditory attention

    Spatial selective auditory attention plays a crucial role in listening to a mixture of competing speech sounds. Previous neuroimaging studies have reported alpha-band neural activity modulated by auditory attention, along with alpha lateralization corresponding to the attentional focus. A greater cortical representation of the attended speech envelope compared to the ignored speech envelope has also been found, a phenomenon known as 'neural speech tracking'. However, little is known about neural activity when attention is directed to speech sounds from behind the listener, even though understanding speech from behind is a common and essential aspect of daily life. The objective of this study was to investigate the impact of four distinct target positions (left, right, front, and, in particular, behind) on spatial selective auditory attention by concurrently assessing 1) spatial selective speech identification, 2) oscillatory alpha-band power, and 3) neural speech tracking. Fifteen young adults with normal hearing (NH) were enrolled (mean age 21.40, range 18-29; 10 females). The selective speech identification task indicated that the back target position was the most challenging, followed by the front position, with the lateral positions being the least demanding. The normalized alpha power was modulated by target position and was significantly lateralized for the left and right targets, but not for the front and back targets. Parieto-occipital alpha power in the front-back configuration was significantly lower than in the left-right configuration, and the normalized alpha power in the back condition was significantly higher than in the front condition. The speech tracking of the to-be-attended speech envelope was affected by the direction of the target stream. 
The behavioral outcome (selective speech identification) was correlated with parieto-occipital alpha power and with the neural speech tracking correlation coefficient as neural correlates of auditory attention, but there was no significant correlation between alpha power and neural speech tracking. The results suggest that, in addition to existing mechanistic theories, it may be necessary to consider how the brain responds depending on the location of a sound in order to interpret the neural correlates and behavioral consequences meaningfully, as well as for potential applications of neural speech tracking in studies on spatial selective hearing.
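The two neural measures reported above, alpha lateralization and envelope tracking, can be sketched numerically. This is a minimal illustration under assumptions of ours (the function names and the simple Pearson-correlation tracking measure are not the study's actual pipeline):

```python
import numpy as np

def alpha_lateralization_index(power_left, power_right):
    """Lateralization index from parieto-occipital alpha power.

    Positive values indicate more alpha power over the left hemisphere;
    the sign convention and normalization are illustrative assumptions.
    """
    return (power_left - power_right) / (power_left + power_right)

def neural_tracking_score(eeg, envelope):
    """Pearson correlation between one (band-passed) EEG channel and the
    attended speech envelope, used here as a stand-in for neural speech
    tracking."""
    eeg = np.asarray(eeg, dtype=float)
    env = np.asarray(envelope, dtype=float)
    eeg = eeg - eeg.mean()
    env = env - env.mean()
    return float(eeg @ env / (np.linalg.norm(eeg) * np.linalg.norm(env)))
```

In practice such measures are computed per condition (left, right, front, back targets) and then compared across conditions, as in the analysis summarized above.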

    Informed Sound Source Localization for Hearing Aid Applications


    The acoustics of concentric sources and receivers – human voice and hearing applications

    One of the most common ways in which we experience environments acoustically is by listening to the reflections of our own voice in a space. By listening to our own voice we adjust its characteristics to suit the task and audience. This is of particular importance in critical voice tasks, such as actors or singers performing on a stage with no electroacoustic or other amplification (e.g. in-ear monitors, loudspeakers, etc.). Despite how common this situation is, there are very few acoustic measurements that aim to quantify it, and even fewer that address the problem of a source and receiver that are very closely located. The aim of this thesis is to introduce new measurement transducers and methods that correctly quantify this situation. This is achieved by analysing the characteristics of the human as a source, as a receiver, and their interaction in close proximity when placed in acoustic environments. The characteristics of the human voice and human ear are analysed in much the same way as a loudspeaker or microphone would be. This provides the basis for further analysis by making them analogous to measurement transducers. These results are then used to explore the consequences of having a source and receiver very closely located, using acoustic room simulation. Different techniques for processing data from directional transducers in real rooms are introduced. The majority of the data used in this thesis was obtained in rooms used for performance. The final chapters include details of the design and construction of a concentric directional transducer, in which an array of microphones and loudspeakers occupy the same structure. Finally, sample measurements with this transducer are presented.

    Spatial release from masking in children with and without auditory processing disorder in real and virtual auditory environments

    Auditory Processing Disorder (APD) is a developmental disorder characterised by difficulties in listening to speech in noise despite normal audiometric thresholds. The disorder is still poorly understood and much disputed, and there is a need for better diagnostic tools. One promising finding is that some children referred for APD assessment show a reduced spatial release from masking (SRM). Current clinical tests measure SRM in virtual auditory environments created from the head-related transfer functions (HRTFs) of a standardised adult head. Adults and children, however, have different head dimensions, and mismatched HRTFs are known to affect aspects of binaural hearing such as localisation. There has been little research on HRTFs in children, and it is unclear whether a large mismatch can impact speech perception, especially for children with APD, who have difficulties with accurately processing auditory information. In this project, we examined the effect of nonindividualised virtual auditory environments on the SRM in adults and children with and without APD. The first study, with normal-hearing adults, compared environments created from individually measured HRTFs and two nonindividualised sets of HRTFs to a real anechoic environment. Speech reception thresholds (SRTs) were measured for target sentences at 0° and two symmetric speech maskers at 0° or ±90° azimuth. No significant effect of auditory environment on SRTs or SRM was observed. A larger study was then conducted with children with APD and typically-developing children aged 7 to 12 years. Individual HRTFs were measured for each child. The SRM was measured in environments created from these individualised HRTFs or artificial-head HRTFs, and in the real anechoic environment. To assess the influence of spectral cues, SRTs were also measured for HRTFs from a spherical head model that contains only interaural time and level differences. 
Additionally, the study included an extended high-frequency audiogram, a receptive language test and two parental questionnaires. The SRTs of children with APD were worse than those of typically-developing children in all conditions, but SRMs were similar. Only small differences in SRTs were found across environments, mainly for the spherical head HRTFs. SRTs in children were higher than in adults but improved with age. Children with APD also had higher hearing thresholds and performed worse in the language test.
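The SRM quantity that these tests measure is simply the improvement in speech reception threshold when the maskers move away from the target. A minimal sketch, with illustrative names not taken from the thesis:

```python
def spatial_release_from_masking(srt_colocated_db, srt_separated_db):
    """SRM in dB: the drop in speech reception threshold (SRT) when
    maskers move from the target position (0 deg) to spatially separated
    positions (e.g. +/-90 deg). Lower SRTs are better, so a positive SRM
    indicates a spatial benefit."""
    return srt_colocated_db - srt_separated_db
```

For example, an SRT of -2 dB with colocated maskers and -8 dB with separated maskers corresponds to an SRM of 6 dB.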

    Computational modelling of neural mechanisms underlying natural speech perception

    Humans are highly skilled at the analysis of complex auditory scenes. In particular, the human auditory system is characterized by incredible robustness to noise and can nearly effortlessly isolate the voice of a specific talker from even the busiest of mixtures. However, the neural mechanisms underlying these remarkable properties remain poorly understood. This is mainly due to the inherent complexity of speech signals and the multi-stage, intricate processing performed in the human auditory system. Understanding the neural mechanisms underlying speech perception is of interest for clinical practice, brain-computer interfacing and automatic speech processing systems. In this thesis, we developed computational models characterizing neural speech processing across different stages of the human auditory pathway. In particular, we studied the active role of slow cortical oscillations in speech-in-noise comprehension through a spiking neural network model for encoding spoken sentences. The neural dynamics of the model during noisy speech encoding reflected the speech comprehension of young, normal-hearing adults. The proposed theoretical model was validated by predicting the effects of non-invasive brain stimulation on speech comprehension in an experimental study involving a cohort of volunteers. Moreover, we developed a modelling framework for detecting the early, high-frequency neural response to uninterrupted speech in non-invasive neural recordings. We applied the method to investigate top-down modulation of this response by the listener's selective attention and by linguistic properties of different words from a spoken narrative. We found that in both cases the detected responses, of predominantly subcortical origin, were significantly modulated, which supports the functional role of feedback between higher and lower stages of the auditory pathway in speech perception. 
The proposed computational models shed light on some of the poorly understood neural mechanisms underlying speech perception. The developed methods can be readily employed in future studies involving a range of experimental paradigms beyond those considered in this thesis.
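Continuous-speech analyses of this kind are often implemented as a temporal response function (TRF) estimated by ridge regression from the speech envelope to the EEG. A minimal numpy sketch under that assumption (this is a generic forward model, not the thesis's spiking-network or high-frequency detection code):

```python
import numpy as np

def estimate_trf(envelope, eeg, n_lags=32, lam=1.0):
    """Estimate a forward temporal response function mapping a speech
    envelope (with time lags) to one EEG channel via ridge regression.

    envelope, eeg : 1-D arrays of equal length, sampled at the same rate.
    n_lags        : number of envelope lags (model length in samples).
    lam           : ridge regularization strength.
    """
    envelope = np.asarray(envelope, dtype=float)
    eeg = np.asarray(eeg, dtype=float)
    n = len(envelope)
    # Design matrix: lagged copies of the envelope, zero-padded at the start.
    X = np.zeros((n, n_lags))
    for k in range(n_lags):
        X[k:, k] = envelope[: n - k]
    # Ridge solution: w = (X'X + lam*I)^{-1} X'y
    return np.linalg.solve(X.T @ X + lam * np.eye(n_lags), X.T @ eeg)
```

Attention or word-level effects are then assessed by comparing TRFs (or their prediction accuracy) across conditions.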

    Sound Localization in Single-Sided Deaf Participants Provided With a Cochlear Implant

    Spatial hearing is crucial in real life but deteriorates in participants with severe sensorineural hearing loss or single-sided deafness. This ability can potentially be improved with a unilateral cochlear implant (CI). The present study investigated sound localization in participants with single-sided deafness provided with a CI. Sound localization was measured separately at eight loudspeaker positions (4°, 30°, 60°, and 90° azimuth, on the CI side and on the normal-hearing side). Low- and high-frequency noise bursts were used to investigate possible differences in the processing of interaural time and level differences. Data were compared to those of normal-hearing adults aged between 20 and 83. In addition, the benefit of the CI for speech understanding in noise was compared to localization ability. Fifteen of 18 participants were able to localize signals on the CI side and on the normal-hearing side, although performance was highly variable across participants. Three participants always pointed to the normal-hearing side, irrespective of the location of the signal. The comparison with control data showed that participants had particular difficulty localizing sounds at frontal locations and on the CI side. In contrast to most previous results, participants were able to localize low-frequency signals, although they localized high-frequency signals more accurately. Speech understanding in noise was better with the CI than without it, but only at a position where the CI also improved sound localization. Our data suggest that a CI can, to a large extent, restore localization in participants with single-sided deafness. Difficulties may remain at frontal locations and on the CI side. However, speech understanding in noise improves when wearing the CI. Treatment with a CI in these participants might provide real-world benefits, such as improved orientation in traffic and speech understanding in difficult listening situations.
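Localization performance in tests like this is commonly summarized as a root-mean-square azimuth error across trials, optionally per loudspeaker position. A minimal sketch of that metric (an assumed summary statistic, not necessarily the one used in this study):

```python
import numpy as np

def rms_localization_error(target_az_deg, response_az_deg):
    """Root-mean-square azimuth error in degrees across trials.

    target_az_deg   : presented loudspeaker azimuths, one per trial.
    response_az_deg : azimuths the participant pointed to, one per trial.
    """
    t = np.asarray(target_az_deg, dtype=float)
    r = np.asarray(response_az_deg, dtype=float)
    return float(np.sqrt(np.mean((r - t) ** 2)))
```

Computing this separately for CI-side and normal-hearing-side targets makes the side asymmetry described above directly visible.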

    Improving the Speech Intelligibility By Cochlear Implant Users

    In this thesis, we focus on improving the intelligibility of speech for cochlear implant (CI) users. As an auditory prosthetic device, a CI can restore hearing sensation for most patients with profound hearing loss in both ears in a quiet background. However, CI users still have serious problems understanding speech in noisy and reverberant environments. Bandwidth limitation, missing temporal fine structure, and reduced spectral resolution due to a limited number of electrodes are further factors that make hearing in noisy conditions difficult for CI users, regardless of the type of noise. To mitigate these difficulties for CI listeners, we investigate several contributing factors, such as the effect of low harmonics on tone identification in natural and vocoded speech, the contribution of a matched envelope dynamic range to binaural benefits, and the contribution of low-frequency harmonics to tone identification in quiet and in a six-talker babble background. These results revealed several promising methods for improving speech intelligibility for CI patients. In addition, we investigate the benefits of voice conversion for improving speech intelligibility for CI users, motivated by an earlier study showing that familiarity with a talker's voice can improve understanding of a conversation. Research has shown that when adults are familiar with someone's voice, they can more accurately, and even more quickly, process and understand what the person is saying. This effect, known as the "familiar talker advantage", was our motivation to examine its impact on CI patients using a voice conversion technique. In the present research, we propose a new method based on multi-channel voice conversion to improve the intelligibility of transformed speech for CI patients.
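The vocoded speech mentioned above is typically produced with a noise-band vocoder, a standard acoustic simulation of CI processing. A crude FFT-based sketch (the channel count, band edges, and unsmoothed envelope are simplifying assumptions, not the thesis's actual vocoder):

```python
import numpy as np

def noise_vocoder(signal, fs, n_channels=8, fmin=100.0, fmax=8000.0):
    """Crude n-channel noise vocoder: split the signal into log-spaced
    frequency bands, take each band's rectified envelope, and use it to
    modulate band-limited noise. Filtering is done by zeroing FFT bins,
    which is a brevity shortcut rather than a proper filter bank."""
    signal = np.asarray(signal, dtype=float)
    edges = np.geomspace(fmin, fmax, n_channels + 1)
    freqs = np.fft.rfftfreq(len(signal), 1.0 / fs)
    spec = np.fft.rfft(signal)
    rng = np.random.default_rng(0)
    out = np.zeros(len(signal))
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_mask = (freqs >= lo) & (freqs < hi)
        band = np.fft.irfft(spec * band_mask, n=len(signal))
        envelope = np.abs(band)  # rectified envelope (no smoothing here)
        noise = rng.standard_normal(len(signal))
        noise_band = np.fft.irfft(np.fft.rfft(noise) * band_mask,
                                  n=len(signal))
        out += envelope * noise_band
    return out
```

Listening to the output of such a simulation with normal hearing gives a rough impression of the spectral and temporal degradation CI users face.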