
    A Binaural Cochlear Implant Sound Coding Strategy Inspired by the Contralateral Medial Olivocochlear Reflex

    Objectives: In natural hearing, cochlear mechanical compression is dynamically adjusted via the efferent medial olivocochlear reflex (MOCR). These adjustments probably aid the understanding of speech in noisy environments and are not available to users of current cochlear implants (CIs). The aims of the present study were to: (1) present a binaural CI sound processing strategy inspired by the control of cochlear compression provided by the contralateral MOCR in natural hearing; and (2) assess the benefits of the new strategy for understanding speech presented in competition with steady noise with a speech-like spectrum in various spatial configurations of the speech and noise sources.

    Design: Pairs of CI sound processors (one per ear) were constructed either to mimic or not to mimic the effects of the contralateral MOCR on compression. In the non-mimicking condition (standard strategy, STD), the two processors in a pair functioned like standard clinical processors (i.e., with fixed back-end compression and independently of each other). When configured to mimic the effects of the MOCR (MOC strategy), the two processors communicated with each other, and the amount of back-end compression in a given frequency channel of each processor decreased/increased dynamically (so that output levels dropped/rose) with increases/decreases in the output energy of the corresponding frequency channel in the contralateral processor. Speech reception thresholds in speech-shaped noise were measured for 3 bilateral CI users and 2 single-sided-deaf unilateral CI users. Thresholds were compared for the STD and MOC strategies in unilateral and bilateral listening conditions and for three spatial configurations of the speech and noise sources in simulated free-field conditions: speech and noise colocated in front of the listener; speech at the left ear with noise in front of the listener; and speech at the left ear with noise at the right ear. In both bilateral and unilateral listening, the electrical stimulus delivered to the test ear(s) was always calculated as if the listeners were wearing bilateral processors.

    Results: In both unilateral and bilateral listening conditions, mean speech reception thresholds were comparable for the two strategies when the speech and noise sources were colocated, but were at least 2 dB lower (better) with the MOC than with the STD strategy when the sources were spatially separated. In unilateral listening, mean thresholds improved with increasing spatial separation between the speech and noise sources regardless of the strategy, but the improvement was significantly greater with the MOC strategy. In bilateral listening, thresholds improved significantly with increasing speech-noise spatial separation only with the MOC strategy.

    Conclusions: The MOC strategy (1) significantly improved the intelligibility of speech presented in competition with a spatially separated noise source, in both unilateral and bilateral listening conditions; (2) produced significant spatial release from masking in bilateral listening conditions, which did not occur with fixed compression; and (3) enhanced spatial release from masking in unilateral listening conditions. The MOC strategy as implemented here, or a modified version of it, may be usefully applied in CIs and in hearing aids.
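    As a rough sketch of the dynamic coupling described above, the following Python fragment adjusts each channel's back-end compression exponent from the contralateral channel's output energy. The exponent bounds, the tanh control law, and all parameter values are illustrative assumptions, not the published implementation.

```python
import numpy as np

def moc_coupled_compression(env_left, env_right,
                            c_min=0.2, c_max=0.8, alpha=2.0):
    """Sketch of MOC-style coupled back-end compression.

    env_left, env_right: (channels, frames) channel envelopes in [0, 1]
    from each ear's filter bank. c_min/c_max bound the compression
    exponent; alpha sets sensitivity to contralateral energy (all
    assumed values).
    """
    def compress(env, contra_env):
        # More contralateral output -> larger exponent -> less
        # compression -> lower output level (for env < 1), mimicking
        # the MOCR-inspired behaviour described in the abstract.
        # (A real implementation would smooth contra_env with
        # MOC-like time constants.)
        c = c_min + (c_max - c_min) * np.tanh(alpha * contra_env)
        return env ** c

    return compress(env_left, env_right), compress(env_right, env_left)
```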

    Objective Assessment of Machine Learning Algorithms for Speech Enhancement in Hearing Aids

    Speech enhancement in assistive hearing devices has been an area of research for many decades. Noise reduction is particularly challenging because of the wide variety of noise sources and the non-stationarity of speech and noise. The digital signal processing (DSP) algorithms deployed in modern hearing aids for noise reduction rely on certain assumptions about the statistical properties of undesired signals. These assumptions can hinder accurate estimation of different noise types, which in turn leads to suboptimal noise reduction. In this research, a relatively unexplored deep learning technique, the recurrent neural network (RNN), is used to perform noise reduction and dereverberation for assisting hearing-impaired listeners. For noise reduction, the performance of the deep learning model was evaluated objectively and compared with that of open Master Hearing Aid (openMHA), a conventional signal-processing framework, and with a deep neural network (DNN) based model. The RNN model was found to suppress noise and improve speech understanding better than both the conventional hearing aid noise reduction algorithm and the DNN model. The same RNN model was shown to reduce reverberation components given proper training. A real-time implementation of the deep learning model is also discussed.
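    For concreteness, a minimal mask-based RNN enhancer is sketched below in PyTorch. The architecture (two GRU layers predicting a sigmoid gain mask on noisy magnitude spectra) is an assumption for illustration, not the thesis model.

```python
import torch
import torch.nn as nn

class MaskRNN(nn.Module):
    """Minimal mask-based RNN enhancer (assumed architecture): a GRU
    predicts a per-bin gain in [0, 1] that is applied to the noisy
    magnitude spectrogram."""
    def __init__(self, n_freq=257, hidden=256):
        super().__init__()
        self.rnn = nn.GRU(n_freq, hidden, num_layers=2, batch_first=True)
        self.out = nn.Linear(hidden, n_freq)

    def forward(self, noisy_mag):           # (batch, frames, n_freq)
        h, _ = self.rnn(noisy_mag)
        mask = torch.sigmoid(self.out(h))   # time-frequency gain mask
        return mask * noisy_mag             # enhanced magnitude
```

    The enhanced magnitude would then be recombined with the noisy phase and inverted with an inverse STFT to produce the output waveform.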

    Signal processing algorithms for digital hearing aids

    Hearing loss is a problem that severely affects speech communication and prevents most hearing-impaired people from leading a normal life. Although the vast majority of hearing loss cases could be corrected with hearing aids, only a small fraction of the hearing-impaired people who could benefit from hearing aids actually purchase one. This limited adoption stems from a problem that, to date, has not been solved effectively and comfortably: the automatic adaptation of the hearing aid to the changing acoustic environment that surrounds its user. Two approaches aim to address it. The first, "manual" approach, in which the user has to identify the acoustic situation and choose the adequate amplification program, has been found to be very uncomfortable. The second approach embeds automatic program selection within the hearing aid; most hearing aid users deem this approach very useful, even when its performance is not perfect. Although the need for such a sound classification system seems clear, its implementation is a very difficult matter. Developing an automatic sound classification system in a digital hearing aid is a challenging goal because of the inherent limitations of the Digital Signal Processor (DSP) the hearing aid is based on: most digital hearing aids have very strong constraints in terms of computational capacity, memory, and battery, which seriously limit the implementation of advanced algorithms. With this in mind, this thesis focuses on the design and implementation of a prototype digital hearing aid that automatically classifies the acoustic environments hearing aid users face daily and selects the amplification program best adapted to each environment, aiming to enhance the speech intelligibility and sound quality perceived by the user. The battery life of this hearing aid is 140 hours, which is very similar to that of hearing aids on the market, and, importantly, about 30% of the DSP resources remain available for implementing other algorithms.
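    A minimal sketch of such an environment classifier is shown below, assuming a nearest-centroid decision over three cheap frame features; the feature set and environment labels are illustrative, not the prototype's actual design.

```python
import numpy as np

def frame_features(frame, fs=16000):
    """Low-cost features of the kind that fit a hearing-aid DSP budget
    (an assumed choice, not the thesis's exact feature set)."""
    energy = float(np.mean(frame ** 2))                         # short-term energy
    zcr = float(np.mean(np.abs(np.diff(np.sign(frame)))) / 2)   # zero-crossing rate
    spec = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), 1.0 / fs)
    centroid = float(np.sum(freqs * spec) / (np.sum(spec) + 1e-12))
    return np.array([energy, zcr, centroid])

def select_program(frame, centroids, fs=16000):
    """Nearest-centroid classification: `centroids` maps an environment
    label (e.g. 'speech_in_noise') to a feature centroid trained
    offline; the label then indexes the amplification program to load."""
    feats = frame_features(frame, fs)
    return min(centroids, key=lambda k: np.linalg.norm(feats - centroids[k]))
```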

    Evaluation of the sparse coding shrinkage noise reduction algorithm for the hearing impaired

    Although there are numerous single-channel noise reduction strategies intended to improve speech perception in noisy environments, most of them improve only speech quality, not speech intelligibility, for normal-hearing (NH) or hearing-impaired (HI) listeners. The current exceptions that can improve intelligibility are those that require a priori statistics of the speech or noise. Most noise reduction algorithms in hearing aids are adopted directly from algorithms designed for NH listeners, without taking into account the hearing loss factors of HI listeners, yet HI listeners suffer greater losses of speech intelligibility than NH listeners in the same noisy environment. Further study of monaural noise reduction algorithms for HI listeners is therefore required.

    The motivation was to adopt a model-based approach in contrast to the conventional Wiener filtering approach. A model-based algorithm called sparse coding shrinkage (SCS) was proposed to extract key speech information from noisy speech. The SCS algorithm was evaluated against a state-of-the-art Wiener filtering approach through speech intelligibility and quality tests with 9 NH and 9 HI listeners. The SCS algorithm matched the performance of the Wiener filtering algorithm in both speech intelligibility and speech quality. Both algorithms showed some intelligibility improvement for HI listeners but none for NH listeners, and both improved speech quality for HI and NH listeners alike.

    Additionally, a physiologically inspired hearing loss simulation (HLS) model was developed to characterize hearing loss factors and simulate the consequences of hearing loss. A methodology was proposed for evaluating signal processing strategies for HI listeners using the HLS model with NH subjects: NH subjects listened to unprocessed/enhanced speech passed through the HLS model. Some of the effects of the algorithms seen in HI listeners were reproduced, at least qualitatively, using the HLS model with NH listeners.

    Conclusions: The model-based SCS algorithm is promising for improving performance in stationary noise, although no clear difference was seen between the performance of SCS and a competitive Wiener filtering algorithm. Fluctuating noise is more difficult to reduce than stationary noise. Noise reduction algorithms may perform better at higher input signal-to-noise ratios (SNRs), where HI listeners can still benefit but NH listeners have already reached ceiling performance. The proposed HLS model can save time and cost when evaluating noise reduction algorithms for HI listeners.
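    For illustration, a minimal sketch of the sparse coding shrinkage idea: transform a frame into a sparsifying domain, apply a shrinkage nonlinearity derived from a sparse (here Laplacian) prior, and transform back. The DCT stands in for the learned basis an actual SCS system would use, and the parameter values are assumptions.

```python
import numpy as np
from scipy.fft import dct, idct

def sparse_shrink(u, sigma_noise, d=0.1):
    """Classic shrinkage rule for a Laplacian signal prior: soft
    thresholding with threshold sqrt(2) * sigma^2 / d, where d is the
    prior's scale (an assumed value here)."""
    t = np.sqrt(2.0) * sigma_noise ** 2 / d
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

def denoise_frame(frame, sigma_noise):
    # The DCT stands in for the learned sparsifying basis of true SCS.
    coeffs = dct(frame, norm='ortho')
    return idct(sparse_shrink(coeffs, sigma_noise), norm='ortho')
```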

    DESIGN AND EVALUATION OF HARMONIC SPEECH ENHANCEMENT AND BANDWIDTH EXTENSION

    Improving the quality and intelligibility of speech signals continues to be an important topic in mobile communications and hearing aid applications. This thesis explored the possibilities of improving the quality of corrupted speech by cascading a log Minimum Mean Square Error (logMMSE) noise reduction system with a Harmonic Speech Enhancement (HSE) system. In HSE, an adaptive comb filter is deployed to pass the useful harmonic speech components and suppress the noisy components toward the noise floor. A Bandwidth Extension (BWE) algorithm was applied to the enhanced speech for further improvement in speech quality. The performance of this algorithm combination was evaluated using objective speech quality metrics across a variety of noisy and reverberant environments. Results showed that the logMMSE and HSE combination enhanced speech quality in all reverberant environments tested and in the presence of multi-talker babble. The objective improvements associated with the BWE were found to be minimal.
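    As a sketch of the harmonic enhancement step, the comb filter below averages pitch-synchronous copies of the frame so that harmonics add coherently while aperiodic noise is attenuated; the tap count and the use of a circular shift are simplifications, not the thesis's exact filter.

```python
import numpy as np

def harmonic_comb(frame, pitch_period, taps=3):
    """Average the signal at multiples of the estimated pitch period
    (in samples, assumed to come from an external pitch tracker).
    Harmonic components align and add; noise between harmonics is
    suppressed toward the noise floor."""
    y = np.zeros_like(frame, dtype=float)
    for k in range(taps):
        # np.roll's circular shift is a brevity simplification; a real
        # filter would use zero-padded delays across frame boundaries.
        y += np.roll(frame, k * pitch_period)
    return y / taps
```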

    Development of algorithms for smart hearing protection devices

    In industrial environments, wearing hearing protection devices is required to protect wearers from high noise levels and prevent hearing loss. In addition to protecting against excessive noise, however, hearing protectors block other types of signals, even useful and convenient ones. Therefore, if people want to communicate and exchange information, they must remove their hearing protectors, which is inconvenient and can even be dangerous. To overcome the problems encountered with traditional passive hearing protection devices, this thesis outlines the steps and process followed in developing signal processing algorithms for a hearing protector that provides protection against external noise while allowing oral communication between wearers. This hearing protector is called the "smart hearing protection device": a traditional hearing protector in which a miniature digital signal processor is embedded to process incoming signals, along with a miniature microphone to pick up external signals and a miniature internal loudspeaker to deliver the processed signals to the protected ear. To enable oral communication without removing the smart hearing protectors, signal processing algorithms must be developed. The objective of this thesis therefore consists of developing a noise-robust voice activity detection algorithm and a noise reduction algorithm to improve the quality and intelligibility of the speech signal. The methodology followed for developing the algorithms is divided into three steps: first, the speech detection and noise reduction algorithms are developed; second, they are evaluated and validated in software; and third, they are implemented in the digital signal processor to validate their feasibility for the intended application. During the development of the two algorithms, the following constraints must be taken into account: the hardware resources of the digital signal processor embedded in the hearing protector (memory, number of operations per second), and the real-time constraint that the algorithm's processing time must not exceed a certain threshold, so as not to generate a perceptible delay between the active and passive paths of the hearing protector, or between lip movement and speech perception. From a scientific perspective, the thesis determines the thresholds that the digital signal processor should not exceed to avoid generating a perceptible delay between the active and passive paths of the hearing protector. These thresholds were obtained from a subjective study, which found that the tolerable delay depends on several parameters: (a) the degree of attenuation of the hearing protector, (b) the duration of the signal, (c) the level of the background noise, and (d) the type of the background noise. This study showed that when the fit of the hearing protector is shallow, 20 % of participants begin to perceive a delay after 8 ms for a bell sound (transient), 16 ms for a clean speech signal, and 22 ms for a speech signal corrupted by babble noise. With a deep hearing protection fit, on the other hand, the perceptible delay between the two paths is 18 ms for the bell signal and 26 ms for the speech signal without noise, with no delay perceived when speech is corrupted by babble noise, showing that better attenuation allows more time for digital signal processing.
Second, this work presents a new voice activity detection algorithm based on a low-complexity speech feature. This feature is calculated as the ratio between the signal's energy in the frequency region containing the first formant, which characterizes the speech signal, and the energy at low or high frequencies, which characterizes the noise signals. The evaluation of this algorithm and its comparison with a benchmark algorithm demonstrated its selectivity, with a false positive rate averaged over three signal-to-noise ratios (SNRs) (10, 5, and 0 dB) of 4.2 % and a true positive rate of 91.4 %, compared with a false positive rate of 29.9 % and a true positive rate of 79.0 % for the benchmark algorithm. Third, this work shows that extracting the temporal envelope of a signal to drive a nonlinear, adaptive gain function reduces background noise, improves the quality of the speech signal, and generates the least musical noise compared with three other benchmark algorithms. The development of the speech detection and noise reduction algorithms, their objective and subjective evaluation in different noise environments, and their implementation in digital signal processors validated their efficiency and low complexity for the smart hearing protection application.
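    As a rough illustration of the voice activity cue described above, the following Python sketch computes the energy ratio between an assumed first-formant band and the noise-dominated bands; the band edges and decision threshold are illustrative assumptions, not the thesis's values.

```python
import numpy as np

def vad_feature(frame, fs=16000):
    """Ratio of energy in an assumed first-formant band (300-1000 Hz)
    to energy in the noise-dominated low/high bands; band edges are
    illustrative, not the thesis's exact values."""
    spec = np.abs(np.fft.rfft(frame)) ** 2
    f = np.fft.rfftfreq(len(frame), 1.0 / fs)
    formant = spec[(f >= 300) & (f <= 1000)].sum()
    outside = spec[(f < 300) | (f > 4000)].sum() + 1e-12
    return formant / outside

def is_speech(frame, threshold=5.0, fs=16000):
    # The threshold would be tuned on labelled data; 5.0 is illustrative.
    return vad_feature(frame, fs) > threshold
```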

    Speech Recognition

    Chapters in the first part of the book cover all the essential speech processing techniques for building robust automatic speech recognition systems: the representation of speech signals, methods for speech-feature extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition for speaker identification and tracking, for prosody modeling in emotion-detection systems, and in other speech processing applications able to operate in real-world environments, such as mobile communication services and smart homes.

    An Electrode Stimulation Strategy for Cochlear Implants Based on a Model of Human Hearing

    Cochlear implants (CIs) combined with professional rehabilitation have enabled several hundred thousand hearing-impaired individuals to re-enter the world of verbal communication. Though very successful, current CI systems seem to have reached their peak potential. The fact that most recipients claim not to enjoy listening to music and are not capable of carrying on a conversation in noisy or reverberant environments shows that there is still room for improvement. This dissertation presents a new cochlear implant signal processing strategy called Stimulation based on Auditory Modeling (SAM), which is completely based on a computational model of the human peripheral auditory system. SAM was evaluated in three ways: with simplified perception models of CI listeners, with five cochlear implant users, and with 27 normal-hearing subjects using an acoustic model of CI perception. Results were always compared with those acquired using the Advanced Combination Encoder (ACE), today's most prevalent CI strategy. Initial simulations showed that the speech intelligibility of CI users fitted with SAM should be just as good as that of CI listeners fitted with ACE.
Furthermore, it has been shown that SAM provides more accurate binaural cues, which can potentially enhance the sound source localization ability of bilaterally fitted implantees. Simulations also revealed an increased amount of temporal pitch information provided by SAM. The subsequent pilot study revealed several benefits of using SAM. First, there was a significant improvement in pitch discrimination of pure tones and sung vowels. Second, CI users fitted with a contralateral hearing aid reported a more natural sound of both speech and music. Third, all subjects grew accustomed to SAM in a very short period of time (on the order of 10 to 30 minutes), which is particularly important given that a successful CI strategy change typically takes weeks to months. An additional test with 27 normal-hearing listeners using an acoustic model of CI perception delivered further evidence of improved pitch discrimination with SAM as compared with ACE. Although SAM is not yet a market-ready alternative, it strives to pave the way for future strategies based on auditory models, and it is a promising candidate for further research and investigation.
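    Acoustic models of CI perception for normal-hearing listeners are commonly built as noise vocoders. The sketch below shows the generic technique (band splitting, envelope extraction, noise-carrier modulation) under assumed channel counts and band edges; the thesis's own acoustic model may differ.

```python
import numpy as np
from scipy.signal import butter, sosfilt, hilbert

def noise_vocoder(x, fs, n_channels=12):
    """Generic noise-vocoder CI simulation (assumed parameters).
    Requires fs >= 16 kHz so the top band edge stays below Nyquist."""
    edges = np.logspace(np.log10(200), np.log10(7000), n_channels + 1)
    out = np.zeros(len(x))
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype='bandpass', fs=fs, output='sos')
        band = sosfilt(sos, x)
        env = np.abs(hilbert(band))                      # channel envelope
        carrier = sosfilt(sos, np.random.randn(len(x)))  # band-limited noise
        out += env * carrier                             # modulate and sum
    return out / (np.max(np.abs(out)) + 1e-12)           # normalize peak
```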