Analogue CMOS Cochlea Systems: A Historic Retrospective

Author: Andreas Katsiamis
Emmanuel Drakakis
Publication venue: 'IntechOpen'
Publication date: 26/04/2011

Temporal Filterbanks in Cochlear Implant Hearing and Deep Learning Simulations

Author: Lin Payton
Publication venue: 'IntechOpen'
Publication date: 29/03/2017

The masking phenomenon has been used to investigate cochlear excitation patterns and has even motivated audio coding formats for compression and speech processing. For example, cochlear implants rely on masking estimates to filter incoming sound signals onto an array. Historically, the critical band theory has been the mainstay of psychoacoustic theory. However, masked threshold shifts in cochlear implant users show a discrepancy between the observed critical bandwidths, suggesting separate roles for place location and temporal firing patterns. In this chapter, we will compare discrimination tasks in the spectral domain (e.g., power spectrum models) and the temporal domain (e.g., temporal envelope) to introduce new concepts such as profile analysis, temporal critical bands, and transition bandwidths. These recent findings violate the fundamental assumptions of the critical band theory and could explain why the masking curves of cochlear implant users display spatial and temporal characteristics that are quite unlike those of acoustic stimulation. To provide further insight, we also describe a novel analytic tool based on deep neural networks. This deep learning system can simulate many aspects of the auditory system, and will be used to compute the efficiency of spectral filterbanks (referred to as “FBANK”) and temporal filterbanks (referred to as “TBANK”).

IntechOpen

Sensorineural hearing loss enhances auditory sensitivity and temporal integration for amplitude modulation.

Author: Ewert Stephan D
Lorenzi Christian
Moore Brian CJ
Wallaert Nicolas
Publication venue: Journal of the Acoustical Society of America
Publication date: 22/02/2017

Amplitude-modulation detection thresholds (AMDTs) were measured at 40 dB sensation level for listeners with mild-to-moderate sensorineural hearing loss (age: 50-64 yr) for a carrier frequency of 500 Hz and rates of 2 and 20 Hz. The number of modulation cycles, N, varied between two and nine. The data were compared with AMDTs measured for young and older normal-hearing listeners [Wallaert, Moore, and Lorenzi (2016). J. Acoust. Soc. Am. 139, 3088-3096]. As for normal-hearing listeners, AMDTs were lower for the 2-Hz than for the 20-Hz rate, and AMDTs decreased with increasing N. AMDTs were lower for hearing-impaired listeners than for normal-hearing listeners, and the effect of increasing N was greater for hearing-impaired listeners. A computational model based on the modulation-filterbank concept and a template-matching decision strategy was developed to account for the data. The psychophysical and simulation data suggest that the loss of amplitude compression in the impaired cochlea is mainly responsible for the enhanced sensitivity and temporal integration of temporal envelope cues found for hearing-impaired listeners. The data also suggest that, for AM detection, cochlear damage is associated with increased internal noise, but preserved short-term memory and decision mechanisms. N.W. was supported by a grant from Neurelec Oticon Medical. C.L. was supported by two grants from ANR (HEARFIN and HEART projects). S.D.E. was supported by Deutsche Forschungsgemeinschaft (DFG) FOR 1732 (TPE). B.C.J.M. was supported by the EPSRC (UK, grant RG78536). This work was also supported by ANR-11-0001-02 PSL* and ANR-10-LABX-0087. We thank Nihaad Paraouty and two anonymous reviewers for helpful comments and suggestions relating to this study.
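
The stimulus described in this abstract (a 500-Hz carrier, amplitude-modulated at 2 or 20 Hz for N cycles) can be illustrated with a minimal sketch. The function name `am_tone`, the 16-kHz sample rate, the modulation depth of 0.5, and the sine starting phase of the modulator are assumptions made for illustration only; the abstract does not give them, and the study's actual stimuli (level calibration, onset and offset ramps) are not reproduced here.

```python
import math

def am_tone(carrier_hz, rate_hz, n_cycles, depth, sample_rate=16000):
    """Sinusoidally amplitude-modulated tone lasting n_cycles modulation cycles.

    sample_rate and depth are illustrative assumptions, not values from the study.
    """
    # Duration is set by the number of modulation cycles: n_cycles / rate_hz seconds.
    n_samples = round(n_cycles * sample_rate / rate_hz)
    samples = []
    for i in range(n_samples):
        t = i / sample_rate
        # Envelope oscillates between (1 - depth) and (1 + depth).
        envelope = 1.0 + depth * math.sin(2 * math.pi * rate_hz * t)
        samples.append(envelope * math.sin(2 * math.pi * carrier_hz * t))
    return samples

# Two cycles of 2-Hz modulation on a 500-Hz carrier: a 1-second stimulus.
stimulus = am_tone(carrier_hz=500, rate_hz=2, n_cycles=2, depth=0.5)
```

Because the duration is N divided by the modulation rate, the same N gives a much shorter stimulus at 20 Hz than at 2 Hz, which is worth keeping in mind when reading the temporal-integration results.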

Apollo (Cambridge)

A comparative study of eight human auditory models of monaural processing

Author: Bruce Ian C.
Carney Laurel H.
Dau Torsten
Majdak Piotr
Varnet Léo
Vecchi Alejandro Osses
Verhulst Sarah
Publication venue: 'EDP Sciences'
Publication date: 01/01/2022

A number of auditory models have been developed using diverging approaches, either physiological or perceptual, but they share comparable stages of signal processing, as they are inspired by the same constitutive parts of the auditory system. We compare eight monaural models that are openly accessible in the Auditory Modelling Toolbox. We discuss the considerations required to make the model outputs comparable to each other, as well as the results for the following model processing stages or their equivalents: Outer and middle ear, cochlear filter bank, inner hair cell, auditory nerve synapse, cochlear nucleus, and inferior colliculus. The discussion includes a list of recommendations for future applications of auditory models. Comment: Revision 1 of the manuscript

arXiv.org e-Print Archive

EDP Sciences OAI-PMH repository (1.2.0)

Ghent University Academic Bibliography

PubMed Central

Online Research Database In Technology

Methods of Optimizing Speech Enhancement for Hearing Applications

Author: Liu Fangqi
Publication venue: UCL (University College London)
Publication date: 28/09/2019

Speech intelligibility in hearing applications suffers from background noise. One of the most effective solutions is to develop speech enhancement algorithms based on the biological traits of the auditory system. In humans, the medial olivocochlear (MOC) reflex, which is an auditory neural feedback loop, increases signal-in-noise detection by suppressing cochlear response to noise. The time constant is one of the key attributes of the MOC reflex as it regulates the variation of suppression over time. Different time constants have been measured in nonhuman mammalian and human auditory systems. Physiological studies reported that the time constant of the nonhuman mammalian MOC reflex varies with changes in the properties (e.g. frequency, bandwidth) of the stimulation. A human-based study suggests that the time constant could vary when the bandwidth of the noise is changed. Previous works have developed MOC reflex models and successfully demonstrated the benefits of simulating the MOC reflex for speech-in-noise recognition. However, they often used fixed time constants. The effect of different time constants on speech perception remains unclear. The main objectives of the present study are (1) to study the effect of the MOC reflex time constant on speech perception in different noise conditions; (2) to develop a speech enhancement algorithm with dynamic time constant optimization to adapt to varying noise conditions for improving speech intelligibility. The first part of this thesis studies the effect of the MOC reflex time constants on speech-in-noise perception. Conventional studies do not consider the relationship between the time constants and speech perception as it is difficult to measure the speech intelligibility changes due to varying time constants in human subjects. We use a model to investigate the relationship by incorporating Meddis’ peripheral auditory model (which includes a MOC reflex) with an automatic speech recognition (ASR) system.
The effect of the MOC reflex time constant is studied by adjusting the time constant parameter of the model and testing the speech recognition accuracy of the ASR. Different time constants derived from human data are evaluated in both speech-like and non-speech-like noise at SNR levels from -10 dB to 20 dB and in the clean speech condition. The results show that long time constants (≥1000 ms) provide a greater improvement of speech recognition accuracy at SNR levels ≤10 dB. A maximum accuracy improvement of 40% (compared to the no-MOC condition) is shown in pink noise at an SNR of 10 dB. Short time constants (<1000 ms) show recognition accuracy over 5% higher than the longer ones at SNR levels ≥15 dB. The second part of the thesis develops a novel speech enhancement algorithm based on the MOC reflex with a time constant that is dynamically optimized according to a lookup table for varying SNRs. The main contributions of this part include: (1) Existing SNR estimation methods are challenged by low SNR, nonstationary noise, and computational complexity. High computational complexity increases processing delay, which degrades intelligibility. A variance of spectral entropy (VSE) based SNR estimation method is developed, as entropy-based features have been shown to be more robust in cases of low SNR and nonstationary noise. The SNR is estimated according to the estimated VSE-SNR relationship functions by measuring the VSE of noisy speech. Our proposed method achieves an accuracy 5 dB higher than other methods, especially in babble noise with few talkers (2 talkers) and at low SNR levels (< 0 dB), with an average processing time only about 30% of that of the noise-power-estimation-based method. The proposed SNR estimation method is further improved by implementing a nonlinear filter-bank. The compression of the nonlinear filter-bank is shown to increase the stability of the relationship functions.
As a result, the accuracy is improved by up to 2 dB in all types of tested noise. (2) A modification of Meddis’ MOC reflex model with a time constant dynamically optimized against varying SNRs is developed. The model includes simulated inner hair cell response to reduce the model complexity, and now includes the SNR estimation method. Previous MOC reflex models often have fixed time constants that do not adapt to varying noise conditions, whilst our modified MOC reflex model has a time constant dynamically optimized according to the estimated SNRs. The results show a speech recognition accuracy 8% higher than that of the model using a fixed time constant of 2000 ms in different types of noise. (3) A speech enhancement algorithm is developed based on the modified MOC reflex model and implemented in an existing hearing aid system. The performance is evaluated by measuring the objective speech intelligibility metric of processed noisy speech. In different types of noise, the proposed algorithm increases intelligibility by at least 20% in comparison to unprocessed noisy speech at SNRs between 0 dB and 20 dB, and by over 15% in comparison to processed noisy speech using the original MOC based algorithm in the hearing aid.
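
The variance of spectral entropy (VSE) feature named in this abstract can be sketched as follows. This is only one plausible reading of the term: the spectral entropy of each frame is the Shannon entropy of its normalized power spectrum, and the VSE is the variance of that entropy across frames. The function names, the frame length of 64 samples, the hop of 32 samples, and the naive DFT are assumptions for illustration; the thesis's own framing, filter-bank, and VSE-SNR relationship functions are not given in the abstract and are not reproduced here.

```python
import cmath
import math

def power_spectrum(frame):
    """Naive DFT power spectrum over the non-negative frequency bins."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum

def spectral_entropy(frame):
    """Shannon entropy (in nats) of the frame's normalized power spectrum."""
    power = power_spectrum(frame)
    total = sum(power)
    if total == 0.0:
        return 0.0  # silent frame: treat entropy as zero (assumption)
    probs = [p / total for p in power]
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def variance_of_spectral_entropy(signal, frame_len=64, hop=32):
    """Population variance of per-frame spectral entropy (assumed framing)."""
    starts = range(0, len(signal) - frame_len + 1, hop)
    entropies = [spectral_entropy(signal[s:s + frame_len]) for s in starts]
    if not entropies:
        raise ValueError("signal is shorter than one frame")
    mean = sum(entropies) / len(entropies)
    return sum((h - mean) ** 2 for h in entropies) / len(entropies)

# A stationary tone has the same entropy in every frame, so its VSE is near zero.
tone = [math.sin(2 * math.pi * 8 * i / 64) for i in range(256)]
vse_tone = variance_of_spectral_entropy(tone)
```

Mapping a measured VSE to an SNR estimate would need the fitted relationship functions described in the thesis, which is why the sketch stops at the feature itself.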

UCL Discovery

Idealized computational models for auditory receptive fields

Author: Friberg Anders
Lindeberg Tony
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2014

This paper presents a theory by which idealized models of auditory receptive fields can be derived in a principled axiomatic manner, from a set of structural properties to enable invariance of receptive field responses under natural sound transformations and ensure internal consistency between spectro-temporal receptive fields at different temporal and spectral scales. For defining a time-frequency transformation of a purely temporal sound signal, it is shown that the framework allows for a new way of deriving the Gabor and Gammatone filters as well as a novel family of generalized Gammatone filters, with additional degrees of freedom to obtain different trade-offs between the spectral selectivity and the temporal delay of time-causal temporal window functions. When applied to the definition of a second-layer of receptive fields from a spectrogram, it is shown that the framework leads to two canonical families of spectro-temporal receptive fields, in terms of spectro-temporal derivatives of either spectro-temporal Gaussian kernels for non-causal time or the combination of a time-causal generalized Gammatone filter over the temporal domain and a Gaussian filter over the logspectral domain. For each filter family, the spectro-temporal receptive fields can be either separable over the time-frequency domain or be adapted to local glissando transformations that represent variations in logarithmic frequencies over time. Within each domain of either non-causal or time-causal time, these receptive field families are derived by uniqueness from the assumptions. It is demonstrated how the presented framework allows for computation of basic auditory features for audio processing and that it leads to predictions about auditory receptive fields with good qualitative similarity to biological receptive fields measured in the inferior colliculus (ICC) and primary auditory cortex (A1) of mammals. Comment: 55 pages, 22 figures, 3 tables
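
For orientation, the classical Gammatone filter mentioned in this abstract has the well-known impulse response g(t) = t^(n-1) exp(-2*pi*b*t) cos(2*pi*f*t + phi) for t >= 0. The sketch below samples that standard form only; it does not implement the paper's generalized Gammatone family or its axiomatic derivation. The function name `gammatone_ir` and the default order, duration, and sample rate are illustrative assumptions.

```python
import math

def gammatone_ir(center_hz, bandwidth_hz, order=4, duration_s=0.025,
                 sample_rate=16000, phase=0.0):
    """Sampled classical Gammatone impulse response (unnormalized).

    order, duration_s and sample_rate are illustrative defaults, not values
    taken from the paper.
    """
    n_samples = round(duration_s * sample_rate)
    response = []
    for i in range(n_samples):
        t = i / sample_rate
        # Gamma-shaped envelope: rises as t^(order-1), then decays exponentially.
        envelope = t ** (order - 1) * math.exp(-2 * math.pi * bandwidth_hz * t)
        response.append(envelope * math.cos(2 * math.pi * center_hz * t + phase))
    return response

# A 1-kHz channel with a 125-Hz bandwidth parameter.
ir = gammatone_ir(center_hz=1000, bandwidth_hz=125)
```

The envelope peaks at t = (order - 1) / (2*pi*bandwidth_hz), so narrowing the bandwidth sharpens spectral selectivity but delays the response, which is the selectivity-delay trade-off the abstract refers to.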

arXiv.org e-Print Archive

Publikationer från KTH

Public Library of Science (PLOS)

Directory of Open Access Journals

PubMed Central

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Communications Biophysics

Author: Boduch Raymond
Braida Louis D.
Bustamante Diane K.
Coker Jackie
Colburn H. Steven
Delhorne Lorraine A.
DeRosier David J.
Dowdy Leonard C.
Downs Maralene M.
Durlach Nathaniel I.
Farrar Catherine L.
Florentine Mary S.
Foss Kristin K.
Freeman Dennis M.
Frishkopf Lawrence S.
Gabriel Kaigham J.
Gilbert Eric
Ito Yoshiko
Jain Manoj
Kiang Nelson Y-S.
Koehnke Janet A.
Leivy Sander J.
Leotta Daniel F.
Macmillan Neil A.
Oman Charles M.
Opalsky David
Peake William T.
Pemberton Joseph C.
Peterson Patrick M.
Posen Miles P.
Rabinowitz William M.
Reed Charlotte M.
Reid Jean P.
Reohr Richard D.
Rohlicek J. Robin
Russell Roy P., Jr.
Schultz Martin C.
Siebert William M.
Silletto Karen
Skarda Gregory M.
Tsuk Michael J.
Uchanski Rosalie M.
Waissman Roberto G.
Weiss Thomas F.
Zue Victor W.
Zurek Patrick M.
Publication venue: Research Laboratory of Electronics (RLE) at the Massachusetts Institute of Technology (MIT)
Publication date: 01/01/1984

Contains reports on seven research projects split into three sections. National Institutes of Health (Grant 5 PO1 NS13126); National Institutes of Health (Grant 1 RO1 NS18682); National Institutes of Health (Training Grant 5 T32 NS07047); National Science Foundation (Grant BNS77-16861); National Institutes of Health (Grant 1 F33 NS07202-01); National Institutes of Health (Grant 5 RO1 NS10916); National Institutes of Health (Grant 5 RO1 NS12846); National Institutes of Health (Grant 1 RO1 NS16917); National Institutes of Health (Grant 1 RO1 NS14092-05); National Science Foundation (Grant BNS 77 21751); National Institutes of Health (Grant 5 R01 NS11080); National Institutes of Health (Grant GM-21189

DSpace@MIT

Effects of Hearing Aid Amplification on Robust Neural Coding of Speech

Author: Boley Jonathan Daniel
Publication venue: 'Purdue University (bepress)'
Publication date: 01/01/2013

Hearing aids are able to restore some hearing abilities for people with auditory impairments, but background noise remains a significant problem. Unfortunately, we know very little about how speech is encoded in the auditory system, particularly in impaired systems with prosthetic amplifiers. There is growing evidence that relative timing in the neural signals (known as spatiotemporal coding) is important for speech perception, but there is little research that relates spatiotemporal coding and hearing aid amplification. This research uses a combination of computational modeling and physiological experiments to characterize how hearing aids affect vowel coding in noise at the level of the auditory nerve. The results indicate that sensorineural hearing impairment degrades the temporal cues transmitted from the ear to the brain. Two hearing aid strategies (linear gain and wide dynamic-range compression) were used to amplify the acoustic signal. Although appropriate gain was shown to improve temporal coding for individual auditory nerve fibers, neither strategy improved spatiotemporal cues. Previous work has attempted to correct the relative timing by adding frequency-dependent delays to the acoustic signal (e.g., within a hearing aid). We show that, although this strategy can affect the timing of auditory nerve responses, it is unlikely to improve the relative timing as intended. We have shown that existing hearing aid technologies do not improve some of the neural cues that we think are important for perception, but it is important to understand these limitations. Our hope is that this knowledge can be used to develop new technologies to improve auditory perception in difficult acoustic environments

Purdue E-Pubs