
    Determination and evaluation of clinically efficient stopping criteria for the multiple auditory steady-state response technique

    Background: Although the auditory steady-state response (ASSR) technique uses objective statistical detection algorithms to estimate behavioural hearing thresholds, the audiologist still has to decide when to terminate ASSR recordings, which reintroduces a degree of subjectivity. Aims: The present study aimed at establishing clinically efficient stopping criteria for a multiple 80-Hz ASSR system. Methods: In Experiment 1, data from 31 normal-hearing subjects were analyzed off-line to propose stopping rules. Accordingly, ASSR recordings are stopped when (1) all 8 responses reach significance and remain significant for 8 consecutive sweeps; (2) the mean noise level is ≤ 4 nV (if, at this “≤ 4-nV” criterion, p-values lie between 0.05 and 0.1, the measurement is extended once by 8 sweeps); or (3) a maximum of 48 sweeps is reached. In Experiment 2, these stopping criteria were applied to 10 normal-hearing and 10 hearing-impaired adults to assess their efficiency. Results: Applying these stopping rules yielded ASSR thresholds comparable to other multiple-ASSR research with normal-hearing and hearing-impaired adults. Furthermore, in 80% of the cases, ASSR thresholds could be obtained within a time frame of 1 hour. Examining the significant response amplitudes of the hearing-impaired adults through cumulative curves indicated that a noise-stop criterion higher than “≤ 4 nV” can probably be used. Conclusions: The proposed stopping rules can be used in adults to determine accurate ASSR thresholds within an acceptable time frame of about 1 hour. However, additional research with infants and with adults having varying degrees and configurations of hearing loss is needed to optimize these criteria.
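    A minimal sketch of this stopping logic, in Python, may make the three rules concrete. Only the numeric criteria (significance held for 8 consecutive sweeps, a ≤ 4-nV mean-noise stop with a one-time 8-sweep extension for borderline p-values, and a 48-sweep ceiling) come from the abstract; the function name `should_stop` and the data layout are hypothetical.

```python
# Illustrative only: the constants restate the three rules from the abstract,
# while the function name and data layout are hypothetical.
ALPHA = 0.05           # significance level for the 8 response detections
CONSEC_SWEEPS = 8      # rule 1: all responses significant for 8 consecutive sweeps
NOISE_STOP_NV = 4.0    # rule 2: mean residual EEG noise criterion, in nV
BORDERLINE_P = 0.1     # rule 2: upper bound of the "extend once" p-value band
MAX_SWEEPS = 48        # rule 3: hard ceiling on the number of sweeps
EXTENSION = 8          # length of the one-time extension, in sweeps


def should_stop(p_values_per_sweep, mean_noise_nv, already_extended):
    """Decide, after the latest sweep, whether to stop or to extend once.

    p_values_per_sweep : one list of 8 p-values per recorded sweep
    mean_noise_nv      : mean residual noise level after the latest sweep (nV)
    already_extended   : True if the one-time 8-sweep extension was already used
    Returns (stop, request_extension).
    """
    n_sweeps = len(p_values_per_sweep)

    # Rule 1: all 8 responses significant for 8 consecutive sweeps.
    recent = p_values_per_sweep[-CONSEC_SWEEPS:]
    if len(recent) == CONSEC_SWEEPS and all(
        all(p < ALPHA for p in sweep) for sweep in recent
    ):
        return True, False

    # Rule 2: noise floor reached. If any p-value is borderline (0.05-0.1),
    # request the one-time 8-sweep extension instead of stopping.
    if mean_noise_nv <= NOISE_STOP_NV:
        borderline = any(ALPHA <= p < BORDERLINE_P for p in p_values_per_sweep[-1])
        if borderline and not already_extended:
            return False, True
        return True, False

    # Rule 3: maximum recording length reached.
    return n_sweeps >= MAX_SWEEPS, False
```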

    Effects of noise suppression and envelope dynamic range compression on the intelligibility of vocoded sentences for a tonal language

    Vocoder simulation studies have suggested that the carrier signal type affects the intelligibility of vocoded speech. The present work further assessed how carrier signal type interacts with additional signal processing, namely single-channel noise suppression and envelope dynamic range compression, in determining the intelligibility of vocoder simulations. In Experiment 1, Mandarin sentences that had been corrupted by speech-spectrum-shaped noise (SSN) or two-talker babble (2TB) were processed by one of four single-channel noise-suppression algorithms before undergoing tone-vocoded (TV) or noise-vocoded (NV) processing. In Experiment 2, the dynamic ranges of multiband envelope waveforms were compressed by scaling the mean-removed envelope waveforms with a compression factor before TV or NV processing. TV Mandarin sentences yielded higher intelligibility scores with normal-hearing (NH) listeners than did NV sentences. The intelligibility advantage of noise-suppressed vocoded speech depended on the masker type (SSN vs. 2TB). NV speech was more negatively affected by envelope dynamic range compression than was TV speech. These findings suggest that the carrier signal type employed in the vocoding process interacts with the envelope distortion introduced by signal processing.
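    The compression manipulation in Experiment 2 amounts to shrinking each channel's envelope excursions around its mean. The sketch below shows that operation under the assumption that per-channel envelopes have already been extracted by the vocoder front end; the function name `compress_envelope` and the example values are illustrative, not the authors' code.

```python
import numpy as np


def compress_envelope(envelope, compression_factor):
    """Scale the mean-removed envelope to reduce its dynamic range.

    compression_factor < 1 shrinks excursions around the mean level;
    a factor of 1.0 leaves the envelope unchanged.
    """
    mean_level = np.mean(envelope)
    return mean_level + compression_factor * (envelope - mean_level)


# Example with a stand-in envelope for one analysis channel: a factor of 0.5
# roughly halves the peak-to-trough range.
envelope = np.abs(np.random.randn(1000))
compressed = compress_envelope(envelope, 0.5)
print(np.ptp(envelope), np.ptp(compressed))
```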

    Coding Strategies for Cochlear Implants Under Adverse Environments

    Cochlear implants are electronic prosthetic devices that restore partial hearing in patients with severe to profound hearing loss. Although most coding strategies have significantly improved the perception of speech in quiet listening conditions, limitations remain under adverse conditions such as background noise, reverberation, and band-limited channels. We propose strategies that improve the intelligibility of speech transmitted over telephone networks, reverberant speech, and speech in the presence of background noise. For telephone-processed speech, we propose to examine the effects of adding low-frequency and high-frequency information to band-limited telephone speech. Four listening conditions were designed to simulate the receiving frequency characteristics of telephone handsets. Results indicated improvement in cochlear implant and bimodal listening when telephone speech was augmented with high-frequency information; this study therefore provides support for the design of algorithms that extend the bandwidth towards higher frequencies. The results also indicated added benefit from hearing aids for bimodal listeners in all four listening conditions. Speech understanding in acoustically reverberant environments is always a difficult task for hearing-impaired listeners. Reverberated sound consists of the direct sound, early reflections, and late reflections, and the late reflections are known to be detrimental to speech intelligibility. In this study, we propose a reverberation suppression strategy based on spectral subtraction (SS) to suppress the reverberant energy contributed by late reflections. Results from listening tests for two reverberant conditions (RT60 = 0.3 s and 1.0 s) indicated significant improvement when stimuli were processed with the SS strategy. The proposed strategy operates with little to no prior information on the signal or the room characteristics and can therefore potentially be implemented in real-time CI speech processors. For speech in background noise, we propose a mechanism underlying the contribution of harmonics to the benefit of electroacoustic stimulation in cochlear implants. The proposed strategy is based on harmonic modeling and uses a synthesis-driven approach to synthesize the harmonics in voiced segments of speech. Based on objective measures, results indicated improvement in speech quality. This study warrants further work on algorithms to regenerate the harmonics of voiced segments in the presence of noise.
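    As a rough illustration of the spectral subtraction idea named above, the sketch below subtracts a crude estimate of late-reflection power (a delayed, scaled copy of the observed short-time power) from each STFT frame and applies a spectral floor. The estimator, parameter values, and function name are assumptions made for illustration; the study's actual late-reflection estimator is not described in the abstract.

```python
import numpy as np
from scipy.signal import stft, istft


def suppress_late_reflections(x, fs, frame_delay=6, scale=0.5, floor=0.05):
    """Attenuate an estimate of late-reflection energy via spectral subtraction."""
    f, t, X = stft(x, fs=fs, nperseg=512)
    power = np.abs(X) ** 2

    # Crude estimate of late-reverberant power: a delayed, scaled copy of the
    # observed power in each frequency bin (illustrative assumption).
    late = np.zeros_like(power)
    late[:, frame_delay:] = scale * power[:, :-frame_delay]

    # Subtract the estimate and apply a spectral floor to limit musical noise.
    clean_power = np.maximum(power - late, floor * power)
    gain = np.sqrt(clean_power / np.maximum(power, 1e-12))

    _, x_clean = istft(gain * X, fs=fs, nperseg=512)
    return x_clean
```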

    A psychoacoustic "NofM"-type speech coding strategy for cochlear implants

    We describe a new signal-processing technique for cochlear implants that uses a psychoacoustic masking model. The technique is based on the principle of a so-called "NofM" strategy: such strategies stimulate fewer channels (N) per cycle than there are active electrodes (M), with N < M. In "NofM" strategies such as ACE or SPEAK, only the N channels with the highest amplitudes are stimulated. The new strategy is based on the ACE strategy but uses a psychoacoustic masking model to determine the essential components of any given audio signal. The new strategy was tested on device users in an acute study, with either 4 or 8 channels stimulated per cycle. For the first condition (4 channels), the mean improvement over the ACE strategy was 17%. For the second condition (8 channels), no significant difference was found between the two strategies.
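    The sketch below contrasts plain amplitude-based NofM selection (as in ACE) with a masking-informed variant. The masked-threshold computation here is a deliberately crude placeholder (each channel masks its neighbours with an exponentially decaying spread); the paper's psychoacoustic model is more elaborate, so treat this as a conceptual illustration only.

```python
import numpy as np


def select_channels_ace(amplitudes, n):
    """ACE-like NofM: pick the N channels with the highest envelope amplitudes."""
    return np.argsort(amplitudes)[-n:]


def select_channels_masking(amplitudes, n, spread=0.3):
    """Pick the N channels lying furthest above a crude masked-threshold estimate."""
    masked = np.zeros_like(amplitudes)
    for i, level in enumerate(amplitudes):
        for j in range(len(amplitudes)):
            if j != i:
                # Each channel masks its neighbours with exponentially decaying spread.
                masked[j] = max(masked[j], level * spread ** abs(i - j))
    salience = amplitudes - masked
    return np.argsort(salience)[-n:]


amps = np.array([0.1, 0.9, 0.55, 0.1, 0.6, 0.1, 0.5, 0.45])
print(sorted(select_channels_ace(amps, 4).tolist()))      # -> [1, 2, 4, 6]
# The masking-informed rule drops channel 2 (largely masked by its strong
# neighbour, channel 1) in favour of channel 7.
print(sorted(select_channels_masking(amps, 4).tolist()))  # -> [1, 4, 6, 7]
```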

    EFFECTS OF AGING ON VOICE-PITCH PROCESSING: THE ROLE OF SPECTRAL AND TEMPORAL CUES

    Declines in auditory temporal processing are a common consequence of natural aging. Interactions between aging and spectro-temporal pitch processing have yet to be thoroughly investigated in humans, though recent neurophysiologic and electrophysiologic data support the notion that periodicity coding using only unresolved harmonics (i.e., those available via the temporal envelope) is negatively affected by age. Individuals with cochlear implants (CIs) must rely on the temporal envelope of speech to glean information about voice pitch [coded through the fundamental frequency (f0)], as spectral f0 cues are not available. While cochlear implants have been shown to be efficacious in older adults, it is hypothesized that older listeners would experience difficulty perceiving spectrally degraded voice-pitch information. The current experiments aimed to quantify the ability of younger and older listeners to utilize spectro-temporal cues to obtain voice-pitch information when performing simple and complex auditory tasks. Experiment 1 measured the ability of younger and older normal-hearing (NH) listeners to perceive a difference in the modulation frequency of amplitude-modulated broadband noise, thereby exploiting only temporal envelope cues. Experiment 2 measured age-related differences in f0 difference limens as the degree of spectral degradation was manipulated to approximate CI processing. Results from Experiments 1 and 2 demonstrated that spectro-temporal processing of f0 information in non-speech stimuli is affected in older adults. Experiment 3 showed that the age-related differences observed in Experiments 1 and 2 translated to voice gender identification with a natural speech stimulus. Experiment 4 attempted to estimate how younger and older NH listeners utilize differences in voice-pitch information in everyday listening environments (i.e., speech in noise) and how such abilities are affected by spectral degradation. Taken together, the results provide further insight into pitch coding in both normal and impaired auditory systems and demonstrate that spectro-temporal pitch processing depends on the age of the listener. The results could have important implications for elderly cochlear implant recipients.
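    For Experiment 1, the stimulus construction can be sketched as broadband noise whose amplitude is sinusoidally modulated at a chosen rate, so that periodicity information is carried only by the temporal envelope. The parameter values below (sampling rate, modulation depth, duration) are illustrative, not those used in the study.

```python
import numpy as np


def am_noise(mod_rate_hz, duration_s=0.5, fs=44100, mod_depth=1.0, seed=0):
    """Generate sinusoidally amplitude-modulated broadband noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(duration_s * fs)) / fs
    carrier = rng.standard_normal(t.size)                 # broadband noise carrier
    envelope = 1.0 + mod_depth * np.sin(2 * np.pi * mod_rate_hz * t)
    stimulus = carrier * envelope
    return stimulus / np.max(np.abs(stimulus))            # normalise to avoid clipping


# Two intervals differing only in modulation rate: a listener must rely on
# envelope periodicity, not spectral cues, to tell them apart.
standard = am_noise(100.0)
comparison = am_noise(110.0)
```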

    Temporal Filterbanks in Cochlear Implant Hearing and Deep Learning Simulations

    The masking phenomenon has been used to investigate cochlear excitation patterns and has even motivated audio coding formats for compression and speech processing. For example, cochlear implants rely on masking estimates to filter incoming sound signals onto an electrode array. Historically, critical band theory has been the mainstay of psychoacoustic theory. However, masked threshold shifts in cochlear implant users are at odds with the observed critical bandwidths, suggesting separate roles for place of stimulation and temporal firing patterns. In this chapter, we compare discrimination tasks in the spectral domain (e.g., power spectrum models) and the temporal domain (e.g., temporal envelope) to introduce new concepts such as profile analysis, temporal critical bands, and transition bandwidths. These recent findings violate the fundamental assumptions of critical band theory and could explain why the masking curves of cochlear implant users display spatial and temporal characteristics quite unlike those of acoustic stimulation. To provide further insight, we also describe a novel analytic tool based on deep neural networks. This deep learning system can simulate many aspects of the auditory system and will be used to compute the efficiency of spectral filterbanks (referred to as "FBANK") and temporal filterbanks (referred to as "TBANK").
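    As one concrete reading of a spectral filterbank front end, the sketch below computes log mel-filterbank energies, the representation commonly called "FBANK" in speech processing. Whether this matches the chapter's exact FBANK definition is an assumption, and the temporal "TBANK" counterpart is specific to the chapter and not reproduced here.

```python
import numpy as np
import librosa


def fbank_features(y, sr, n_mels=40, n_fft=512, hop_length=160):
    """Compute log mel-filterbank energies, a common DNN input representation."""
    mel_power = librosa.feature.melspectrogram(
        y=y, sr=sr, n_fft=n_fft, hop_length=hop_length, n_mels=n_mels
    )
    return np.log(mel_power + 1e-10)   # shape: (n_mels, n_frames)


# Example: one second of noise at 16 kHz yields roughly 100 frames of 40 log energies.
y = np.random.randn(16000).astype(np.float32)
print(fbank_features(y, sr=16000).shape)
```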

    Physiology, Psychoacoustics and Cognition in Normal and Impaired Hearing

    otorhinolaryngology; neurosciences; hearing

    Using blind source separation techniques to improve speech recognition in bilateral cochlear implant patients

    This is the published version, also available here: http://dx.doi.org/10.1121/1.2839887. Bilateral cochlear implants seek to restore the advantages of binaural hearing by improving access to binaural cues. Bilateral implant users are currently fitted with two processors, one in each ear, operating independently of one another. In this work, a different approach to bilateral processing is explored, based on blind source separation (BSS) and using two implants driven by a single processor. Sentences corrupted by interfering speech or speech-shaped noise are presented to bilateral cochlear implant users at a 0 dB signal-to-noise ratio in order to evaluate the performance of the proposed BSS method. Subjects are tested in both anechoic and reverberant settings in which the target and masker signals are spatially separated. Results indicate substantial improvements in performance over the subjects' daily strategies in both anechoic and reverberant settings, for both masker conditions, and at various masker locations. It is speculated that these improvements arise because the proposed BSS algorithm capitalizes on the variations in interaural level differences and interaural time delays present in the mixtures received by the two microphones and exploits that information to spatially separate the target from the masker signals.
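    As a generic illustration of blind source separation on a two-microphone mixture, the sketch below applies instantaneous ICA (scikit-learn's FastICA) to a synthetic stereo recording. The study's algorithm operates on reverberant, convolutive mixtures and exploits interaural level and time differences, which this simple instantaneous model does not capture; all signals and the mixing matrix here are made up for illustration.

```python
import numpy as np
from sklearn.decomposition import FastICA

fs = 16000
t = np.arange(fs) / fs
target = np.sin(2 * np.pi * 220 * t)            # stand-in for the target talker
masker = np.sign(np.sin(2 * np.pi * 313 * t))   # stand-in for the interfering signal

# Simulate level differences between the left and right microphones with an
# instantaneous (non-convolutive) mixing matrix.
mixing = np.array([[1.0, 0.4],
                   [0.3, 1.0]])
mics = np.stack([target, masker], axis=1) @ mixing.T   # shape: (n_samples, 2)

# Recover statistically independent source estimates from the two-channel mixture.
ica = FastICA(n_components=2, random_state=0)
estimated_sources = ica.fit_transform(mics)            # shape: (n_samples, 2)
```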