16 research outputs found

    Development and Assessment of Signal Processing Algorithms for Assistive Hearing Devices

    Speech identification in the presence of background noise is difficult for children with auditory processing disorder and adults with sensorineural hearing loss. The listening difficulty arises from deficits in their temporal, spectral, binaural, and/or cognitive processing. Given the lack of improvement with conventional assistive hearing devices, alternative speech processing methodologies that exaggerate the temporal and spectral cues need to be developed to improve speech intelligibility for individuals who have poor temporal and/or spectral processing. First, this thesis reports results from a series of experiments on subjective and objective assessments of two envelope enhancement schemes (dynamic and static) across different types and levels of background noise. The subjective results revealed that speech intelligibility scores are lower for children with auditory processing disorder than for children with normal hearing. They also demonstrated that enhancing the temporal envelope is much more beneficial for children with auditory processing disorder than for children with normal hearing. Comprehensive objective assessments, conducted by developing novel intrusive and non-intrusive objective speech intelligibility predictors, demonstrated that both dynamic and static envelope enhancement algorithms improve speech intelligibility only under certain processing conditions that depend on the type, level and location of the background noise. Furthermore, applying noise reduction algorithms prior to the envelope enhancement techniques increased their range of effectiveness.
Second, using the proposed objective predictors, a companding architecture (which enhances both temporal and spectral cues) is shown to be more effective than temporal envelope enhancement alone across different noisy environments in the presence of a noise reduction algorithm. Third, binaural dichotic processing is evaluated in stationary and non-stationary background noise through subjective experiments. The subjective results demonstrated that dichotic processing mainly improves speech intelligibility in stationary background noise at poor signal-to-noise ratios. It is also shown that incorporating a noise reduction algorithm as a front end to the dichotic processing does not increase its range of effectiveness, regardless of the type and level of the background noise.

    Biophysical modeling of a cochlear implant system: progress on closed-loop design using a novel patient-specific evaluation platform

    The modern cochlear implant is one of the most successful neural stimulation devices, and it partially mimics the workings of the auditory periphery. In the last few decades it has created a paradigm shift in hearing restoration for the deaf population, with more than 324,000 cochlear implant users today. Despite this success, patient outcomes vary widely, and the aetiology of this variance in implant performance is not clearly understood. Furthermore, speech recognition in adverse conditions and music appreciation are still not attainable with today's commercial technology. This motivates research into the next generation of cochlear implants, which takes advantage of recent developments in electronics, neuroscience, nanotechnology, micro-mechanics, polymer chemistry and molecular biology to deliver high-fidelity sound. The main difficulties in determining the root of the problem where a cochlear implant does not perform well are twofold: first, there is no clear paradigm for how electrical stimulation is perceived as sound by the brain, and second, there is limited understanding of the plasticity effects, or learning, of the brain in response to electrical stimulation. These significant knowledge limitations impede the design of novel cochlear implant technologies, as the technical specifications that can lead to better-performing implants remain undefined. The motivation of the work presented in this thesis is to compare and contrast cochlear implant neural stimulation with the operation of the physiologically healthy auditory periphery up to the level of the auditory nerve. The design of novel cochlear implant systems can thus become feasible by gaining insight into the question ‘how well does a specific cochlear implant system approximate the healthy auditory periphery?’, circumventing the need for a complete understanding of the brain's comprehension of patterned electrical stimulation delivered from a generic cochlear implant device.
A computational model, termed the Digital Cochlea Stimulation and Evaluation Tool (‘DiCoStET’), has been developed to provide an objective estimate of cochlear implant performance based on neuronal activation measures, such as vector strength and average activation. A patient-specific 3D cochlea geometry is generated using a model derived from a single anatomical measurement from a patient, obtained with non-invasive high-resolution computed tomography (HRCT), together with anatomically invariant human metrics and relations. Human measurements of the neuron route within the inner ear enable an innervation pattern to be modelled that spans the space from the organ of Corti to the spiral ganglion, subsequently descending into the auditory nerve bundle. An electrode is inserted in the cochlea at a depth determined by the user of the tool. The geometric relation between the stimulation sites on the electrode and the spiral ganglion is used to estimate an activating function that is unique to the specific patient's cochlear shape and electrode placement. This ‘transfer function’, so to speak, between electrode and spiral ganglion serves as a ‘digital patient’ for validating novel cochlear implant systems. The novel computational tool is intended for use by bioengineers, surgeons, audiologists and neuroscientists alike. In addition to DiCoStET, a second computational model is presented in this thesis, aimed at enhancing the understanding of the physiological mechanisms of hearing, specifically the workings of the auditory synapse. The purpose of this model is to provide insight into the sound encoding mechanisms of the synapse. A hypothetical mechanism is suggested in the release of neurotransmitter vesicles that permits the auditory synapse to encode temporal patterns of sound separately from sound intensity.
DiCoStET was used to examine the performance of two types of filter used for spectral analysis in the cochlear implant system: the Gammatone filter and the Butterworth filter. The model outputs suggest that the Gammatone filter performs better than the Butterworth filter. Furthermore, two stimulation strategies, Continuous Interleaved Stimulation (CIS) and Asynchronous Interleaved Stimulation (AIS), have been compared. The estimated spatiotemporal patterns of neuronal stimulation for each strategy suggest that the overall stimulation pattern is not greatly affected by the change in temporal sequence. However, the finer detail of neuronal activation differs between the two strategies, and, when compared with healthy neuronal activation patterns, the conjecture is made that the sequential stimulation of CIS hinders the transmission of sound fine-structure information to the brain. The two models developed make collaborative work feasible across various disciplines, especially electrical engineering, auditory physiology and neuroscience, for the development of novel cochlear implant systems. This is achieved by using the concept of a ‘digital patient’ whose artificial neuronal activation is compared to a healthy scenario in a computationally efficient manner that allows practical simulation times.
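
Vector strength, one of the neuronal activation measures named in this abstract, is conventionally defined as the magnitude of the mean unit phasor of the spike phases relative to a periodic stimulus. The Python sketch below implements only that textbook definition. It is not taken from DiCoStET, and the function name, the two synthetic spike trains and the 100 Hz stimulus are illustrative assumptions.

```python
import cmath
import math


def vector_strength(spike_times, stimulus_freq_hz):
    """Return the vector strength (0 to 1) of spike times given in seconds.

    Textbook definition: magnitude of the mean unit phasor of the spike
    phases relative to a periodic stimulus. Hypothetical helper name.
    """
    if not spike_times:
        raise ValueError("need at least one spike time")
    # Map each spike to a unit phasor at its stimulus phase, then average.
    phasors = [cmath.exp(2j * math.pi * stimulus_freq_hz * t) for t in spike_times]
    return abs(sum(phasors) / len(phasors))


# Assumed toy data: one spike per cycle of a 100 Hz tone, always at phase 0.
locked = [k / 100.0 for k in range(50)]
# Assumed toy data: spikes spread evenly over four phases of the cycle.
spread = [k / 100.0 + (k % 4) / 400.0 for k in range(48)]

vs_locked = vector_strength(locked, 100.0)
vs_spread = vector_strength(spread, 100.0)
```

Values near 1 indicate tight phase locking and values near 0 indicate no phase preference, which is why the measure is suited to comparing simulated implant-driven activation against a healthy reference.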

    Modeling and design of an active silicon cochlea

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references. Silicon cochleas are inspired by the biological cochlea and perform efficient spectrum analysis: They realize a bank of constant-Q Nth-order filters with O(N) efficiency rather than O(N²) efficiency due to their use of an exponentially tapered filter cascade. They are useful in speech-recognition front ends, cochlear implants, and hearing aids, especially as architectures for improving spectral analysis in noisy environments and for performing low-power spectrum analysis. In this thesis I describe four contributions towards improving the state-of-the-art in silicon-cochlea design, two of which involve theoretical modeling, and two of which involve integrated-circuit design. On the theoretical side, I first show that a simple rational approximation to distributed partition impedances in the biological cochlea captures its essential features and enables an efficient artificial implementation achieving maximum gain in a minimum number of stages while still maintaining stability. In particular, I show that the terminating impedance of the cochlea is crucial for its stability and discuss various analytic methods for termination. Second, I derive a novel composite artificial cochlear architecture composed of a cascade of all-pass second-order filters from a first-principles analysis of the biological cochlear transmission line. The novel all-pass architecture reduces phase lag and group delay in the silicon cochlea, a problem in prior designs, sharpens its high-frequency rolloff slopes, increases its frequency selectivity, and improves its nonlinear compression characteristics. On the circuit side, I first present a novel current-mode log-domain topology that simultaneously increases signal-to-noise ratio (SNR) and dynamic range while lowering power consumption in resonant filters with high quality factor Q.
The novel topology is validated in a second-order low-pass resonant filter, which is employed in the silicon cochlea, demonstrating a reduction in power consumption and increase in SNR by a factor of Q. When bias currents in the filter are adjusted as the signal level varies, this technique enables an improvement in maximum SNR by a factor of Q and an increase in maximum non-distorted signal power and dynamic range by a factor of Q⁎. Measurements from a chip in a 0.18-µm, 1.1-V CMOS technology achieve a quiescent power consumption of 580 nW at a 15-kHz center frequency with a maximum SNR of 41.3 dB and dynamic range of 76 dB for Q = 4. Finally, I describe a current-mode -stage 0.18-µm silicon cochlea that achieves 79 dB of dynamic range with 41 µW power consumption on a 1-V power supply over a usable 3.5 kHz to 14 kHz frequency range. These numbers represent an 18 dB improvement in dynamic range and a 12.5x reduction in power consumption over prior state-of-the-art silicon cochleas. By Serhii M. Zhak. Ph.D.
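
The O(N) versus O(N²) claim in this abstract can be illustrated with a toy cascade in Python. This is a sketch under stated assumptions, not the thesis design: each stage is a textbook second-order low-pass resonator (the thesis proposes an all-pass-based architecture), and the stage count of 16 and Q of 0.8 are invented for illustration. Only the 3.5 kHz to 14 kHz range is borrowed from the abstract.

```python
import cmath


def section_response(freq_hz, center_hz, q):
    """Frequency response of one second-order low-pass resonant section."""
    s = 1j * freq_hz / center_hz  # complex frequency normalised to the centre
    return 1.0 / (s * s + s / q + 1.0)


def tapered_centers(f_high_hz, f_low_hz, n_stages):
    """Centre frequencies falling exponentially from f_high_hz to f_low_hz."""
    ratio = (f_low_hz / f_high_hz) ** (1.0 / (n_stages - 1))
    return [f_high_hz * ratio ** k for k in range(n_stages)]


def cascade_taps(freq_hz, centers_hz, q):
    """Response at every tap, reusing one running product along the cascade."""
    taps = []
    running = 1.0 + 0j
    for center in centers_hz:
        running *= section_response(freq_hz, center, q)
        taps.append(running)
    return taps


N_STAGES = 16   # assumed stage count, not from the thesis
Q_FACTOR = 0.8  # assumed per-stage Q, not from the thesis

centers = tapered_centers(14000.0, 3500.0, N_STAGES)
taps_at_5k = cascade_taps(5000.0, centers, Q_FACTOR)

# Tap k sees k + 1 sections in series, so N taps share only N sections.
sections_in_cascade = N_STAGES
# Building each of those tap responses as its own independent filter instead.
sections_in_parallel_bank = N_STAGES * (N_STAGES + 1) // 2
```

Because every tap reuses the filtering already done by the stages before it, the hardware cost grows linearly with the number of outputs, whereas independent filters of matching order grow quadratically.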

    Speech coding at medium bit rates using analysis by synthesis techniques


    Iterative Separation of Note Events from Single-Channel Polyphonic Recordings

    This thesis is concerned with the separation of audio sources from single-channel polyphonic musical recordings using the iterative estimation and separation of note events. Each event is defined as a section of audio containing largely harmonic energy identified as coming from a single sound source. Multiple events can be clustered to form separated sources. This solution is a model-based algorithm that can be applied to a large variety of audio recordings without requiring previous training stages. The proposed system comprises two principal stages. The first performs the iterative detection and separation of note events from within the input mixture. In every iteration, the pitch trajectory of the predominant note event is automatically selected from an array of fundamental frequency estimates and used to guide the separation of the event's spectral content using two different methods: time-frequency masking and time-domain subtraction. A residual signal is then generated and used as the input mixture for the next iteration. After convergence, the second stage clusters all detected note events into individual audio sources. Performance evaluation is carried out at three different levels. Firstly, the accuracy of the note-event-based multipitch estimator is compared with that of the baseline algorithm used in every iteration to generate the initial set of pitch estimates. Secondly, the performance of the semi-supervised source separation process is compared with that of another semi-automatic algorithm. Finally, a listening test is conducted to assess the audio quality and naturalness of the separated sources when they are used to create stereo mixes from monaural recordings. Future directions for this research focus on the application of the proposed system to other music-related tasks.
Also, a preliminary optimisation-based approach is presented as an alternative method for the separation of overlapping partials, and as a high-resolution time-frequency representation for digital signals.
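
The estimate, subtract and repeat loop of the first stage can be sketched with a deliberately simplified Python toy. It is not the thesis algorithm: each "note event" is reduced to a single stationary sinusoid, the candidate frequencies stand in for the array of fundamental frequency estimates, and only the time-domain subtraction variant is shown. All names, the 8 kHz sample rate and the test mixture are illustrative assumptions, and the fit is exact only because every candidate completes a whole number of cycles in the analysis window.

```python
import math


def energy(signal):
    """Sum of squared samples."""
    return sum(x * x for x in signal)


def fit_sinusoid(signal, freq_hz, sample_rate):
    """Project the signal onto a cosine/sine pair at freq_hz and return the fit."""
    n = len(signal)
    cos_b = [math.cos(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]
    sin_b = [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]
    a = 2.0 / n * sum(x * c for x, c in zip(signal, cos_b))
    b = 2.0 / n * sum(x * s for x, s in zip(signal, sin_b))
    return [a * c + b * s for c, s in zip(cos_b, sin_b)]


def separate_events(mixture, candidates_hz, sample_rate, stop_ratio=1e-6, max_iter=10):
    """Repeatedly remove the predominant candidate from the residual."""
    residual = list(mixture)
    total = energy(mixture)
    events = []
    for _ in range(max_iter):
        if energy(residual) <= stop_ratio * total:
            break  # residual is negligible: treat as converged
        fits = [(f, fit_sinusoid(residual, f, sample_rate)) for f in candidates_hz]
        # The predominant event is the candidate explaining the most energy.
        freq, component = max(fits, key=lambda fc: energy(fc[1]))
        events.append((freq, component))
        # Time-domain subtraction: the residual feeds the next iteration.
        residual = [r - c for r, c in zip(residual, component)]
    return events, residual


SAMPLE_RATE = 8000  # assumed
N_SAMPLES = 800     # 0.1 s window, whole cycles for all candidates below


def tone(freq_hz, amp):
    return [amp * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(N_SAMPLES)]


mixture = [x + y + z for x, y, z in zip(tone(220.0, 1.0), tone(440.0, 0.6), tone(330.0, 0.3))]
events, residual = separate_events(mixture, [220.0, 330.0, 440.0, 550.0], SAMPLE_RATE)
extraction_order = [freq for freq, _ in events]
```

The extracted components would then be passed to a separate clustering step to form sources, which this toy does not attempt.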

    Beiträge zu breitbandigen Freisprechsystemen und ihrer Evaluation

    This work deals with the advancement of wideband hands-free systems (HFSs) for monophonic and stereophonic applications. Furthermore, innovative contributions to the corresponding field of quality evaluation are made. The proposed HFS approaches are based on frequency-domain adaptive filtering for system identification, making use of Kalman theory and state-space modeling. Functional enhancement modules are developed in this work, each of which improves one or more key quality aspects while aiming not to harm the others. These modules can therefore be combined flexibly, depending on the needs at hand. The enhanced monophonic HFS is evaluated according to automotive ITU-T recommendations to demonstrate its targeted efficacy. Furthermore, a novel methodology and technical framework are introduced to improve the prototyping and evaluation process of automotive hands-free and in-car-communication (ICC) systems. The monophonic HFS in several configurations acts as the device under test (DUT) and is thoroughly investigated, showing the DUT's satisfactory performance as well as the advantages of the proposed development process. As current methods for evaluating HFSs in dynamic conditions often still lack flexibility, reproducibility, and accuracy, this work introduces “Car in a Box” (CiaB) as a novel, improved system for this demanding task. It enhances the development process by performing high-resolution system identification of dynamic electro-acoustical systems. The extracted dynamic impulse response trajectories can then be applied to arbitrary input signals in a synthesis operation. A realistic dynamic automotive auralization of a car cabin interior is thus available for HFS evaluation. It is shown that this system improves evaluation flexibility at guaranteed reproducibility. In addition, the accuracy of evaluation methods can be increased by having access to exact, realistic impulse response trajectories acting as a so-called “ground truth” reference. If CiaB is included in an automotive evaluation setup, there is no need for an acoustical car interior prototype to be present at this stage of development. Hence, CiaB may ease the HFS development process. Dynamic acoustic replicas of an arbitrary number of car cabin interiors may be provided to multiple developers simultaneously. With CiaB, developers of speech enhancement systems therefore have an evaluation environment at hand that can adequately replace the real environment.
This work deals with the further development of wideband hands-free systems for monophonic and stereophonic applications and makes innovative contributions to measuring their quality. The methods presented are based on algorithms that adapt in the frequency domain for system identification according to Kalman theory in a state-space representation. Functional extension modules are developed such that at least one quality requirement is improved without blatantly violating others. These algorithmic extensions, which can be combined flexibly as required, are tested according to ITU-T recommendations (Rec. P.1110/P.1130) in predominantly automotive test scenarios, confirming their targeted efficacy. A collection of methods and a technical system for improved prototyping and evaluation of automotive hands-free and in-car communication systems are presented and applied, by way of example, to the monophonic hands-free system at various stages of expansion. The resulting advantages in the development and test process of speech enhancement systems are set out and verified by measurement. Existing measurement methods for the behavior of hands-free systems in time-variant environments have so far often shown only insufficient flexibility, reproducibility and accuracy.
Therefore, the “Car in a Box” (CiaB) method is developed and presented here, with which time-variant electro-acoustic systems can be technically identified. Dynamic impulse responses obtained in this way can be applied in the laboratory to arbitrary input signals in a synthesis operation in order to generate realistic test signals under dynamic conditions. This procedure achieves a high degree of flexibility with guaranteed reproducibility. It is shown that the accuracy of evaluation methods based on it can also be increased, since the availability of exact, real impulse responses at every point in time of the measurement provides a so-called “ground truth” as a reference. When CiaB is integrated into a measurement setup for automotive hands-free systems, it is significant that the actual vehicle is no longer needed at that point. It is shown that a dynamic vehicle acoustic environment, as required in the development process of automotive speech enhancement algorithms, can be replaced by CiaB in any number, completely and at least equivalently.

    Queensland University of Technology: Handbook 1994

    The Queensland University of Technology handbook outlines the faculties and subject offerings available at QUT.