327 research outputs found
Effects of noise suppression and envelope dynamic range compression on the intelligibility of vocoded sentences for a tonal language
Vocoder simulation studies have suggested that the carrier signal type employed affects the intelligibility of vocoded speech. The present work further assessed how carrier signal type interacts with additional signal processing, namely, single-channel noise suppression and envelope dynamic range compression, in determining the intelligibility of vocoder simulations. In Experiment 1, Mandarin sentences that had been corrupted by speech spectrum-shaped noise (SSN) or two-talker babble (2TB) were processed by one of four single-channel noise-suppression algorithms before undergoing tone-vocoded (TV) or noise-vocoded (NV) processing. In Experiment 2, dynamic ranges of multiband envelope waveforms were compressed by scaling of the mean-removed envelope waveforms with a compression factor before undergoing TV or NV processing. TV Mandarin sentences yielded higher intelligibility scores with normal-hearing (NH) listeners than did noise-vocoded sentences. The intelligibility advantage of noise-suppressed vocoded speech depended on the masker type (SSN vs 2TB). NV speech was more negatively influenced by envelope dynamic range compression than was TV speech. These findings suggest that an interactional effect exists between the carrier signal type employed in the vocoding process and envelope distortion caused by signal processing
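The envelope compression described for Experiment 2 (scaling the mean-removed envelope by a compression factor) can be illustrated with a minimal sketch. The function names, the toy 4 Hz envelope, and the clipping of negative values are assumptions made here for illustration; the abstract does not give the study's actual parameters or implementation.

```python
import numpy as np


def compress_envelope(envelope, factor):
    """Scale the mean-removed envelope by `factor`, then add the mean back.

    factor = 1 leaves the envelope unchanged; factor < 1 narrows its
    dynamic range around the mean.
    """
    envelope = np.asarray(envelope, dtype=float)
    mean = envelope.mean()
    compressed = mean + factor * (envelope - mean)
    # Assumption: envelopes are kept non-negative by clipping. The abstract
    # does not say how (or whether) negative values were handled.
    return np.maximum(compressed, 0.0)


def dynamic_range_db(envelope, floor=1e-12):
    """Ratio of the largest to the smallest envelope value, in dB."""
    envelope = np.asarray(envelope, dtype=float)
    return 20.0 * np.log10(max(envelope.max(), floor) / max(envelope.min(), floor))


# Toy band envelope: a 4 Hz modulation around a mean of 1.0 (hypothetical data).
t = np.linspace(0.0, 1.0, 1000, endpoint=False)
env = 1.0 + 0.8 * np.sin(2 * np.pi * 4 * t)
half = compress_envelope(env, 0.5)

print(f"original dynamic range:   {dynamic_range_db(env):.1f} dB")
print(f"compressed dynamic range: {dynamic_range_db(half):.1f} dB")
```

In a multiband vocoder the same operation would be applied to each band envelope before it modulates the tone or noise carrier.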
Improving the Speech Intelligibility By Cochlear Implant Users
In this thesis, we focus on improving the intelligibility of speech for cochlear implant (CI) users. As an auditory prosthetic device, a CI can restore hearing sensations in a quiet background for most patients with profound hearing loss in both ears. However, CI users still have serious problems understanding speech in noisy and reverberant environments. Bandwidth limitation, missing temporal fine structure, and reduced spectral resolution due to a limited number of electrodes further raise the difficulty of hearing in noisy conditions for CI users, regardless of the type of noise. To mitigate these difficulties for CI listeners, we investigate several contributing factors: the effects of low harmonics on tone identification in natural and vocoded speech, the contribution of matched envelope dynamic range to binaural benefits, and the contribution of low-frequency harmonics to tone identification in quiet and in a six-talker babble background. These results revealed several promising methods for improving speech intelligibility for CI patients. In addition, we investigate the benefits of voice conversion in improving speech intelligibility for CI users, motivated by an earlier study showing that familiarity with a talker's voice can improve understanding of the conversation. Research has shown that when adults are familiar with someone's voice, they can process and understand what the person is saying more accurately, and even more quickly. This effect, known as the "familiar talker advantage", motivated us to examine it in CI patients using a voice conversion technique. In the present research, we propose a new method based on multi-channel voice conversion to improve the intelligibility of transformed speech for CI patients
Modeling Pitch Perception With an Active Auditory Model Extended by Octopus Cells
Pitch is an essential category for musical sensations. Models of pitch perception are still actively debated. Most of them rely on mathematical methods defined in the spectral or temporal domain. Our proposed pitch perception model is composed of an active auditory model extended by octopus cells. The active auditory model is the same as the one used in Stimulation based on Auditory Modeling (SAM), a successful cochlear implant sound processing strategy. It is extended here by modeling the functional behavior of the octopus cells in the ventral cochlear nucleus and their connections to the auditory nerve fibers (ANFs). The neurophysiological parameterization of the extended model is fully described in the time domain. The model is based on latency-phase encoding and decoding, as octopus cells are latency-phase rectifiers in their local receptive fields. Pitch is ubiquitously represented by cascaded firing sweeps of octopus cells. Based on the firing patterns of octopus cells, inter-spike interval histograms can be aggregated, in which the place of the global maximum is assumed to encode the pitch
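The last step, reading pitch from the global maximum of an aggregated inter-spike interval histogram, can be sketched as follows. This is only an illustration under stated assumptions: the spike trains are synthetic stand-ins rather than output of the octopus-cell model, first-order intervals are pooled across trains, and the function name, bin count, and interval limit are hypothetical choices.

```python
import numpy as np


def pitch_from_spike_trains(spike_trains, max_interval=0.02, n_bins=200):
    """Estimate pitch (Hz) from the peak of a pooled inter-spike interval histogram.

    spike_trains: iterable of 1-D sequences of spike times in seconds.
    Assumption: first-order intervals (differences between consecutive
    spikes) are pooled across all trains.
    """
    intervals = np.concatenate(
        [np.diff(np.sort(np.asarray(train, dtype=float))) for train in spike_trains]
    )
    intervals = intervals[(intervals > 0) & (intervals <= max_interval)]
    if intervals.size == 0:
        raise ValueError("no inter-spike intervals within the analysis range")

    counts, edges = np.histogram(intervals, bins=n_bins, range=(0.0, max_interval))
    peak = int(np.argmax(counts))  # position of the global maximum
    best_interval = 0.5 * (edges[peak] + edges[peak + 1])  # bin centre in seconds
    return 1.0 / best_interval


# Synthetic stand-in: three periodic spike trains with different onsets.
period = 0.00505  # seconds, i.e. roughly 198 Hz
trains = [offset + period * np.arange(40) for offset in (0.0, 0.001, 0.0023)]
estimate = pitch_from_spike_trains(trains)
print(f"estimated pitch: {estimate:.1f} Hz")
```

The resolution of the estimate is bounded by the histogram bin width, so a finer grid or interpolation around the peak would be needed for small pitch differences.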
Electrode stimulation strategy for cochlear implants based on a human auditory model
Cochlear implants (CIs) combined with professional rehabilitation have
enabled several hundreds of thousands of hearing-impaired individuals to
re-enter the world of verbal communication. Though very successful, current
CI systems seem to have reached their peak potential. The fact that most
recipients claim not to enjoy listening to music and are not capable of
carrying on a conversation in noisy or reverberant environments shows
that there is still room for improvement. This dissertation presents a new
cochlear implant signal processing strategy called Stimulation based on
Auditory Modeling (SAM), which is completely based on a computational model
of the human peripheral auditory system. SAM has been evaluated through
simplified models of CI listeners, with five cochlear implant users, and
with 27 normal-hearing subjects using an acoustic model of CI perception.
Results have always been compared to those acquired using Advanced
Combination Encoder (ACE), which is today's most prevalent CI strategy.
First simulations showed that speech intelligibility of CI users fitted
with SAM should be just as good as that of CI listeners fitted with ACE.
Furthermore, it has been shown that SAM provides more accurate binaural
cues, which can potentially enhance the sound source localization ability
of bilaterally fitted implantees. Simulations have also revealed an
increased amount of temporal pitch information provided by SAM. The
subsequent pilot study, which ran smoothly, revealed several benefits of
using SAM. First, there was a significant improvement in pitch
discrimination of pure tones and sung vowels. Second, CI users fitted with
a contralateral hearing aid reported a more natural sound of both speech
and music. Third, all subjects became accustomed to SAM in a very short
period of time (on the order of 10 to 30 minutes), which is particularly
important given that a successful CI strategy change typically takes weeks
to months. An additional test with 27 normal-hearing listeners using an
acoustic model of CI perception delivered further evidence for improved
pitch discrimination ability with SAM as compared to ACE. Although SAM is
not yet a market-ready alternative, it strives to pave the way for future
strategies based on auditory models and it is a promising candidate for
further research and investigation
Computer-based musical interval training program for Cochlear implant users and listeners with no known hearing loss
A musical interval is the difference in pitch between two sounds. The way that musical intervals are used in melodies relative to the tonal center of a key can strongly affect the emotion conveyed by the melody. The present study examines musical interval identification in people with no known hearing loss and in cochlear implant users. Pitch resolution varies widely among cochlear implant users with average resolution an order of magnitude worse than in normal hearing. The present study considers the effect of training on musical interval identification and tests for correlations between low-level psychophysics and higher-level musical abilities. The overarching hypothesis is that cochlear implant users are limited in their ability to identify musical intervals both by low-level access to frequency cues for pitch as well as higher-level mapping of the novel encoding of pitch that implants provide. Participants completed a 2-week, online interval identification training. The benchmark tests considered before and after interval identification training were pure tone detection thresholds, pure tone frequency discrimination, fundamental frequency discrimination, tonal and rhythm comparisons, and interval identification. The results indicate strong correlations between measures of pitch resolution with interval identification; however, only a small effect of training on interval identification was observed for the cochlear implant users. Discussion focuses on improving access to pitch cues for cochlear implant users and on improving auditory training for musical intervals
The perception and production of stress and intonation by children with cochlear implants
Users of current cochlear implants have limited access to pitch information and hence to intonation in speech. This seems likely to have an important impact on prosodic
perception. This thesis examines the perception and production of the prosody of stress in children with cochlear implants. The interdependence of perceptual cues to
stress (pitch, timing and loudness) in English is well documented and each of these is considered in analyses of both perception and production. The subject group
comprised 17 implanted (CI) children aged 5;7 to 16;11 and using ACE or SPEAK processing strategies. The aims are to establish:
(i) the extent to which stress and intonation are conveyed to CI children in synthesised bisyllables (BAba vs. baBA) involving controlled changes in F0, duration and amplitude (Experiment I), and in natural speech involving compound vs. phrase stress and focus (Experiment II);
(ii) whether, when pitch cues are missing or inaudible to the listeners, other cues such as loudness or timing contribute to the perception of stress and intonation;
(iii) whether CI subjects make appropriate use of F0, duration and amplitude to convey linguistic focus in speech production (Experiment III).
Results of Experiment I showed that seven of the subjects were unable to reliably hear pitch differences of 0.84 octaves. Most of the remaining subjects required a large
(approx 0.5 octave) difference to reliably hear a pitch change. Performance of the CI children was poorer than that of a normal hearing group of children presented with an
acoustic cochlear implant simulation. Some of the CI children who could not discriminate F0 differences in Experiment I nevertheless scored above chance in tests
involving focus in natural speech in Experiment II. Similarly, some CI subjects who were above chance in the production of appropriate F0 contours in Experiment III
could not hear F0 differences of 0.84 octaves. These results suggest that CI children may not necessarily rely on F0 cues to stress, and in the absence of F0 or amplitude
cues, duration may provide an alternative cue
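To put the reported pitch differences in more familiar terms, the octave values can be converted to frequency ratios and semitones. The sketch below is plain arithmetic; the 200 Hz reference F0 is an assumption chosen only for illustration and does not come from the study.

```python
def octaves_to_ratio(octaves):
    """Frequency ratio corresponding to a pitch difference in octaves."""
    return 2.0 ** octaves


def octaves_to_semitones(octaves):
    """Equal-tempered semitones: 12 per octave."""
    return 12.0 * octaves


reference_f0 = 200.0  # Hz, hypothetical reference voice pitch

for octaves in (0.84, 0.5):
    ratio = octaves_to_ratio(octaves)
    print(
        f"{octaves} octaves = {octaves_to_semitones(octaves):.2f} semitones, "
        f"ratio {ratio:.2f}, e.g. {reference_f0:.0f} Hz -> {reference_f0 * ratio:.0f} Hz"
    )
```

A difference of 0.84 octaves is a little over ten semitones, a frequency ratio of about 1.79, so the subjects who could not hear it were missing a change of close to a full octave.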
Neural Correlates of Binaural Interaction Using Aggregate-System Stimulation in Cochlear Implantees
The importance of binaural cues in auditory stream formation and sound source differentiation is widely accepted. When one ear is treated with a cochlear implant (CI), the peripheral auditory system is partially replaced and processing delays may be added, so important interaural time encoding is altered. This is a crucial problem because factors like the interaural time delay between the receiving ears are known to be responsible for facilitating such cues, e.g., sound source localization and separation. However, these effects are not fully understood, leaving a lack of systematic binaural fitting strategies with respect to an optimal binaural fusion.
To gain new insights into such alterations, we suggest a novel method of free-field evoked auditory brainstem response (ABR) analysis in CI users. This method does not bypass the technically induced intrinsic delays of the hearing device and leaves the complete electrode array active; it thus provides the most natural way of stimulation and facilitates comparable testing of real-world stimuli. Unfortunately, ABRs acquired in CI users are additionally affected by the prominent artifact caused by their electrical stimulation, which severely distorts the desired neural response and complicates their analysis. To circumvent this problem, we further introduce a novel narrowband filtering CI artifact removal technique capable of obtaining neural correlates of ABRs in CI users. Consequently, we were able to compare brainstem-level responses collected from 12 CI users and 12 normal hearing listeners using two different stimuli (i.e., chirp and click) at four different intensities each, which constitutes an adaptation of the prominent brainstem evoked response audiometry and serves as an additional evaluation criterion. We analyzed the responses using the average of 2,000 trials in combination with synchronized regularities across them and found consistent results in their deflections and latencies, as well as in single trial relationships between both groups. This method provides a novel and unique perspective on CI users' natural brainstem-level responses and can be useful in future research regarding binaural interaction and fusion. Furthermore, the binaural interaction component (BIC), i.e., the arithmetical difference between the sum of both monaurally evoked ABRs and the binaurally evoked ABR, has been previously shown to be an objective indicator for binaural interaction.
This component is unfortunately known to be rather fragile and, as a result, a reliable, objective measure of binaural interaction in CI users does not exist to date. It is most likely that implantees would benefit from a reliable analysis of brainstem-level and subsequent higher-level binaural interaction, since this could objectively support fitting strategies with respect to a maximization of interaural integration. Therefore, we introduce a novel method capable of obtaining neural correlates of binaural interaction in bimodal CI users by combining recent advances in the field of fast, deconvolution-based ABR acquisitions with the introduced narrowband filtering technique. The proposed method shows a significant improvement in the magnitude of resulting BICs in 10 bimodal CI users and a control group of 10 normal hearing subjects when compensating for the interaural latency difference caused by the technical devices.
In total, both proposed studies objectively demonstrate technically driven interaural latency mismatches. Thus, they strongly emphasize the potential benefits of balancing these interaural delays to improve binaural processing, shown by significant increases in associated neural correlates of successful binaural interaction. These results, and the estimated latency differences, should be investigated in larger groups to further consolidate them, but they confirm the demand for binaural solutions rather than treating hearing losses in an isolated monaural manner
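The binaural interaction component defined in this abstract (the sum of the two monaurally evoked ABRs minus the binaurally evoked ABR) is simple to compute once the averaged waveforms are available. The sketch below uses synthetic Gaussian-shaped waveforms as stand-ins; the function name, sampling grid, and waveform shapes are assumptions, and the sign convention follows the wording of the abstract.

```python
import numpy as np


def binaural_interaction_component(abr_left, abr_right, abr_binaural):
    """BIC = (left monaural ABR + right monaural ABR) - binaural ABR.

    All inputs are averaged waveforms sampled on the same time grid.
    """
    left = np.asarray(abr_left, dtype=float)
    right = np.asarray(abr_right, dtype=float)
    binaural = np.asarray(abr_binaural, dtype=float)
    if not (left.shape == right.shape == binaural.shape):
        raise ValueError("all three responses must share the same shape")
    return (left + right) - binaural


# Synthetic stand-ins for averaged responses on a 0-10 ms grid (hypothetical).
t_ms = np.linspace(0.0, 10.0, 501)
left = np.exp(-((t_ms - 5.6) ** 2) / 0.1)
right = np.exp(-((t_ms - 5.8) ** 2) / 0.1)
# Toy binaural response that is smaller than the sum of the monaural ones.
binaural = 0.8 * (left + right)

bic = binaural_interaction_component(left, right, binaural)
print(f"peak BIC magnitude: {np.abs(bic).max():.3f} at {t_ms[np.abs(bic).argmax()]:.2f} ms")
```

Compensating an interaural latency difference, as the abstract describes, would amount to time-shifting one monaural response before forming the sum; that step is not shown here because the abstract does not specify how the shift was estimated.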
An asynchronous, low-power architecture for interleaved neural stimulation, using envelope and phase information
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2007. Includes bibliographical references (p. 122-124). This thesis describes a low-power cochlear-implant processor chip and a charge-balanced stimulation chip that together form a complete processing-and-stimulation cochlear-implant system. The processor chip uses a novel Asynchronous Interleaved Stimulation (AIS) algorithm that preserves phase and amplitude cues in its spectral input while simultaneously minimizing electrode interactions and lowering average stimulation power per electrode. The stimulator chip obviates the need for large D.C. blocking capacitors in neural implants to achieve highly precise charge-balanced stimulation, thus lowering the size and cost of the implant. Thus, this thesis suggests that significant performance, power and cost improvements in the current generation of cochlear implants may be simultaneously possible. The 16-channel ~90 square mm AIS processor chip was built in a 1.5 µm VLSI process and consumed 107 µW of power over and above that of its analog spectral processing front end, which consumed 250 µW and which has been previously described. The AIS processor was found to faithfully mimic MATLAB implementations of the AIS algorithm. Two perceptual tests of the AIS algorithm with normal-hearing listeners verified that AIS signal reconstructions enabled better melody and speech recognition in noise than traditional envelope-only vocoder simulations of cochlear-implant processing. The average firing rate of the AIS processor was found to be significantly lower than in traditional synchronous stimulators, suggesting that the AIS algorithm and processor can potentially save power and improve hearing performance in cochlear-implant users. The stimulator chip was built in a 0.7 µm high-voltage VLSI process and performed dynamic current balancing followed by a shorting phase. It achieved <6 nA of average DC current error, well below the targeted safety limit of 25 nA for cochlear-implant patients. On +6 and -9 V rails, the power consumption of a single channel of this chip was 47 µW when biasing power is shared by 16 channels. It puts out a charge-balanced stimulation pulse whenever it receives an asynchronous input signal from an AIS processor encoding phase information and 7-bit amplitude information, thus making the AIS processor chip and stimulator chip fully compatible in the cochlear-implant system. The AIS algorithm and charge-balancing circuits described in this work may be useful in other nerve-stimulation prosthetics where good fidelity in input-information encoding, minimization of electrode interactions, low-power strategies for stimulation, and compact charge-balanced stimulation are also important. By Ji-Jon Sit. Ph.D.