80 research outputs found

    Real-time detection of auditory : steady-state brainstem potentials evoked by auditory stimuli

    The auditory steady-state response (ASSR) has an advantage over other hearing assessment techniques because it provides objective, frequency-specific information. The objectives are to reduce the lengthy test duration and to improve the signal detection rate and the robustness of detection against background noise and unwanted artefacts. Two prominent state estimation techniques, the Luenberger observer and the Kalman filter, have been used in the development of the autonomous ASSR detection scheme. Both techniques can be implemented in real time; the challenges in applying them are the very poor SNR of ASSRs (which can be as low as −30 dB) and the unknown statistics of the noise. A dual-channel architecture is proposed: one channel estimates the sinusoid and the other estimates the background noise. Simulation and experimental studies were also conducted to evaluate the performance of the developed ASSR detection scheme and to compare the new method with conventional techniques. In general, both state estimation techniques within the detection scheme produced results comparable to those of the conventional techniques, but achieved a significant reduction in measurement time in some cases. A guide is given for the determination of the observer gains, while an adaptive algorithm has been used to adjust the gains in the Kalman filters. To enhance the robustness of the ASSR detection scheme with adaptive Kalman filters against possible artefacts (outliers), a multisensory data fusion approach is used to combine the standard mean operation and the median operation in the ASSR detection algorithm. In addition, self-tuned statistical thresholding using a regression technique is applied in the autonomous ASSR detection scheme. The scheme with adaptive Kalman filters is capable of estimating the variances of the system and background noise to improve the ASSR detection rate.
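As a rough illustration of the sinusoid-estimating channel described above, the following is a minimal sketch of a Kalman filter that tracks the cosine and sine amplitudes of a sinusoid at a known frequency in heavy noise. The random-walk state model, the fixed noise variances `q` and `r`, and the function name `kalman_sinusoid` are assumptions made for this example; the abstract does not specify the thesis's actual model, and its noise channel and adaptive gain adjustment are not reproduced here.

```python
import math
import random


def kalman_sinusoid(samples, freq_hz, fs_hz, q=1e-8, r=1.0):
    """Track the cosine/sine amplitudes (a, b) of a sinusoid at a known frequency.

    Assumed model for this sketch:
        state        x_k = x_{k-1} + w_k,    w_k ~ N(0, q*I)
        measurement  y_k = h_k . x_k + v_k,  v_k ~ N(0, r)
        with h_k = [cos(w*k), sin(w*k)].
    Returns (a, b, amplitude).
    """
    a, b = 0.0, 0.0                    # state estimate
    p11, p12, p22 = 1.0, 0.0, 1.0      # symmetric 2x2 error covariance
    w = 2.0 * math.pi * freq_hz / fs_hz
    for k, y in enumerate(samples):
        # Predict: random-walk state, so only the covariance grows.
        p11 += q
        p22 += q
        # Update with the scalar measurement y.
        c, s = math.cos(w * k), math.sin(w * k)
        ph1 = p11 * c + p12 * s        # first element of P h
        ph2 = p12 * c + p22 * s        # second element of P h
        innov_var = c * ph1 + s * ph2 + r
        k1, k2 = ph1 / innov_var, ph2 / innov_var
        innov = y - (a * c + b * s)
        a += k1 * innov
        b += k2 * innov
        # Covariance update: P = P - K (P h)^T.
        p11 -= k1 * ph1
        p12 -= k1 * ph2
        p22 -= k2 * ph2
    return a, b, math.hypot(a, b)
```

Calling `kalman_sinusoid(recording, 40.0, 1000.0)` on a list of samples would return the two quadrature amplitudes and their magnitude; a detection scheme would then compare that magnitude with a noise-based threshold, which is outside the scope of this sketch.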

    Efficient Acquisition and Denoising of Full-Range Event-Related Potentials Following Transient Stimulation of the Auditory Pathway

    This body of work relates to recent advances in the field of human auditory event-related potentials (ERP), specifically fast, deconvolution-based ERP acquisition as well as single-response-based preprocessing, denoising and subsequent analysis methods. Its goal is to contribute a cohesive set of methods facilitating the fast, reliable acquisition of the whole electrophysiological response generated by the auditory pathway from the brainstem to the cortex following transient acoustical stimulation. The present manuscript is divided into three sequential areas of investigation. First, the general feasibility of simultaneously acquiring auditory brainstem, middle-latency and late ERP single responses is demonstrated using recordings from 15 normal-hearing subjects. Favourable acquisition parameters (i.e., sampling rate, bandpass filter settings and interstimulus intervals) are established, followed by signal analysis of the resulting ERP in terms of their dominant intrinsic scales to determine the properties of an optimal signal representation with maximally reduced sample count by means of nonlinear resampling on a logarithmic timebase. This way, a compression ratio of 16.59 is achieved. Time-scale analysis of the linear-time and logarithmic-time ERP single responses is employed to demonstrate that no important information is lost during compressive resampling, which is additionally supported by a comparative evaluation of the resulting average waveforms: all prominent waves remain visible, with their characteristic latencies and amplitudes remaining essentially unaffected by the resampling process. The linear-time and resampled logarithmic-time signal representations are comparatively investigated regarding their susceptibility to the types of physiological and technical noise frequently contaminating ERP recordings.
While in principle there already exists a plethora of well-investigated approaches to the denoising of ERP single-response representations to improve signal quality and/or reduce necessary acquisition times, the substantially altered noise characteristics of the resampled logarithmic-time single-response representations, as opposed to their linear-time equivalents, necessitate a reevaluation of the available methods on this type of data. Additionally, two novel, efficient denoising algorithms, based on transform coefficient manipulation in the sinogram domain and on an analytic, discrete wavelet filterbank, are proposed and subjected to a comparative performance evaluation together with two established denoising methods. To facilitate a thorough comparison, the real-world ERP dataset obtained in the first part of this work is employed alongside synthetic data generated using a phenomenological ERP model evaluated at different signal-to-noise ratios (SNR), with individual gains in multiple outcome metrics being used to objectively assess algorithm performance. Results suggest that the proposed denoising algorithms substantially outperform the state-of-the-art methods in terms of the employed outcome metrics as well as their respective processing times. Furthermore, an efficient stimulus sequence optimization method for use with deconvolution-based ERP acquisition methods is introduced, which achieves consistent noise attenuation within a broad designated frequency range. A novel stimulus presentation paradigm for the fast, interleaved acquisition of auditory brainstem, middle-latency and late responses, featuring alternating periods of optimized, high-rate deconvolution sequences and subsequent low-rate stimulation, is proposed and investigated in 20 normal-hearing subjects.
Deconvolved sequence responses containing early and middle-latency ERP components are fused with subsequent late responses using a time-frequency resolved weighted averaging method based on cross-trial regularity, yielding a uniform SNR of the full-range auditory ERP across investigated timescales. Obtained average ERP waveforms exhibit morphologies consistent with both literature values and the reference recordings obtained in the first part of this manuscript, with all prominent waves being visible in the grand average waveforms. The novel stimulation approach cuts acquisition time by a factor of 3.4 while at the same time yielding a substantial gain in the SNR of obtained ERP data. Results suggest that the proposed interleaved stimulus presentation and associated postprocessing methodology are suitable for the fast, reliable extraction of full-range neural correlates of auditory processing in future studies.
Bundesministerium für Bildung und Forschung | Bimodal Fusion - Eine neurotechnologische Optimierungsarchitektur für integrierte bimodale Hörsysteme | 2016-201
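The idea of resampling a response onto a logarithmic timebase can be sketched as follows. This is only an illustration under stated assumptions: the grid construction, the use of plain linear interpolation without any anti-aliasing filter, and the names `log_time_grid`, `resample_linear` and `compression_ratio` are choices made for this example and are not taken from the manuscript.

```python
import math


def log_time_grid(t_start, t_end, n_points):
    """Return n_points time stamps spaced evenly in log(t) from t_start to t_end."""
    ratio = (t_end / t_start) ** (1.0 / (n_points - 1))
    return [t_start * ratio ** i for i in range(n_points)]


def resample_linear(signal, fs_hz, times):
    """Linearly interpolate a uniformly sampled signal at arbitrary time stamps.

    Time stamps outside the recording are clamped to its first or last sample.
    """
    last = len(signal) - 1
    out = []
    for t in times:
        pos = min(max(t * fs_hz, 0.0), float(last))
        i = min(int(pos), last - 1)
        frac = pos - i
        out.append((1.0 - frac) * signal[i] + frac * signal[i + 1])
    return out


def compression_ratio(n_uniform, n_log):
    """Ratio of sample counts before and after resampling."""
    return n_uniform / n_log
```

For example, a 0.5 s response sampled at 10 kHz (5000 samples) resampled onto a 300-point grid from 1 ms to 0.4 s gives a compression ratio of about 16.7; these numbers are arbitrary and are not meant to reproduce the reported value of 16.59.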

    Methods of Optimizing Speech Enhancement for Hearing Applications

    Speech intelligibility in hearing applications suffers from background noise. One of the most effective solutions is to develop speech enhancement algorithms based on the biological traits of the auditory system. In humans, the medial olivocochlear (MOC) reflex, which is an auditory neural feedback loop, improves signal-in-noise detection by suppressing the cochlear response to noise. The time constant is one of the key attributes of the MOC reflex, as it regulates the variation of suppression over time. Different time constants have been measured in nonhuman mammalian and human auditory systems. Physiological studies reported that the time constant of the nonhuman mammalian MOC reflex varies with changes in the properties (e.g. frequency, bandwidth) of the stimulation. A human-based study suggests that the time constant could vary when the bandwidth of the noise is changed. Previous works have developed MOC reflex models and successfully demonstrated the benefits of simulating the MOC reflex for speech-in-noise recognition. However, they often used fixed time constants, and the effect of different time constants on speech perception remains unclear. The main objectives of the present study are (1) to study the effect of the MOC reflex time constant on speech perception in different noise conditions; (2) to develop a speech enhancement algorithm with dynamic time constant optimization that adapts to varying noise conditions to improve speech intelligibility. The first part of this thesis studies the effect of the MOC reflex time constants on speech-in-noise perception. Conventional studies do not consider the relationship between the time constants and speech perception, as it is difficult to measure the speech intelligibility changes due to varying time constants in human subjects. We use a model to investigate the relationship by combining Meddis’ peripheral auditory model (which includes a MOC reflex) with an automatic speech recognition (ASR) system.
The effect of the MOC reflex time constant is studied by adjusting the time constant parameter of the model and testing the speech recognition accuracy of the ASR. Different time constants derived from human data are evaluated in both speech-like and non-speech-like noise at SNR levels from -10 dB to 20 dB and in the clean speech condition. The results show that long time constants (≥ 1000 ms) provide a greater improvement in speech recognition accuracy at SNR levels ≤ 10 dB. A maximum accuracy improvement of 40% (compared to the no-MOC condition) is shown in pink noise at an SNR of 10 dB. Short time constants (< 1000 ms) show recognition accuracy over 5% higher than the longer ones at SNR levels ≥ 15 dB. The second part of the thesis develops a novel speech enhancement algorithm based on the MOC reflex with a time constant that is dynamically optimized according to a lookup table for varying SNRs. The main contributions of this part include: (1) Existing SNR estimation methods are challenged by low SNR, nonstationary noise, and computational complexity. High computational complexity increases processing delay, which degrades intelligibility. A variance of spectral entropy (VSE) based SNR estimation method is developed, as entropy-based features have been shown to be more robust in cases of low SNR and nonstationary noise. The SNR is estimated from the estimated VSE-SNR relationship functions by measuring the VSE of noisy speech. Our proposed method has an accuracy 5 dB higher than other methods, especially in babble noise with few talkers (2 talkers) and at low SNR levels (< 0 dB), with an average processing time only about 30% of that of the noise power estimation based method. The proposed SNR estimation method is further improved by implementing a nonlinear filter-bank. The compression of the nonlinear filter-bank is shown to increase the stability of the relationship functions.
As a result, the accuracy is improved by up to 2 dB in all types of tested noise. (2) A modification of Meddis’ MOC reflex model with a time constant dynamically optimized against varying SNRs is developed. The model includes a simulated inner hair cell response to reduce the model complexity, and now includes the SNR estimation method. Previous MOC reflex models often have fixed time constants that do not adapt to varying noise conditions, whilst our modified MOC reflex model has a time constant dynamically optimized according to the estimated SNRs. The results show a speech recognition accuracy 8% higher than that of the model using a fixed time constant of 2000 ms in different types of noise. (3) A speech enhancement algorithm is developed based on the modified MOC reflex model and implemented in an existing hearing aid system. The performance is evaluated by measuring the objective speech intelligibility metric of processed noisy speech. In different types of noise, the proposed algorithm increases intelligibility by at least 20% in comparison to unprocessed noisy speech at SNRs between 0 dB and 20 dB, and by over 15% in comparison to noisy speech processed using the original MOC-based algorithm in the hearing aid.
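The variance of spectral entropy (VSE) feature mentioned in contribution (1) could be computed roughly as in the sketch below. The frame length, the non-overlapping rectangular frames, the direct DFT and all function names are assumptions for illustration; the abstract does not give the thesis's framing or filter-bank details, and the VSE-SNR relationship functions used to map the feature to an SNR estimate are not reproduced.

```python
import cmath
import math


def power_spectrum(frame):
    """One-sided power spectrum of a real frame via a direct DFT (O(n^2), fine for a sketch)."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * m / n) for m, x in enumerate(frame))
        spec.append(abs(acc) ** 2)
    return spec


def spectral_entropy(frame):
    """Shannon entropy (in bits) of the frame's normalised power spectrum."""
    spec = power_spectrum(frame)
    total = sum(spec)
    if total == 0.0:
        return 0.0
    return -sum((p / total) * math.log2(p / total) for p in spec if p > 0.0)


def variance_of_spectral_entropy(signal, frame_len=64):
    """Population variance of per-frame spectral entropy over non-overlapping frames."""
    entropies = [
        spectral_entropy(signal[i:i + frame_len])
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]
    mean = sum(entropies) / len(entropies)
    return sum((e - mean) ** 2 for e in entropies) / len(entropies)
```

A frame with energy in a single frequency bin has entropy near zero, while a frame with a flat spectrum has the maximum entropy, so a signal whose frames alternate between the two has a large VSE and a signal with identical frames has a VSE of zero.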

    Coding Strategies for Cochlear Implants Under Adverse Environments

    Cochlear implants are electronic prosthetic devices that restore partial hearing in patients with severe to profound hearing loss. Although most coding strategies have significantly improved the perception of speech in quiet listening conditions, there remain limitations on speech perception under adverse environments such as background noise, reverberation and band-limited channels. We propose strategies that improve the intelligibility of speech transmitted over telephone networks, reverberated speech and speech in the presence of background noise. For telephone-processed speech, we propose to examine the effects of adding low-frequency and high-frequency information to the band-limited telephone speech. Four listening conditions were designed to simulate the receiving frequency characteristics of telephone handsets. Results indicated improvement in cochlear implant and bimodal listening when telephone speech was augmented with high-frequency information; this study therefore provides support for the design of algorithms that extend the bandwidth towards higher frequencies. The results also indicated added benefit from hearing aids for bimodal listeners in all four types of listening conditions. Speech understanding in acoustically reverberant environments is always a difficult task for hearing-impaired listeners. Reverberated sounds consist of direct sound, early reflections and late reflections. Late reflections are known to be detrimental to speech intelligibility. In this study, we propose a reverberation suppression strategy based on spectral subtraction (SS) to suppress the reverberant energy from late reflections. Results from listening tests for two reverberant conditions (RT60 = 0.3 s and 1.0 s) indicated significant improvement when stimuli were processed with the SS strategy.
The proposed strategy operates with little to no prior information on the signal and the room characteristics and can therefore potentially be implemented in real-time CI speech processors. For speech in background noise, we propose a mechanism underlying the contribution of harmonics to the benefit of electroacoustic stimulation in cochlear implants. The proposed strategy is based on harmonic modeling and uses a synthesis-driven approach to synthesize the harmonics in voiced segments of speech. Based on objective measures, results indicated improvement in speech quality. This study warrants further work on the development of algorithms to regenerate harmonics of voiced segments in the presence of noise.

    Adaptive techniques for signal enhancement in the human electroencephalogram

    This thesis describes an investigation of adaptive noise cancelling applied to human brain evoked potentials (EPs), with particular emphasis on visually evoked responses. The chief morphological features and signal properties of EPs are described. Consideration is given to the amplitude and spectral properties of the underlying spontaneous electroencephalogram and the importance of noise reduction techniques in EP studies is emphasised. A number of methods of enhancing EP waveforms are reviewed in the light of the known limitations of coherent signal averaging. These are shown to be generally inadequate for enhancing individual EP responses. The theory of adaptive filters is reviewed with particular reference to adaptive transversal filters using the Widrow-Hoff algorithm. The theory of adaptive noise cancelling using correlated reference sources is presented, and new work is described which relates canceller performance to the magnitude-squared coherence function of the input signals. A novel filter structure, the gated adaptive filter, is presented and shown to yield improved cancellation without signal distortion when applied to repetitive transient signals in stationary noise under the condition of fast adaption. The signal processing software available is shown to be inadequate, and a comprehensive Fortran program developed for use on a PDP-11 computer is described. The properties of human visual evoked potentials and the EEG are investigated in two normal adults using a montage of 7 occipital electrodes. Signal enhancement of EPs is shown to be possible by adaptive noise cancelling, and improvements in signal to noise in the range 2-10 dB are predicted. A discussion of filter strategies is presented, and a detailed investigation of adaptive noise cancelling performed using a range of typical EP data. Assessment of the results confirms the proposal that substantial improvement in single EP response recognition is achieved by this technique.
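Adaptive noise cancelling with a transversal filter and the Widrow-Hoff (LMS) update, as referred to above, can be sketched in a few lines. This is the plain textbook canceller only; the gated adaptive filter proposed in the thesis is not described in enough detail here to reproduce. The function name, the tap count and the step size `mu` are assumptions chosen for the example.

```python
import math
import random


def lms_noise_canceller(primary, reference, n_taps=4, mu=0.01):
    """Adaptive noise cancelling with a transversal filter and the Widrow-Hoff LMS update.

    primary   : signal plus noise (the recording of interest)
    reference : noise correlated with the noise in `primary` and
                uncorrelated with the signal
    Returns (output, weights), where output is the error sequence,
    i.e. the canceller's estimate of the signal.
    """
    weights = [0.0] * n_taps
    taps = [0.0] * n_taps              # most recent reference samples, newest first
    output = []
    for d, x in zip(primary, reference):
        taps = [x] + taps[:-1]
        noise_est = sum(w * u for w, u in zip(weights, taps))
        err = d - noise_est            # what remains after removing the noise estimate
        # Widrow-Hoff update: move the weights along the instantaneous gradient.
        weights = [w + 2.0 * mu * err * u for w, u in zip(weights, taps)]
        output.append(err)
    return output, weights
```

The step size trades convergence speed against steady-state misadjustment: a larger `mu` adapts faster but leaves more residual noise in the output and risks instability.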

    Spatio-Temporal Approaches to Denoising and Feature Extraction in Rapid Image Triage

    Ph.D, Doctor of Philosophy

    Spatial Filtering of Magnetoencephalographic Data in Spherical Harmonics Domain

    We introduce new spatial filtering methods in the spherical harmonics domain for constraining magnetoencephalographic (MEG) multichannel measurements to user-specified spherical regions of interest (ROI) inside the head. The main idea of the spatial filtering is to emphasize those signals arising from an ROI, while suppressing the signals coming from outside the ROI. We exploit a well-known method called the signal space separation (SSS), which can decompose MEG data into a signal component generated by neurobiological sources and a noise component generated by external sources outside the head. The novel methods presented in this work, expanded SSS (exSSS) and generalized expanded SSS (genexSSS), utilize a beamspace optimization criterion in order to linearly transform the inner signal SSS coefficients to represent the sources belonging to the ROI. The filters mainly depend on the radius and the center of the ROI. The simplicity of the derived formulations of our methods stems from the natural appropriateness to the spherical domain and the orthogonality properties of the SSS basis functions that are intimately related to the vector spherical harmonics. Thus, unlike the traditional MEG spatial filtering techniques, exSSS and genexSSS do not need any numerical computation procedures on discretized headspace. The validation and performance of the algorithms are demonstrated by experiments utilizing both simulated and real MEG data.

    ERP source tracking and localization from single trial EEG MEG signals

    Electroencephalography (EEG) and magnetoencephalography (MEG), which are two of a number of neuroimaging techniques, are scalp recordings of the electrical activity of the brain. EEG and MEG (E/MEG) have excellent temporal resolution, are easy to acquire, and have a wide range of applications in science, medicine and engineering. These valuable signals, however, suffer from poor spatial resolution and in many cases from very low signal-to-noise ratios. In this study, new computational methods for analyzing and improving the quality of E/MEG signals are presented. We mainly focus on single-trial event-related potential (ERP) estimation and E/MEG dipole source localization. Several methods based on particle filtering (PF) are proposed. First, a method using PF for single-trial estimation of ERP signals is considered. In this method, the wavelet coefficients of each ERP are assumed to follow a Markovian process and not to change extensively across trials. The wavelet coefficients are then estimated recursively using PF. The results for both simulations and real data are compared with those of the well-known Kalman filtering (KF) approach. In the next method we move from single-trial estimation to source localization of E/MEG signals. The beamforming (BF) approach for dipole source localization is generalized based on prior information about the noise. BF is in fact a spatial filter that minimizes the power of all the signals at the output of the filter except those that come from the locations of interest. In the proposed method, using two more constraints than in the classical BF formulation, the output noise powers are minimized and the interference activities are stopped.
    EThOS - Electronic Theses Online Service, GB, United Kingdom
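The classical beamformer that the abstract takes as its starting point can be illustrated with the standard unit-gain minimum-variance solution, w = R^-1 l / (l^T R^-1 l), where R is the sensor covariance and l the lead field of the location of interest. The sketch below assumes a fixed-orientation (vector-valued) lead field and a small, well-conditioned covariance matrix; the two additional constraints of the proposed method are not specified in the abstract and are not implemented. All function names are hypothetical.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def lcmv_weights(cov, leadfield):
    """Unit-gain minimum-variance weights: w = R^-1 l / (l^T R^-1 l)."""
    r_inv_l = solve(cov, leadfield)
    denom = sum(l * v for l, v in zip(leadfield, r_inv_l))
    return [v / denom for v in r_inv_l]


def output_power(weights, cov):
    """Beamformer output power w^T R w."""
    return sum(
        wi * cov[i][j] * wj
        for i, wi in enumerate(weights)
        for j, wj in enumerate(weights)
    )
```

The weights pass activity from the location of interest with unit gain while minimising the total output power, so any other weight vector that satisfies the same unit-gain constraint yields an output power at least as large.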