
    Linking Speech Perception and Neurophysiology: Speech Decoding Guided by Cascaded Oscillators Locked to the Input Rhythm

    The premise of this study is that current models of speech perception, which are driven by acoustic features alone, are incomplete, and that the role of decoding time during memory access must be incorporated to account for the patterns of observed recognition phenomena. It is postulated that decoding time is governed by a cascade of neuronal oscillators, which guide template-matching operations at a hierarchy of temporal scales. Cascaded cortical oscillations in the theta, beta, and gamma frequency bands are argued to be crucial for speech intelligibility. Intelligibility is high so long as these oscillations remain phase-locked to the auditory input rhythm. A model (Tempo) is presented which is capable of emulating recent psychophysical data on the intelligibility of spoken sentences as a function of "packaging" rate (Ghitza and Greenberg, 2009). The data show that the intelligibility of speech time-compressed by a factor of 3 (i.e., a high syllabic rate) is poor (above 50% word error rate), but is substantially restored when the information stream is re-packaged by inserting silent gaps between successive compressed-signal intervals. This counterintuitive finding is difficult to explain with classical models of speech perception but emerges naturally from the Tempo architecture.
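
    The "re-packaging" manipulation above is easy to prototype. Below is a minimal sketch, assuming 16 kHz audio, 40 ms segments, and 80 ms silent gaps (illustrative values, not necessarily those used by Ghitza and Greenberg, 2009); the 3x time compression itself would normally be done with a pitch-preserving method such as WSOLA, which is not shown here.

```python
import numpy as np

def repackage(compressed, fs, segment_ms=40.0, gap_ms=80.0):
    """Split a time-compressed signal into short segments and insert
    silent gaps between them (illustrative parameter values)."""
    seg_len = int(round(fs * segment_ms / 1000.0))
    gap = np.zeros(int(round(fs * gap_ms / 1000.0)))
    pieces = []
    for start in range(0, len(compressed), seg_len):
        pieces.append(compressed[start:start + seg_len])
        pieces.append(gap)
    return np.concatenate(pieces)

# Example with a 1 s stand-in for speech already compressed by a factor of 3
fs = 16000
x_compressed = np.random.randn(fs)
x_repackaged = repackage(x_compressed, fs)
```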

    Listening in large rooms: a neurophysiological investigation of acoustical conditions that influence speech intelligibility

    Thesis (M.S.), Massachusetts Institute of Technology, Whitaker College of Health Sciences and Technology, 1997. Includes bibliographical references (p. 34-37). By Benjamin Michael Hammond, M.S.

    The effects of auditory contrast tuning upon speech intelligibility

    We have previously identified neurons tuned to the spectral contrast of wideband sounds in the auditory cortex of awake marmoset monkeys. Because additive noise alters the spectral contrast of speech, contrast-tuned neurons, if present in human auditory cortex, may aid in extracting speech from noise. Given that this cortical function may be underdeveloped in individuals with sensorineural hearing loss, incorporating biologically inspired algorithms into external signal-processing devices could provide speech-enhancement benefits to cochlear implantees. In this study we first constructed a computational signal-processing algorithm to mimic auditory cortex contrast tuning. We then manipulated the shape of the contrast channels and evaluated the intelligibility of reconstructed noisy speech using a metric that predicts cochlear implant users' perception. Candidate speech-enhancement strategies were then tested in cochlear implantees with a hearing-in-noise test. Accentuating intermediate contrast values, or all contrast values, improved computed intelligibility. Cochlear implant subjects showed significant improvement in noisy speech intelligibility with a contrast-shaping procedure.
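
    The contrast-shaping idea can be illustrated with a generic sketch (this is not the authors' algorithm): compute a spectrogram, measure each channel's deviation from a locally smoothed spectrum, amplify that deviation, and resynthesize. The gain, smoothing width, and FFT size below are assumptions chosen only for illustration.

```python
import numpy as np
from scipy.signal import stft, istft

def shape_spectral_contrast(x, fs, gain=2.0, smooth_bins=9, nperseg=512):
    """Generic spectral-contrast shaping sketch (not the paper's algorithm):
    boost each channel's deviation from a locally smoothed log spectrum."""
    f, t, Z = stft(x, fs, nperseg=nperseg)
    logmag = np.log(np.abs(Z) + 1e-10)
    kernel = np.ones(smooth_bins) / smooth_bins
    # local spectral mean: moving average across frequency, per time frame
    smooth = np.apply_along_axis(
        lambda col: np.convolve(col, kernel, mode="same"), 0, logmag)
    contrast = logmag - smooth                  # per-channel spectral contrast
    shaped = smooth + gain * contrast           # accentuate contrast
    Z_shaped = np.exp(shaped) * np.exp(1j * np.angle(Z))
    _, y = istft(Z_shaped, fs, nperseg=nperseg)
    return y
```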

    Investigating the Neural Basis of Audiovisual Speech Perception with Intracranial Recordings in Humans

    Speech is inherently multisensory, containing auditory information from the voice and visual information from the mouth movements of the talker. Hearing the voice is usually sufficient to understand speech; however, in noisy environments or when audition is impaired due to aging or disabilities, seeing mouth movements greatly improves speech perception. Although behavioral studies have firmly established this perceptual benefit, it is still not clear how the brain processes visual information from mouth movements to improve speech perception. To clarify this issue, I studied the neural activity recorded from the brain surfaces of human subjects using intracranial electrodes, a technique known as electrocorticography (ECoG). First, I studied responses to noisy speech in the auditory cortex, specifically in the superior temporal gyrus (STG). Previous studies identified the anterior parts of the STG as unisensory, responding only to auditory stimuli. On the other hand, posterior parts of the STG are known to be multisensory, responding to both auditory and visual stimuli, which makes it a key region for audiovisual speech perception. I examined how these different parts of the STG respond to clear versus noisy speech. I found that noisy speech decreased the amplitude and increased the across-trial variability of the response in the anterior STG. However, possibly due to its multisensory composition, the posterior STG was not as sensitive to auditory noise as the anterior STG and responded similarly to clear and noisy speech. I also found that these two response patterns in the STG were separated by a sharp boundary demarcated by the posterior-most portion of Heschl's gyrus. Second, I studied responses to silent speech in the visual cortex. Previous studies demonstrated that the visual cortex shows response enhancement when the auditory component of speech is noisy or absent; however, it was not clear which regions of the visual cortex specifically show this response enhancement and whether it is a result of top-down modulation from a higher region. To test this, I first mapped the receptive fields of different regions in the visual cortex and then measured their responses to visual (silent) and audiovisual speech stimuli. I found that visual regions that have central receptive fields show greater response enhancement to visual speech, possibly because these regions receive more visual information from mouth movements. I found similar response enhancement to visual speech in the frontal cortex, specifically in the inferior frontal gyrus, premotor and dorsolateral prefrontal cortices, which have been implicated in speechreading in previous studies. I showed that these frontal regions display strong functional connectivity with visual regions that have central receptive fields during speech perception.

    The effects of monaural and binaural cues on perceived reverberation by normal hearing and hearing-impaired listeners.

    This dissertation is a quantitative and qualitative examination of how young normal-hearing and young hearing-impaired listeners perceive reverberation. A primary complaint among hearing-impaired listeners is difficulty understanding speech in noisy or reverberant environments. This work was motivated by a desire to better understand reverberation perception and processing so that this knowledge might be used to improve outcomes for hearing-impaired listeners in these environments. This dissertation is written in six chapters. Chapter One is an introduction to the field and a review of the relevant literature. Chapter Two describes a motivating experiment from laboratory work completed before the dissertation. This experiment asked human subjects to rate the amount of reverberation they perceived in a sound relative to another sound, and it showed a significant effect of listening condition on how listeners made their judgments. Chapter Three follows up on this experiment, seeking a better understanding of how listeners perform the task in Chapter Two, and shows that listeners can use limited information to make their judgments. Chapter Four compares reverberation perception in normal-hearing and hearing-impaired listeners and examines the effect of speech intelligibility on reverberation perception. This experiment finds no significant differences between the cues used by normal-hearing and hearing-impaired listeners when judging perceptual aspects of reverberation. Chapter Five describes and uses a quantitative model to examine the results of Chapters Two and Four. Chapter Six summarizes the data presented in the dissertation and discusses potential implications and future directions. This work finds that the perceived amount of reverberation relies primarily on two factors: 1) the listening condition (i.e., binaural, monaural, or a condition in which reverberation is present in only one ear) and 2) the sum of the reverberant energy present at the two ears. Listeners do not need the reverberant tail to estimate the perceived amount of reverberation, meaning that they are able to extract information about reverberation from the ongoing signal. The precise mechanism underlying this process is not explicitly identified in this work; however, a potential framework is presented in Chapter Six.

    Communications Biophysics

    Contains reports on seven research projects split into three sections. National Institutes of Health (Grant 5 PO1 NS13126); National Institutes of Health (Grant 1 RO1 NS18682); National Institutes of Health (Training Grant 5 T32 NS07047); National Science Foundation (Grant BNS77-16861); National Institutes of Health (Grant 1 F33 NS07202-01); National Institutes of Health (Grant 5 RO1 NS10916); National Institutes of Health (Grant 5 RO1 NS12846); National Institutes of Health (Grant 1 RO1 NS16917); National Institutes of Health (Grant 1 RO1 NS14092-05); National Science Foundation (Grant BNS 77 21751); National Institutes of Health (Grant 5 R01 NS11080); National Institutes of Health (Grant GM-21189).

    The nicotinic receptor of cochlear hair cells: A possible pharmacotherapeutic target?

    Mechanosensory hair cells of the organ of Corti transmit information regarding sound to the central nervous system by way of peripheral afferent neurons. In return, the central nervous system provides feedback and modulates the afferent stream of information through efferent neurons. The medial olivocochlear efferent system makes direct synaptic contacts with outer hair cells and inhibits amplification brought about by the active mechanical process inherent to these cells. This feedback system offers the potential to improve the detection of signals in background noise, to selectively attend to particular signals, and to protect the periphery from damage caused by overly loud sounds. Acetylcholine released at the synapse between efferent terminals and outer hair cells activates a peculiar nicotinic cholinergic receptor subtype, the α9α10 receptor. At present no pharmacotherapeutic approaches have been designed that target this cholinergic receptor to treat pathologies of the auditory system. The potential use of α9α10-selective drugs in conditions such as noise-induced hearing loss, tinnitus and auditory processing disorders is discussed.
    Affiliations: Elgoyhen, Ana Belen: Consejo Nacional de Investigaciones Científicas y Técnicas, Instituto de Investigaciones en Ingeniería Genética y Biología Molecular "Dr. Héctor N. Torres", Argentina; Universidad de Buenos Aires, Facultad de Medicina, Departamento de Farmacología, Argentina. Katz, Eleonora: Consejo Nacional de Investigaciones Científicas y Técnicas, Instituto de Investigaciones en Ingeniería Genética y Biología Molecular "Dr. Héctor N. Torres", Argentina; Universidad de Buenos Aires, Facultad de Ciencias Exactas y Naturales, Departamento de Fisiología, Biología Molecular y Celular, Argentina. Fuchs, Paul A.: The Johns Hopkins University School of Medicine, United States.

    Predicting Speech Intelligibility

    Hearing impairment, and specifically sensorineural hearing loss, is an increasingly prevalent condition, especially amongst the ageing population. It occurs primarily as a result of damage to hair cells that act as sound receptors in the inner ear and causes a variety of hearing perception problems, most notably a reduction in speech intelligibility. Accurate diagnosis of hearing impairments is a time-consuming process and is complicated by the reliance on indirect measurements based on patient feedback, due to the inaccessible nature of the inner ear. The challenges of designing hearing aids to counteract sensorineural hearing losses are further compounded by the wide range of severities and symptoms experienced by hearing-impaired listeners. Computer models of the auditory periphery have been developed, based on phenomenological measurements from auditory-nerve fibres using a range of test sounds and varied conditions. It has been demonstrated that auditory-nerve representations of vowels in normal and noise-damaged ears can be ranked by a subjective visual inspection of how the impaired representations differ from the normal. This thesis seeks to expand on this procedure to use full word tests rather than single vowels, and to replace manual inspection with an automated approach using a quantitative measure. It presents a measure that can predict speech intelligibility in a consistent and reproducible manner. This new approach has practical applications, as it could allow speech-processing algorithms for hearing aids to be objectively tested in early-stage development without having to resort to extensive human trials. Simulated hearing tests were carried out by substituting real listeners with the auditory model. A range of signal processing techniques were used to measure the model's auditory-nerve outputs by presenting them spectro-temporally as neurograms. A neurogram similarity index measure (NSIM) was developed that allowed the impaired outputs to be compared to a reference output from a normal-hearing listener simulation. A simulated listener test was developed, using standard listener test material, and was validated for predicting normal-hearing speech intelligibility in quiet and noisy conditions. Two types of neurograms were assessed: temporal fine structure (TFS), which retained spike timing information, and average discharge rate or temporal envelope (ENV). Tests were carried out to simulate a wide range of sensorineural hearing losses and the results were compared to real listeners' unaided and aided performance. Simulations to predict the speech intelligibility performance of the NAL-RP and DSL 4.0 hearing aid fitting algorithms were undertaken. The NAL-RP hearing aid fitting algorithm was adapted using a chimaera sound algorithm which aimed to improve the TFS speech cues available to aided hearing-impaired listeners. NSIM was shown to quantitatively rank neurograms with better performance than relative mean squared error and other similar metrics. Simulated performance intensity functions predicted speech intelligibility for normal and hearing-impaired listeners. The simulated listener tests demonstrated that NAL-RP and DSL 4.0 performed with similar speech intelligibility restoration levels. Using NSIM and a computational model of the auditory periphery, speech intelligibility can be predicted for both normal and hearing-impaired listeners, and novel hearing aids can be rapidly prototyped and evaluated prior to real listener tests.
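
    For orientation, NSIM compares a degraded neurogram against a normal-hearing reference using local image-similarity statistics. The sketch below is a generic SSIM-style similarity between two neurograms (2-D arrays of frequency bands by time bins); the window size and the constants C1 and C2 are assumptions, not the exact weighting developed in the thesis.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def neurogram_similarity(ref, deg, C1=0.01, C2=0.03, win=3):
    """SSIM-style similarity between a reference and a degraded neurogram
    (illustrative constants and window; not the thesis's exact NSIM)."""
    ref = ref.astype(float)
    deg = deg.astype(float)
    mu_r = uniform_filter(ref, win)
    mu_d = uniform_filter(deg, win)
    var_r = uniform_filter(ref * ref, win) - mu_r ** 2
    var_d = uniform_filter(deg * deg, win) - mu_d ** 2
    cov = uniform_filter(ref * deg, win) - mu_r * mu_d
    luminance = (2 * mu_r * mu_d + C1) / (mu_r ** 2 + mu_d ** 2 + C1)
    structure = (cov + C2 / 2) / (np.sqrt(np.abs(var_r * var_d)) + C2 / 2)
    return float(np.mean(luminance * structure))
```

    Values near 1 would indicate that an impaired or aided simulation preserves the reference neurogram's pattern; in this framework, lower values would be read as predicting poorer intelligibility.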

    Physiology-based model of multi-source auditory processing

    Our auditory systems have evolved to process a myriad of acoustic environments. In complex listening scenarios, we can tune our attention to one sound source (e.g., a conversation partner) while monitoring the entire acoustic space for cues we might be interested in (e.g., our names being called, or the fire alarm going off). While normal-hearing listeners handle complex listening scenarios remarkably well, hearing-impaired listeners experience difficulty even when wearing hearing-assist devices. This thesis presents both theoretical work in understanding the neural mechanisms behind this process and the application of neural models to segregate mixed sources and potentially help the hearing-impaired population. On the theoretical side, auditory spatial processing has been studied primarily up to the midbrain region, and studies have shown how individual neurons can localize sounds using spatial cues. Yet how higher brain regions such as the cortex use this information to process multiple sounds in competition is not clear. This thesis demonstrates a physiology-based spiking neural network model, which provides a mechanism illustrating how the auditory cortex may organize upstream spatial information when there are multiple competing sound sources in space. Based on this model, an engineering solution to help hearing-impaired listeners segregate mixed auditory inputs is proposed. Using the neural model to perform sound segregation in the neural domain, the neural outputs (representing the source of interest) are reconstructed back to the acoustic domain using a novel stimulus reconstruction method.
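
    The thesis's novel reconstruction method is not described in this abstract; as a rough orientation, the sketch below shows the standard linear stimulus-reconstruction idea used in this literature: fit a lagged linear map from neural responses back to the stimulus envelope with ridge regression. The function name, lag count, and regularization value are hypothetical.

```python
import numpy as np

def fit_reconstruction_filter(neural, envelope, lags=32, ridge=1e-2):
    """Generic linear stimulus-reconstruction sketch (not the thesis's method):
    neural is (channels x time), envelope is (time,); returns the
    ridge-regression weights and the reconstructed envelope."""
    n_ch, n_t = neural.shape
    X = np.zeros((n_t, n_ch * lags))
    for lag in range(lags):                      # build lagged design matrix
        shifted = np.roll(neural, lag, axis=1)
        shifted[:, :lag] = 0.0
        X[:, lag * n_ch:(lag + 1) * n_ch] = shifted.T
    w = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ envelope)
    return w, X @ w
```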