Acoustically driven cortical delta oscillations underpin prosodic chunking
Oscillation-based models of speech perception postulate a cortical computational principle by which decoding is performed within a window structure derived from a segmentation process. Segmentation of syllable-sized chunks is realized by a theta oscillator. We provide evidence for an analogous role of a delta oscillator in the segmentation of phrase-sized chunks. We recorded magnetoencephalography (MEG) in humans while participants performed a target-identification task. Random-digit strings, with phrase-long chunks of two digits, were presented at chunk rates of 1.8 Hz or 2.6 Hz, inside or outside the delta frequency band (defined here as 0.5–2 Hz). Strong periodicities were elicited by chunk rates inside delta in superior and middle temporal areas and in speech-motor integration areas. Periodicities were diminished or absent for chunk rates outside delta, in line with behavioral performance. Our findings show that prosodic chunking of phrase-sized acoustic segments is correlated with acoustically driven delta oscillations, expressing anatomically specific patterns of neuronal periodicities.
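The core manipulation above can be illustrated with a minimal sketch: classify a chunk rate against the study's delta band and confirm that a periodic stimulus envelope produces a spectral peak at the chunk rate. All function names here (`chunk_rate_in_delta`, `spectral_peak`) are hypothetical illustrations, not the study's analysis code:

```python
import numpy as np

DELTA_BAND = (0.5, 2.0)  # Hz, as defined in the study

def chunk_rate_in_delta(rate_hz, band=DELTA_BAND):
    """Return True if a presentation (chunk) rate falls inside the delta band."""
    return band[0] <= rate_hz <= band[1]

def spectral_peak(signal, fs):
    """Frequency (Hz) of the largest non-DC peak in a signal's amplitude spectrum."""
    spectrum = np.abs(np.fft.rfft(signal - signal.mean()))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    return freqs[np.argmax(spectrum)]

# Toy stimulus envelope: chunks presented at 1.8 Hz, sampled at 100 Hz for 20 s.
fs, dur, rate = 100, 20.0, 1.8
t = np.arange(0, dur, 1.0 / fs)
envelope = 1.0 + np.cos(2 * np.pi * rate * t)

peak = spectral_peak(envelope, fs)  # peak sits at the 1.8 Hz chunk rate
```

In this toy setup, 1.8 Hz falls inside the band and 2.6 Hz outside it, mirroring the two conditions compared in the study.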
Speaker Normalization Using Cortical Strip Maps: A Neural Model for Steady-State Vowel Categorization
Auditory signals of speech are speaker-dependent, but representations of language meaning are speaker-independent. The transformation from speaker-dependent to speaker-independent language representations enables speech to be learned and understood from different speakers. A neural model is presented that performs speaker normalization to generate a pitch-independent representation of speech sounds, while also preserving information about speaker identity. This speaker-invariant representation is categorized into unitized speech items, which input to sequential working memories whose distributed patterns can be categorized, or chunked, into syllable and word representations. The proposed model fits into an emerging model of auditory streaming and speech categorization. The auditory streaming and speaker normalization parts of the model both use multiple strip representations and asymmetric competitive circuits, thereby suggesting that these two circuits arose from similar neural designs. The normalized speech items are rapidly categorized and stably remembered by Adaptive Resonance Theory circuits. Simulations use synthesized steady-state vowels from the Peterson and Barney [J. Acoust. Soc. Am. 24, 175-184 (1952)] vowel database and achieve accuracy rates similar to those achieved by human listeners. These results are compared to behavioral data and other speaker normalization models.
National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624)
Speaker Normalization Using Cortical Strip Maps: A Neural Model for Steady-State Vowel Identification
Auditory signals of speech are speaker-dependent, but representations of language meaning are speaker-independent. Such a transformation enables speech to be understood from different speakers. A neural model is presented that performs speaker normalization to generate a pitch-independent representation of speech sounds, while also preserving information about speaker identity. This speaker-invariant representation is categorized into unitized speech items, which input to sequential working memories whose distributed patterns can be categorized, or chunked, into syllable and word representations. The proposed model fits into an emerging model of auditory streaming and speech categorization. The auditory streaming and speaker normalization parts of the model both use multiple strip representations and asymmetric competitive circuits, thereby suggesting that these two circuits arose from similar neural designs. The normalized speech items are rapidly categorized and stably remembered by Adaptive Resonance Theory circuits. Simulations use synthesized steady-state vowels from the Peterson and Barney [J. Acoust. Soc. Am. 24, 175-184 (1952)] vowel database and achieve accuracy rates similar to those achieved by human listeners. These results are compared to behavioral data and other speaker normalization models.
National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624)
Temporal Coding of Periodicity Pitch in the Auditory System: An Overview
This paper outlines a taxonomy of neural pulse codes and reviews neurophysiological evidence for interspike-interval-based representations of pitch and timbre in the auditory nerve and cochlear nucleus. Neural pulse codes can be divided into channel-based codes, temporal-pattern codes, and time-of-arrival codes. Timings of discharges in auditory nerve fibers reflect the time structure of acoustic waveforms, such that the interspike intervals that are produced precisely convey information concerning stimulus periodicities.
Population-wide interspike interval distributions are constructed by summing together intervals from the observed responses of many single Type I auditory nerve fibers. Features in such distributions correspond closely with pitches that are heard by human listeners. The most common all-order interval present in the auditory nerve array almost invariably corresponds to the pitch frequency, whereas the relative fraction of pitch-related intervals amongst all others qualitatively corresponds to the strength of the pitch. Consequently, many diverse aspects of pitch perception can be explained in terms of such temporal representations. Similar stimulus-driven temporal discharge patterns are observed in major neuronal populations of the cochlear nucleus. Population-interval distributions constitute an alternative time-domain strategy for representing sensory information that complements spatially organized sensory maps. Similar autocorrelation-like representations are possible in other sensory systems in which neural discharges are time-locked to stimulus waveforms.
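The population-interval idea can be sketched in a few lines: pool all-order interspike intervals across simulated fibers, histogram them, and read the pitch off the first major peak. This is a simplified toy model under stated assumptions (independent firing per stimulus cycle, Gaussian jitter), not the analysis used in the reviewed studies, and every name here is hypothetical:

```python
import numpy as np

def all_order_intervals(spike_times, max_interval):
    """All-order interspike intervals: positive differences between every
    pair of spikes in one fiber's train, up to max_interval seconds."""
    d = spike_times[None, :] - spike_times[:, None]
    d = d[d > 0]
    return d[d <= max_interval]

def population_interval_pitch(trains, bin_width=5e-4, max_interval=0.02,
                              prominence=0.7):
    """Pool all-order intervals across fibers into a population-interval
    distribution; take the pitch as the reciprocal of the first interval
    bin whose count reaches `prominence` of the histogram maximum."""
    pooled = np.concatenate([all_order_intervals(t, max_interval) for t in trains])
    edges = np.arange(bin_width / 2, max_interval, bin_width)
    hist, edges = np.histogram(pooled, bins=edges)
    first = np.nonzero(hist >= prominence * hist.max())[0][0]
    centre = (edges[first] + edges[first + 1]) / 2
    return 1.0 / centre

# Simulated fibers phase-locked to a 200 Hz tone: each 5 ms cycle fires
# with probability 0.5, plus a little spike-timing jitter.
rng = np.random.default_rng(1)
period = 1.0 / 200.0
trains = []
for _ in range(20):
    fires = rng.random(100) < 0.5
    trains.append(np.nonzero(fires)[0] * period + rng.normal(0, 1e-4, fires.sum()))

pitch = population_interval_pitch(trains)  # close to the 200 Hz fundamental
```

Because every fiber is locked to the same 5 ms cycle, the pooled histogram peaks at the stimulus period and its multiples; taking the first prominent peak recovers the fundamental, in the spirit of the autocorrelation-like representation described above.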
The temporal pattern of impulses in primary afferents analogously encodes touch and hearing information
An open question in neuroscience is the contribution of temporal relations between individual impulses in primary afferents to conveying sensory information. We investigated this question in touch and hearing, looking for a shared coding scheme. In both systems, we artificially induced temporally diverse afferent impulse trains and probed the evoked percepts in human subjects using psychophysical techniques.
First, we investigated whether the temporal structure of a fixed number of impulses conveys information about the magnitude of tactile intensity. We found that clustering the impulses into periodic bursts elicited graded increases in perceived intensity as a function of the number of impulses per burst, even though fewer afferents were recruited throughout the longer bursts.
The interval between successive bursts of peripheral neural activity (the burst-gap) has been demonstrated in our lab to be the most prominent temporal feature for coding skin vibration frequency, as opposed to either spike rate or periodicity. Second, given the similarities between the tactile and auditory systems, we explored the auditory system for an equivalent neural coding strategy. Using brief acoustic pulses, we showed that the burst-gap is a temporal code for pitch perception shared between the modalities.
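As a rough illustration of the burst-gap idea, the sketch below (hypothetical helper names, not the study's code) builds a periodic burst train and predicts perceived frequency from the silent interval between bursts rather than from the overall pulse rate or the within-burst spacing:

```python
import numpy as np

def burst_train(n_bursts, pulses_per_burst, intra_ms, gap_ms):
    """Pulse times (ms) for periodic bursts: `pulses_per_burst` pulses spaced
    `intra_ms` apart within a burst, with a silent `gap_ms` between bursts."""
    burst_len = (pulses_per_burst - 1) * intra_ms
    burst_period = burst_len + gap_ms
    times = [b * burst_period + p * intra_ms
             for b in range(n_bursts)
             for p in range(pulses_per_burst)]
    return np.array(times)

def burst_gap_hz(times, intra_ms):
    """Frequency predicted by a burst-gap code: the reciprocal of the silent
    interval between bursts (any inter-pulse interval clearly longer than the
    within-burst spacing)."""
    isis = np.diff(times)
    gaps = isis[isis > intra_ms * 1.5]
    return 1000.0 / gaps.mean()

# 10 bursts of 4 pulses at 2 ms spacing, separated by 25 ms silent gaps.
t = burst_train(n_bursts=10, pulses_per_burst=4, intra_ms=2.0, gap_ms=25.0)
predicted = burst_gap_hz(t, intra_ms=2.0)  # 1000 / 25 ms = 40 Hz
```

Note that the same 40 Hz prediction would hold even if the number of pulses per burst changed the overall spike rate, which is the contrast the burst-gap account draws against rate and periodicity codes.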
Following this evidence of parallels in temporal frequency processing, we next assessed the perceptual frequency equivalence between the two modalities using auditory and tactile pulse stimuli of simple and complex temporal features in cross-sensory frequency discrimination experiments. Identical temporal stimulation patterns in tactile and auditory afferents produced equivalent perceived frequencies, suggesting an analogous temporal frequency computation mechanism.
The new insights into encoding tactile intensity through clustering of fixed-charge electric pulses into bursts suggest a novel approach to conveying varying contact forces to neural-interface users, requiring no modulation of either stimulation current or base pulse frequency. Increasing control of the temporal patterning of pulses in cochlear implant users might improve pitch perception and speech comprehension. The perceptual correspondence between touch and hearing not only suggests the possibility of establishing cross-modal comparison standards for robust psychophysical investigations, but also supports the plausibility of cross-sensory substitution devices.
Neural Models of Subcortical Auditory Processing
An important feature of the auditory system is its ability to distinguish many simultaneous sound sources. The primary goal of this work was to understand how a robust, preattentive analysis of the auditory scene is accomplished by the subcortical auditory system. Reasonably accurate modelling of the morphology and organisation of the relevant auditory nuclei was seen as being of great importance. The formulation of plausible models and their subsequent simulation was found to be invaluable in elucidating biological processes and in highlighting areas of uncertainty.

In the thesis, a review of important aspects of mammalian auditory processing is presented and used as a basis for the subsequent modelling work. For each aspect of auditory processing modelled, psychophysical results are described and existing models reviewed before the models used here are described and simulated. The auditory processes modelled include the peripheral system and the production of tonotopic maps of the spectral content of complex acoustic stimuli and of modulation frequency or periodicity. A model of the formation of sequential associations between successive sounds is described, and the model is shown to be capable of emulating a wide range of psychophysical behaviour. The grouping of related spectral components and the development of pitch perception are also investigated. Finally, a critical assessment of the work and ideas for future developments are presented.

The principal contributions of this work are the further development of a model for pitch perception and the development of a novel architecture for the sequential association of grouped sounds. In the process of developing these ideas, further insights into subcortical auditory processing were gained, and explanations suggested for a number of puzzling psychophysical characteristics.
Royal Naval Engineering College, Manadon, Plymouth
Periodicity and frequency coding in human auditory cortex
Understanding the neural coding of pitch and frequency is fundamental to understanding speech comprehension, music perception and the segregation of concurrent sound sources. Neuroimaging has made important contributions to defining the pattern of frequency sensitivity in humans, but precisely how pitch sensitivity relates to these frequency-dependent regions remains unclear. Single-frequency tones cannot be used to dissociate the two, as their pitch always equals their frequency. Here, temporal pitch (periodicity) and frequency coding were dissociated using stimuli that were bandpassed in different frequency regions (centre frequencies 800 and 4500 Hz) yet matched in their pitch characteristics. Cortical responses to both pitch-evoking stimuli typically occurred within a region that was also responsive to low frequencies, extending across both primary and nonprimary auditory cortex. An additional control experiment demonstrated that this pitch-related effect was not simply caused by the generation of combination tones. Our findings support recent neurophysiological evidence for a cortical representation of pitch at the lateral border of the primary auditory cortex, while revealing new evidence that additional auditory fields are also likely to play a role in pitch coding.
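The stimulus logic of dissociating temporal pitch from spectral region can be sketched as follows. A click train at a low fundamental is bandpass-filtered around a high centre frequency, so the energy sits near the passband while the waveform still repeats at the fundamental's period. This is a generic illustration of the technique, not the study's actual stimulus-generation code, and all names are hypothetical:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def bandpassed_click_train(f0, centre, fs=32000, dur=0.5, bw_oct=0.5):
    """Click train with temporal pitch f0 (Hz), bandpass-filtered around
    `centre` (Hz), so spectral region and periodicity are set independently."""
    n = np.arange(int(dur * fs))
    clicks = (n % round(fs / f0) == 0).astype(float)
    lo, hi = centre * 2 ** (-bw_oct / 2), centre * 2 ** (bw_oct / 2)
    sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, clicks)

x = bandpassed_click_train(f0=100, centre=4500)

# Spectral energy sits near 4.5 kHz, yet the waveform repeats every 1/f0 s,
# so the evoked (temporal) pitch is dissociated from spectral frequency.
spec = np.abs(np.fft.rfft(x))
freqs = np.fft.rfftfreq(len(x), 1 / 32000)
centroid = (freqs * spec).sum() / spec.sum()
```

With matched filtering at the two centre frequencies used in the study (800 and 4500 Hz), such stimuli share pitch characteristics while occupying different cortical frequency regions.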
Effects of acoustic periodicity and intelligibility on the neural oscillations in response to speech
Although several studies have investigated neural oscillations in response to acoustically degraded speech, it is still a matter of debate which neural frequencies reflect speech intelligibility. Part of the problem is that effects of acoustics and intelligibility have so far not been considered independently. In the current electroencephalography (EEG) study, the amount of acoustic periodicity (i.e. the amount of time the stimulus sentences were voiced) was manipulated, while the listeners’ spoken responses were used to control for differences in intelligibility. Firstly, the total EEG power changes in response to completely aperiodic (noise-vocoded) speech and speech with a natural mix of periodicity and aperiodicity were almost identical, while an increase in theta power (5–6.3 Hz) and a trend for less beta power (11–18 Hz) were observed in response to completely periodic speech. These two effects are taken to indicate an information-processing conflict caused by the unnatural acoustic properties of the stimuli, and that the subjects may have internally rehearsed the sentences as a result. Secondly, we separately investigated effects of intelligibility by sorting the trials in the periodic condition according to the listeners’ spoken responses. The comparison of intelligible and largely unintelligible trials revealed that the total EEG power in the delta band (1.7–2.7 Hz) was markedly increased during the second half of the intelligible trials, which suggests that delta oscillations are an indicator of successful speech understanding.
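A band-power comparison of the kind reported above can be sketched with a Welch spectral estimate over the study's frequency bands. The function and band names are illustrative only, and the toy trial is simulated, not EEG data:

```python
import numpy as np
from scipy.signal import welch

# Frequency bands as reported in the study (Hz).
BANDS = {"delta": (1.7, 2.7), "theta": (5.0, 6.3), "beta": (11.0, 18.0)}

def band_power(signal, fs, band):
    """Mean Welch PSD of `signal` within a frequency band, a simple stand-in
    for the total-power measures compared between conditions."""
    freqs, psd = welch(signal, fs=fs, nperseg=fs * 2)
    lo, hi = band
    mask = (freqs >= lo) & (freqs <= hi)
    return psd[mask].mean()

# Toy 'trial': a 2 Hz (delta-band) oscillation plus broadband noise at 500 Hz.
rng = np.random.default_rng(0)
fs = 500
t = np.arange(0, 10, 1 / fs)
trial = np.sin(2 * np.pi * 2.0 * t) + 0.2 * rng.standard_normal(t.size)

delta = band_power(trial, fs, BANDS["delta"])
beta = band_power(trial, fs, BANDS["beta"])  # delta power dominates here
```

Sorting real trials by the listeners' spoken responses and comparing such band-power estimates between intelligible and unintelligible conditions is the analysis logic the abstract describes.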