
    Acoustically driven cortical delta oscillations underpin prosodic chunking

    Oscillation-based models of speech perception postulate a cortical computational principle by which decoding is performed within a window structure derived by a segmentation process. Segmentation of syllable-sized chunks is realized by a theta oscillator. We provide evidence for an analogous role of a delta oscillator in the segmentation of phrase-sized chunks. We recorded magnetoencephalography (MEG) in human participants while they performed a target identification task. Random-digit strings, with phrase-long chunks of two digits, were presented at chunk rates of 1.8 Hz or 2.6 Hz, inside or outside the delta frequency band (defined here as 0.5-2 Hz). Strong periodicities were elicited by chunk rates inside the delta band in superior and middle temporal areas and in speech-motor integration areas. Periodicities were diminished or absent for chunk rates outside delta, in line with behavioral performance. Our findings show that prosodic chunking of phrase-sized acoustic segments is correlated with acoustically driven delta oscillations expressing anatomically specific patterns of neuronal periodicities.
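    The periodicity measure described above can be made concrete with a short sketch. The sampling rate, synthetic trace, and analysis choices below are placeholder assumptions, not the authors' MEG pipeline; the sketch only shows how relative spectral power at the two chunk rates might be read out.

```python
# Illustrative sketch only (not the authors' MEG pipeline): estimate how strongly
# a neural trace follows a presentation rate by reading its power spectrum at
# that rate. Sampling rate, trace, and window are placeholder assumptions.
import numpy as np

def relative_power_at(trace, fs, rate_hz):
    """Normalized spectral power of a 1-D signal at the frequency rate_hz."""
    n = len(trace)
    spectrum = np.abs(np.fft.rfft(trace * np.hanning(n))) ** 2
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    idx = np.argmin(np.abs(freqs - rate_hz))     # spectral bin closest to the rate
    return spectrum[idx] / spectrum.sum()

fs = 600.0                                       # assumed MEG sampling rate (Hz)
t = np.arange(0.0, 60.0, 1.0 / fs)
rng = np.random.default_rng(0)
trace = np.sin(2 * np.pi * 1.8 * t) + 0.5 * rng.standard_normal(t.size)  # toy trace
for rate in (1.8, 2.6):                          # chunk rates used in the study
    print(f"relative power at {rate} Hz: {relative_power_at(trace, fs, rate):.4f}")
```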

    Speaker Normalization Using Cortical Strip Maps: A Neural Model for Steady-State Vowel Categorization

    Auditory signals of speech are speaker-dependent, but representations of language meaning are speaker-independent. The transformation from speaker-dependent to speaker-independent language representations enables speech to be learned and understood from different speakers. A neural model is presented that performs speaker normalization to generate a pitch-independent representation of speech sounds, while also preserving information about speaker identity. This speaker-invariant representation is categorized into unitized speech items, which input to sequential working memories whose distributed patterns can be categorized, or chunked, into syllable and word representations. The proposed model fits into an emerging model of auditory streaming and speech categorization. The auditory streaming and speaker normalization parts of the model both use multiple strip representations and asymmetric competitive circuits, thereby suggesting that these two circuits arose from similar neural designs. The normalized speech items are rapidly categorized and stably remembered by Adaptive Resonance Theory circuits. Simulations use synthesized steady-state vowels from the Peterson and Barney [J. Acoust. Soc. Am. 24, 175-184 (1952)] vowel database and achieve accuracy rates similar to those achieved by human listeners. These results are compared to behavioral data and other speaker normalization models. National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624)
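    The strip-map and ART circuits themselves are not reproduced here. As a minimal, hedged point of reference, the sketch below applies a conventional log-mean formant normalization (a common baseline, not the model in the abstract) to invented F1/F2 values, showing how a pure vocal-tract scaling between speakers is removed.

```python
# Hedged sketch: this is NOT the strip-map / ART model from the abstract. It only
# shows a conventional log-mean formant-normalization baseline of the kind that
# speaker normalization models are often compared with. All values are invented.
import numpy as np

def log_mean_normalize(formants_hz):
    """Subtract the speaker's mean log formant, removing overall vocal-tract scale."""
    logf = np.log(formants_hz)                   # shape: (n_tokens, n_formants)
    return logf - logf.mean()

# Toy F1/F2 (Hz) for the same vowel from two speakers whose vocal tracts differ
# by a pure 1.3x scaling.
speaker_a = np.array([[390.0, 2990.0]])
speaker_b = np.array([[300.0, 2300.0]])

print(log_mean_normalize(speaker_a))             # ~[-1.02, 1.02]
print(log_mean_normalize(speaker_b))             # identical after normalization
```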

    Speaker Normalization Using Cortical Strip Maps: A Neural Model for Steady-State Vowel Identification

    Auditory signals of speech are speaker-dependent, but representations of language meaning are speaker-independent. Such a transformation from speaker-dependent to speaker-independent representations enables speech to be understood from different speakers. A neural model is presented that performs speaker normalization to generate a pitch-independent representation of speech sounds, while also preserving information about speaker identity. This speaker-invariant representation is categorized into unitized speech items, which input to sequential working memories whose distributed patterns can be categorized, or chunked, into syllable and word representations. The proposed model fits into an emerging model of auditory streaming and speech categorization. The auditory streaming and speaker normalization parts of the model both use multiple strip representations and asymmetric competitive circuits, thereby suggesting that these two circuits arose from similar neural designs. The normalized speech items are rapidly categorized and stably remembered by Adaptive Resonance Theory circuits. Simulations use synthesized steady-state vowels from the Peterson and Barney [J. Acoust. Soc. Am. 24, 175-184 (1952)] vowel database and achieve accuracy rates similar to those achieved by human listeners. These results are compared to behavioral data and other speaker normalization models. National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624)

    Temporal Coding of Periodicity Pitch in the Auditory System: An Overview

    This paper outlines a taxonomy of neural pulse codes and reviews neurophysiological evidence for interspike interval-based representations of pitch and timbre in the auditory nerve and cochlear nucleus. Neural pulse codes can be divided into channel-based codes, temporal-pattern codes, and time-of-arrival codes. Timings of discharges in auditory nerve fibers reflect the time structure of acoustic waveforms, such that the interspike intervals that are produced precisely convey information concerning stimulus periodicities. Population-wide interspike interval distributions are constructed by summing together intervals from the observed responses of many single Type I auditory nerve fibers. Features in such distributions correspond closely with pitches that are heard by human listeners. The most common all-order interval present in the auditory nerve array almost invariably corresponds to the pitch frequency, whereas the relative fraction of pitch-related intervals amongst all others qualitatively corresponds to the strength of the pitch. Consequently, many diverse aspects of pitch perception are explained in terms of such temporal representations. Similar stimulus-driven temporal discharge patterns are observed in major neuronal populations of the cochlear nucleus. Population-interval distributions constitute an alternative time-domain strategy for representing sensory information that complements spatially organized sensory maps. Similar autocorrelation-like representations are possible in other sensory systems, in which neural discharges are time-locked to stimulus waveforms.
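    A minimal sketch of the population-interval idea is given below. Fibers are simulated as phase-locking to different harmonics of a 200 Hz complex tone with jitter and random spike failures (a toy assumption, not an auditory-nerve model); the most common pooled all-order interval then falls at the 5 ms pitch period.

```python
# Toy sketch of a pooled all-order interspike interval distribution: fibers lock
# to the 2nd, 3rd, or 4th harmonic of a 200 Hz complex tone, and the most common
# pooled interval corresponds to the 5 ms fundamental period (the pitch).
import numpy as np

def all_order_intervals(spike_times, max_interval):
    """All positive pairwise differences between spikes, up to max_interval (s)."""
    d = spike_times[:, None] - spike_times[None, :]
    d = d[d > 0]
    return d[d <= max_interval]

rng = np.random.default_rng(1)
f0, dur = 200.0, 0.5
pooled = []
for fiber in range(60):
    harmonic = 2 + fiber % 3                        # each fiber locks to 400/600/800 Hz
    t = np.arange(0.0, dur, 1.0 / (harmonic * f0))  # one candidate spike per cycle
    t = t[rng.random(t.size) < 0.6]                 # random spike failures
    t = t + rng.normal(0.0, 0.0002, t.size)         # phase-locking jitter
    # Intervals searched up to 8 ms here; a fuller model scans a wider lag range.
    pooled.append(all_order_intervals(np.sort(t), max_interval=0.008))

hist, edges = np.histogram(np.concatenate(pooled), bins=160, range=(0.0, 0.008))
period = edges[np.argmax(hist)] + (edges[1] - edges[0]) / 2     # peak bin centre
print(f"most common interval ~ {period * 1000:.2f} ms -> pitch ~ {1 / period:.0f} Hz")
```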

    The temporal pattern of impulses in primary afferents analogously encodes touch and hearing information

    An open question in neuroscience is the contribution of temporal relations between individual impulses in primary afferents to conveying sensory information. We investigated this question in touch and hearing, while looking for any shared coding scheme. In both systems, we artificially induced temporally diverse afferent impulse trains and probed the evoked perceptions in human subjects using psychophysical techniques. First, we investigated whether the temporal structure of a fixed number of impulses conveys information about the magnitude of tactile intensity. We found that clustering the impulses into periodic bursts elicited graded increases of intensity as a function of burst impulse count, even though fewer afferents were recruited throughout the longer bursts. The interval between successive bursts of peripheral neural activity (the burst-gap) has been demonstrated in our lab to be the most prominent temporal feature for coding skin vibration frequency, as opposed to either spike rate or periodicity. Second, given the similarities between the tactile and auditory systems, we explored the auditory system for an equivalent neural coding strategy. By using brief acoustic pulses, we showed that the burst-gap is a temporal code for pitch perception shared between the modalities. Following this evidence of parallels in temporal frequency processing, we next assessed the perceptual frequency equivalence between the two modalities, using auditory and tactile pulse stimuli with simple and complex temporal features in cross-sensory frequency discrimination experiments. Identical temporal stimulation patterns in tactile and auditory afferents produced equivalent perceived frequencies, suggesting an analogous temporal frequency computation mechanism. The new insights into encoding tactile intensity through clustering of fixed-charge electric pulses into bursts suggest a novel approach to conveying varying contact forces to neural interface users, requiring no modulation of either stimulation current or base pulse frequency. Increasing control of the temporal patterning of pulses in cochlear implant users might improve pitch perception and speech comprehension. The perceptual correspondence between touch and hearing not only suggests the possibility of establishing cross-modal comparison standards for robust psychophysical investigations, but also supports the plausibility of cross-sensory substitution devices.
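    The three candidate codes named above (spike rate, periodicity, and burst-gap) can be computed directly from a spike train. The sketch below uses an invented burst pattern and an arbitrary burst threshold purely to show how the three measures come apart when spikes are added to each burst at a fixed burst period.

```python
# Hedged sketch of the three candidate temporal codes compared in the abstract
# (spike rate, burst periodicity, burst-gap), computed for a toy burst pattern.
# The stimulus values are invented; the point is only how the measures differ.
import numpy as np

def candidate_codes(spike_times, burst_threshold=0.005):
    """Return (mean spike rate, burst repetition rate, mean burst-gap) for one train."""
    isis = np.diff(spike_times)
    rate = len(spike_times) / (spike_times[-1] - spike_times[0])
    gaps = isis[isis > burst_threshold]               # silent intervals between bursts
    burst_onsets = spike_times[np.r_[True, isis > burst_threshold]]
    periodicity = 1.0 / np.mean(np.diff(burst_onsets))
    return rate, periodicity, np.mean(gaps)

# 25 ms burst period; vary spikes per burst (1 vs 3) with 1 ms intra-burst spacing.
for spikes_per_burst in (1, 3):
    onsets = np.arange(0.0, 0.5, 0.025)
    train = np.concatenate([o + 0.001 * np.arange(spikes_per_burst) for o in onsets])
    rate, periodicity, gap = candidate_codes(np.sort(train))
    print(f"{spikes_per_burst} spikes/burst: rate={rate:.0f}/s, "
          f"burst rate={periodicity:.1f} Hz, burst-gap={gap * 1000:.1f} ms")
```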

    Neural Models of Subcortical Auditory Processing

    An important feature of the auditory system is its ability to distinguish many simultaneous sound sources. The primary goal of this work was to understand how a robust, preattentive analysis of the auditory scene is accomplished by the subcortical auditory system. Reasonably accurate modelling of the morphology and organisation of the relevant auditory nuclei was seen as being of great importance. The formulation of plausible models and their subsequent simulation was found to be invaluable in elucidating biological processes and in highlighting areas of uncertainty. In the thesis, a review of important aspects of mammalian auditory processing is presented and used as a basis for the subsequent modelling work. For each aspect of auditory processing modelled, psychophysical results are described and existing models reviewed, before the models used here are described and simulated. Auditory processes which are modelled include the peripheral system and the production of tonotopic maps of the spectral content of complex acoustic stimuli and of modulation frequency or periodicity. A model of the formation of sequential associations between successive sounds is described, and the model is shown to be capable of emulating a wide range of psychophysical behaviour. The grouping of related spectral components and the development of pitch perception is also investigated. Finally, a critical assessment of the work and ideas for future developments are presented. The principal contributions of this work are the further development of a model for pitch perception and the development of a novel architecture for the sequential association of those groups. In the process of developing these ideas, further insights into subcortical auditory processing were gained, and explanations for a number of puzzling psychophysical characteristics suggested. Royal Naval Engineering College, Manadon, Plymouth
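    As a rough illustration of one modelled stage, the sketch below builds a crude bandpass filter bank (a stand-in for cochlear filtering, not the thesis model), extracts each channel's envelope, and reports the dominant modulation frequency per channel, i.e. a coarse channel-by-periodicity map. The stimulus and filter shapes are invented.

```python
# Crude channel-by-periodicity map: bandpass channels stand in for cochlear
# filtering, and the dominant envelope modulation per channel is reported.
# Stimulus (two AM carriers) and filter parameters are illustrative assumptions.
import numpy as np
from scipy import signal

fs = 16000.0
t = np.arange(0.0, 0.5, 1.0 / fs)
stimulus = ((1 + 0.8 * np.sin(2 * np.pi * 80 * t)) * np.sin(2 * np.pi * 500 * t)
            + (1 + 0.8 * np.sin(2 * np.pi * 200 * t)) * np.sin(2 * np.pi * 4000 * t))

for cf in (500.0, 1000.0, 2000.0, 4000.0):                  # channel centre frequencies
    b, a = signal.butter(2, [cf / 1.3, cf * 1.3], btype="bandpass", fs=fs)
    channel = signal.filtfilt(b, a, stimulus)
    envelope = np.abs(signal.hilbert(channel))              # channel envelope
    env_spec = np.abs(np.fft.rfft(envelope - envelope.mean()))
    freqs = np.fft.rfftfreq(envelope.size, d=1.0 / fs)
    keep = (freqs > 20) & (freqs < 500)                     # modulation range of interest
    best_mod = freqs[keep][np.argmax(env_spec[keep])]
    print(f"{cf:5.0f} Hz channel: dominant modulation ~ {best_mod:.0f} Hz")
```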

    Effects of acoustic periodicity and intelligibility on the neural oscillations in response to speech

    Although several studies have investigated neural oscillations in response to acoustically degraded speech, it is still a matter of debate which neural frequencies reflect speech intelligibility. Part of the problem is that effects of acoustics and intelligibility have so far not been considered independently. In the current electroencephalography (EEG) study, the amount of acoustic periodicity (i.e. the amount of time the stimulus sentences were voiced) was manipulated, while the listeners’ spoken responses were used to control for differences in intelligibility. Firstly, the total EEG power changes in response to completely aperiodic (noise-vocoded) speech and to speech with a natural mix of periodicity and aperiodicity were almost identical, while an increase in theta power (5–6.3 Hz) and a trend towards less beta power (11–18 Hz) were observed in response to completely periodic speech. These two effects are taken to indicate an information-processing conflict caused by the unnatural acoustic properties of the stimuli, and to suggest that the subjects may have internally rehearsed the sentences as a result. Secondly, we separately investigated effects of intelligibility by sorting the trials in the periodic condition according to the listeners’ spoken responses. The comparison of intelligible and largely unintelligible trials revealed that the total EEG power in the delta band (1.7–2.7 Hz) was markedly increased during the second half of the intelligible trials, which suggests that delta oscillations are an indicator of successful speech understanding.
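    The band-power quantities referred to above (delta 1.7–2.7 Hz, theta 5–6.3 Hz, beta 11–18 Hz) can be computed from a single EEG channel as in the sketch below; the synthetic trace and Welch parameters are assumptions standing in for real data.

```python
# Simple sketch of the band-power measure discussed above: total power in the
# delta, theta, and beta ranges quoted in the abstract, computed from a single
# channel with Welch's method. The synthetic trace is a stand-in for real EEG.
import numpy as np
from scipy import signal

fs = 250.0
t = np.arange(0.0, 30.0, 1.0 / fs)
rng = np.random.default_rng(0)
eeg = np.sin(2 * np.pi * 2.2 * t) + 0.5 * rng.standard_normal(t.size)   # toy channel

freqs, psd = signal.welch(eeg, fs=fs, nperseg=int(4 * fs))   # 4 s windows -> 0.25 Hz bins
bands = {"delta (1.7-2.7 Hz)": (1.7, 2.7),
         "theta (5-6.3 Hz)": (5.0, 6.3),
         "beta (11-18 Hz)": (11.0, 18.0)}
df = freqs[1] - freqs[0]
for name, (lo, hi) in bands.items():
    band_power = psd[(freqs >= lo) & (freqs <= hi)].sum() * df   # integrate PSD over band
    print(f"{name}: {band_power:.3f} (arbitrary units)")
```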