Temporal Coding of Voice Pitch Contours in Mandarin Tones
Accurate perception of time-variant pitch is important for speech recognition, particularly for tonal languages such as Mandarin, in which different lexical tones convey different semantic information. Previous studies reported that the auditory nerve and cochlear nucleus can encode different pitches through phase-locked neural activity. However, little is known about how the inferior colliculus (IC) encodes the time-variant periodicity pitch of natural speech. In this study, the Mandarin syllable /ba/ pronounced with four lexical tones (flat, rising, falling-then-rising, and falling) was used as the stimulus set. Local field potentials (LFPs) and single-neuron activity were recorded simultaneously from 90 sites within the contralateral IC of six urethane-anesthetized and decerebrate guinea pigs in response to the four stimuli. Analysis of the temporal information of the LFPs showed that 93% exhibited robust encoding of periodicity pitch. Pitch strength of LFPs, derived from the autocorrelogram, was significantly (p < 0.001) stronger for rising tones than for flat and falling tones, and also increased significantly (p < 0.05) with characteristic frequency (CF). In contrast, only 47% (42 of 90) of single-neuron responses were significantly synchronized to the fundamental frequency of the stimulus, suggesting that the temporal spiking pattern of a single IC neuron does not robustly encode the time-variant periodicity pitch of speech. The difference between the number of LFPs and single neurons that encode the time-variant F0 voice pitch supports the notion of a transition at the level of the IC from direct temporal coding in the spike trains of individual neurons to other forms of neural representation.
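The autocorrelogram-based pitch-strength measure described above can be illustrated with a minimal sketch: periodicity strength is taken as the height of the largest normalized-autocorrelation peak within a candidate F0 lag range. This is a generic illustration of the technique, not the study's actual analysis pipeline; the function name and F0 search range are assumptions.

```python
import numpy as np

def pitch_strength_autocorr(lfp, fs, f0_min=75.0, f0_max=300.0):
    """Estimate periodicity pitch strength from a normalized autocorrelation.

    Returns (strength, f0_estimate): strength is the largest autocorrelation
    peak within the candidate F0 lag range, relative to the zero-lag value.
    """
    x = lfp - np.mean(lfp)
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]  # non-negative lags only
    ac /= ac[0]                                        # normalize so ac[0] == 1
    lag_min = int(fs / f0_max)                         # shortest candidate period
    lag_max = int(fs / f0_min)                         # longest candidate period
    peak_lag = lag_min + np.argmax(ac[lag_min:lag_max + 1])
    return ac[peak_lag], fs / peak_lag
```

For a strongly periodic trace the strength approaches 1; a sliding-window version of the same computation yields the autocorrelogram of a time-variant F0 contour.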
Temporal coding of the periodicity of monaural and binaural complex tones in the guinea pig auditory brainstem
Humans report a strong pitch percept in response to a complex tone – the sum of a series of harmonics – presented to either a single ear (‘monaurally’) or both ears (‘diotically’). Interspike-interval histograms of responses of neurons in the auditory system to monaural complex tones show a peak at the period of the pitch reported by humans – a ‘neural correlate of pitch’. However, the same pitch percept can be generated by presenting complexes with harmonics distributed across both ears (‘dichotically’). This requires combination of the neural signals underlying pitch from both sides of the auditory system, termed ‘binaural fusion’. Because temporal coding generally deteriorates along the auditory pathway, binaural fusion should occur at a relatively early stage; one of the prime candidates is the superior olivary complex (SOC).
Although the guinea pig auditory system has been extensively studied, this work is the first in vivo investigation of the guinea pig SOC. Cells of the lateral superior olive (LSO) show sensitivity to interaural level differences; medial superior olive (MSO) cells show sensitivity to interaural time differences. Additionally, cells with responses similar to the medial nucleus of the trapezoid body (MNTB) and superior paraolivary nucleus (SPN) of other species were found in the guinea pig SOC. Presumed MNTB cells showed a three-component spike waveform shape; presumed SPN cells responded at the offset of contralaterally-presented stimuli.
MSO and LSO cells respond to the overall pitch of complex tones even when the monaural waveforms presented to each ear differ, consistent with human perception. In contrast, cells of the ventral cochlear nucleus, which provide the main input to MSO and LSO cells, show no evidence of a binaural pitch response. In conclusion, SOC cells are able to encode the pitch of binaural complex tones in their spike timing patterns.
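The ‘neural correlate of pitch’ invoked in this abstract – an interspike-interval histogram peaking at the period of the reported pitch – can be sketched as follows. The all-order formulation, bin width, and maximum interval here are illustrative assumptions, not the thesis's exact method.

```python
import numpy as np

def isi_histogram(spike_times, max_interval=0.02, bin_width=0.0005):
    """All-order interspike-interval histogram from a spike train (seconds).

    For a phase-locked response to a complex tone, the histogram peaks at
    the pitch period 1/F0 (and at its integer multiples).
    """
    spikes = np.sort(np.asarray(spike_times))
    intervals = []
    for i, t in enumerate(spikes):
        later = spikes[i + 1:]                          # all later spikes, not
        intervals.extend(later[later - t <= max_interval] - t)  # just adjacent
    bins = np.arange(0, max_interval + bin_width, bin_width)
    counts, edges = np.histogram(intervals, bins=bins)
    return counts, edges
```

A binaural version of the analysis pools spikes driven by harmonics delivered to opposite ears and asks whether the histogram still peaks at the common F0 period.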
Neural coding of pitch cues in the auditory midbrain of unanesthetized rabbits
Pitch is an important attribute of auditory perception that conveys key features of music and speech and helps listeners extract useful information from complex auditory environments. Although the psychophysics of pitch perception has been studied extensively for over a century, the underlying neural mechanisms are still poorly understood. This thesis examines pitch cues in the inferior colliculus (IC), the core processing center in the mammalian auditory midbrain that relays and transforms convergent inputs from peripheral brainstem nuclei to the auditory cortex. Previous studies have shown that the IC can encode low-frequency fluctuations in stimulus envelope that are related to pitch, but most experiments were conducted in anesthetized animals, used stimuli that evoked only weak pitch sensations, and investigated a limited frequency range. Here, we used single-neuron recordings from the IC of normal-hearing, unanesthetized rabbits in response to a comprehensive set of complex auditory stimuli to explore the role of the IC in the neural processing of pitch. We characterized three neural codes for pitch cues: a temporal code for the stimulus envelope repetition rate (ERR) below 900 Hz, a rate code for ERR between 60 and 1600 Hz, and a rate-place code for frequency components individually resolved by the cochlea that is mainly available above 800 Hz. While the temporal code and the rate-place code are inherited from the auditory periphery, the rate code for ERR has not previously been characterized in processing stages prior to the IC. To help interpret our experimental findings, we used computational models to show that the IC rate code for ERR likely arises via temporal interaction of multiple synaptic inputs, and thus that the IC performs a temporal-to-rate code transformation between peripheral and cortical representations of pitch cues.
We also show that the IC rate-place code is robust across a 40 dB range of sound levels and is likely strengthened by inhibitory synaptic inputs. Together, these three codes could provide neural substrates for the pitch of stimuli with various temporal and spectral compositions over the entire frequency range.
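The temporal code for ERR described above is conventionally quantified by vector strength – the degree to which spikes cluster at a fixed phase of the envelope cycle (1 = perfect locking, 0 = none). A minimal sketch of this standard metric, not necessarily the thesis's exact analysis:

```python
import numpy as np

def vector_strength(spike_times, mod_freq):
    """Vector strength of spike synchronization to a modulation frequency.

    Each spike time (seconds) is mapped to a phase of the envelope cycle;
    the length of the mean resultant vector measures phase locking.
    """
    phases = 2 * np.pi * mod_freq * np.asarray(spike_times)
    return np.hypot(np.mean(np.cos(phases)), np.mean(np.sin(phases)))
```

A rate code, by contrast, is read out simply from the spike count as a function of ERR, which is why it can survive to higher ERRs than phase locking does.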
The neural representation and behavioral detection of frequency modulation
Understanding a speech signal relies on the ability of the auditory system to accurately encode rapidly changing spectral and temporal cues over time. Evidence from behavioral studies in humans suggests that relatively poor temporal fine structure (TFS) encoding ability is correlated with poorer performance on speech understanding tasks in quiet and in noise. Electroencephalography, including measurement of the frequency-following response (FFR), has been used to assess the human central auditory nervous system’s ability to encode temporal patterns in steady-state and dynamic tonal stimuli and short syllables. To date, the FFR has been used to investigate the accuracy of phase-locked auditory encoding of various stimuli; however, no study has demonstrated an FFR evoked by dynamic TFS contained in the modulating frequency content of a carrier tone. Furthermore, the relationship between a physiological representation of TFS encoding and either behavioral perception or speech-in-noise understanding has not been studied. The present study investigated the feasibility of eliciting FFRs in young, normal-hearing listeners using frequency-modulated (FM) tones, which contain dynamic TFS. Brainstem responses were compared to the behavioral detection of frequency modulation as well as speech-in-noise understanding. FFRs in response to FM tones were obtained from all listeners, indicating reliable measurement of TFS encoding within the brainstem. FFRs were more accurate at lower carrier frequencies and at shallower FM depths. FM detection ability was consistent with previously reported findings in normal-hearing listeners. In the present study, however, FFR accuracy was predictive of neither behavioral performance nor speech-in-noise understanding. Further investigation of brainstem encoding of TFS may reveal a stronger brain-behavior relationship across an age continuum.
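The FM-tone stimulus class used in this study can be sketched generically: a sinusoidally frequency-modulated tone whose phase is the integral of its instantaneous frequency. The function name and parameter values below are illustrative assumptions, not the study's stimulus specification.

```python
import numpy as np

def fm_tone(carrier_hz, mod_hz, depth_hz, dur_s, fs):
    """Sinusoidally frequency-modulated tone.

    Instantaneous frequency: f(t) = carrier + depth * sin(2*pi*mod*t).
    The phase is the integral of 2*pi*f(t), integrated in closed form.
    """
    t = np.arange(int(dur_s * fs)) / fs
    phase = (2 * np.pi * carrier_hz * t
             - (depth_hz / mod_hz) * np.cos(2 * np.pi * mod_hz * t))
    return np.sin(phase)
```

Shallower FM depths (smaller `depth_hz`) and lower carriers were the conditions under which the FFR tracked the stimulus TFS most accurately.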
Language experience enhances early cortical pitch-dependent responses
Pitch processing at cortical and subcortical stages of processing is shaped by language experience. We recently demonstrated that specific components of the cortical pitch response (CPR) index the more rapidly-changing portions of the high rising Tone 2 of Mandarin Chinese, in addition to marking pitch onset and sound offset. In this study, we examine how language experience (Mandarin vs. English) shapes the processing of different temporal attributes of pitch reflected in the CPR components using stimuli representative of within-category variants of Tone 2. Results showed that the magnitude of CPR components (Na–Pb and Pb–Nb) and the correlation between these two components and pitch acceleration were stronger for the Chinese listeners compared to English listeners for stimuli that fell within the range of Tone 2 citation forms. Discriminant function analysis revealed that the Na–Pb component was more than twice as important as Pb–Nb in grouping listeners by language affiliation. In addition, a stronger stimulus-dependent, rightward asymmetry was observed for the Chinese group at the temporal, but not frontal, electrode sites. This finding may reflect selective recruitment of experience-dependent, pitch-specific mechanisms in right auditory cortex to extract more complex, time-varying pitch patterns. Taken together, these findings suggest that long-term language experience shapes early sensory level processing of pitch in the auditory cortex, and that the sensitivity of the CPR may vary depending on the relative linguistic importance of specific temporal attributes of dynamic pitch.
The role of sound offsets in auditory temporal processing and perception
Recent neurobiological studies indicate that sound-offset responses are distinct from sound-onset responses in their underlying neural mechanisms, temporal processing pathways, and roles in auditory perception. In this work, I investigate the role of sound offsets and the effect of reduced sensitivity to offsets on auditory perception in humans. The implications of a 'sound-offset deficit' for speech-in-noise perception are investigated, based on a biologically motivated mathematical model with independent channels for onset and offset detection. Sound offsets are important in recognising, distinguishing and grouping sounds. They are also likely to play a role in perceiving consonants that lie in the troughs of amplitude fluctuations in speech. The influence of offsets on the discriminability of model outputs for 48 nonsense vowel-consonant-vowel (VCV) speech stimuli in varying levels of multi-talker babble noise (-12, -6, 0, 6, 12 dB SNR) was assessed, and led to predictions that correspond to known phonetic categories. This work therefore suggests that variability in offset salience alone can explain the rank order of the consonants most affected in noisy situations. A novel psychophysical test battery for offset sensitivity was devised and assessed, followed by a study to find an electrophysiological correlate. The findings suggest that individual differences in sound-offset sensitivity may be a factor contributing to inter-subject variation in speech-in-noise discrimination ability. The promising measures from these results can be used to test between-population differences in offset sensitivity, with more support for the objective than the psychophysical measures. In the electrophysiological study, offset responses in a duration-discrimination paradigm were found to be more strongly modulated by attention than onset responses.
Overall, this thesis shows for the first time that the onset-offset dichotomy in the auditory system, previously explored in physiological studies, is also evident in human studies for both simple and complex speech sounds.
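The independent onset and offset channels in the model described above can be sketched in a minimal form: half-wave-rectified rises and falls of the smoothed amplitude envelope. This is a generic illustration under simple assumptions (rectangular envelope smoothing, discrete-time slope), not the thesis's actual model.

```python
import numpy as np

def onset_offset_channels(signal, fs, win_s=0.01):
    """Split a sound into independent onset and offset channel outputs.

    The onset channel responds to rises of the amplitude envelope, the
    offset channel to falls; each is the half-wave-rectified slope of a
    moving-average-smoothed envelope.
    """
    env = np.abs(signal)
    w = max(1, int(win_s * fs))
    smoothed = np.convolve(env, np.ones(w) / w, mode="same")
    slope = np.diff(smoothed, prepend=smoothed[0]) * fs  # envelope slope (1/s)
    onset = np.maximum(slope, 0.0)    # nonzero only where envelope rises
    offset = np.maximum(-slope, 0.0)  # nonzero only where envelope falls
    return onset, offset
```

A reduced offset-channel gain in such a model weakens the cues marking the ends of amplitude troughs, which is how an offset deficit could selectively degrade consonant discrimination in babble noise.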
Physiology, Psychoacoustics and Cognition in Normal and Impaired Hearing
otorhinolaryngology; neurosciences; hearing