
    Predictive Top-Down Integration of Prior Knowledge during Speech Perception

    A striking feature of human perception is that our subjective experience depends not only on sensory information from the environment but also on our prior knowledge or expectations. The precise mechanisms by which sensory information and prior knowledge are integrated remain unclear, with longstanding disagreement concerning whether integration is strictly feedforward or whether higher-level knowledge influences sensory processing through feedback connections. Here we used concurrent EEG and MEG recordings to determine how sensory information and prior knowledge are integrated in the brain during speech perception. We manipulated listeners' prior knowledge of speech content by presenting matching, mismatching, or neutral written text before a degraded (noise-vocoded) spoken word. When speech conformed to prior knowledge, subjective perceptual clarity was enhanced. This enhancement in clarity was associated with a spatiotemporal profile of brain activity uniquely consistent with a feedback process: activity in the inferior frontal gyrus was modulated by prior knowledge before activity in lower-level sensory regions of the superior temporal gyrus. In parallel, we parametrically varied the level of speech degradation, and therefore the amount of sensory detail, so that changes in neural responses attributable to sensory information and prior knowledge could be directly compared. Although sensory detail and prior knowledge both enhanced speech clarity, they had opposite influences on the evoked response in the superior temporal gyrus. We argue that these data are best explained within the framework of predictive coding, in which sensory activity is compared with top-down predictions and only unexplained activity is propagated through the cortical hierarchy.
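    The predictive coding account invoked here can be illustrated with a toy computation. The sketch below is not the authors' model; it is a minimal, hypothetical two-level scheme (all names and values are illustrative) showing how an accurate top-down prediction, such as matching text, leaves less unexplained activity to propagate upward, consistent with the reduced superior temporal gyrus response.

```python
import numpy as np

def predictive_coding_step(sensory, prediction, lr=0.1):
    """One update in a toy two-level predictive coding scheme:
    the higher level predicts the sensory pattern, and only the
    residual (prediction error) is passed up and used to revise
    the prediction."""
    error = sensory - prediction          # unexplained sensory activity
    prediction = prediction + lr * error  # top-down estimate moves toward the input
    return prediction, error

rng = np.random.default_rng(0)
sensory = rng.normal(size=8)                              # toy degraded-speech input
priors = {"matching text": sensory + rng.normal(scale=0.1, size=8),
          "neutral text": np.zeros(8)}                    # hypothetical conditions

for label, prior in priors.items():
    _, err = predictive_coding_step(sensory, prior)
    # An accurate prior leaves a much smaller error signal to propagate.
    print(label, "-> error magnitude:", round(float(np.linalg.norm(err)), 2))
```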

    Does training with amplitude modulated tones affect tone-vocoded speech perception?

    Temporal-envelope cues are essential for successful speech perception. We asked here whether training on stimuli containing temporal-envelope cues without speech content can improve the perception of spectrally degraded (vocoded) speech in which the temporal envelope (but not the temporal fine structure) is mainly preserved. Two groups of listeners were trained on different amplitude-modulation (AM) based tasks, either AM detection or AM-rate discrimination (21 blocks of 60 trials over two days, 1260 trials in total; AM rates of 4 Hz, 8 Hz, and 16 Hz), while an additional control group did not undertake any training. Consonant identification in vocoded vowel-consonant-vowel stimuli was tested before and after training on the AM tasks (or at an equivalent time interval for the control group). Following training, only the trained groups showed a significant improvement in the perception of vocoded speech, but the improvement did not significantly differ from that observed for controls. Thus, we find no convincing evidence that this amount of training with temporal-envelope cues without speech content provides a significant benefit for vocoded speech intelligibility. Alternative training regimens using vocoded speech along the linguistic hierarchy should be explored.
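    For readers unfamiliar with vocoding, the following is a minimal sketch of a tone vocoder of the kind described here: the signal is split into frequency bands, the temporal envelope of each band is extracted, and the temporal fine structure is replaced by a sine carrier at the band centre. Channel count, filter order, and band edges are illustrative assumptions, not the study's parameters.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def tone_vocode(speech, fs, n_channels=4, f_lo=100.0, f_hi=6000.0):
    """Tone-vocode a speech waveform: keep each band's temporal
    envelope, discard its temporal fine structure.  fs must exceed
    2 * f_hi for the band-pass filters to be valid."""
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)  # log-spaced band edges
    t = np.arange(len(speech)) / fs
    out = np.zeros(len(speech))
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
        envelope = np.abs(hilbert(sosfiltfilt(sos, speech)))  # per-band temporal envelope
        carrier = np.sin(2 * np.pi * np.sqrt(lo * hi) * t)    # tone at the band centre
        out += envelope * carrier
    return out / np.max(np.abs(out))                          # normalise peak level
```

    Replacing the sine carrier with band-limited noise gives the noise-vocoded speech used in several of the other studies listed here.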

    Retained capacity for perceptual learning of degraded speech in primary progressive aphasia and Alzheimer's disease

    This work was supported by the Alzheimer’s Society (AS-PG-16-007), the National Institute for Health Research University College London Hospitals Biomedical Research Centre, the UCL Leonard Wolfson Experimental Neurology Centre (PR/ylr/18575) and the Economic and Social Research Council (ES/K006711/1). Individual authors were supported by the Medical Research Council (PhD Studentship to CJDH and RLB; MRC Clinician Scientist Fellowship to JDR), the Wolfson Foundation (Clinical Research Fellowship to CRM), Alzheimer’s Research UK (ART-SRF2010-3 to SJC) and the Wellcome Trust (091673/Z/10/Z to JDW).

    Human Decision Making Based on Variations in Internal Noise: An EEG Study

    Perceptual decision making is prone to errors, especially near threshold. Physiological, behavioural, and modelling studies suggest this is due to the intrinsic or ‘internal’ noise in neural systems, which derives from a mixture of bottom-up and top-down sources. We show here that internal noise can form the basis of perceptual decision making when the external signal lacks the information required for the decision. We recorded electroencephalographic (EEG) activity in listeners attempting to discriminate between identical tones. Since the acoustic signal was constant, bottom-up and top-down influences were under experimental control. We found that early cortical responses to the identical stimuli varied in global field power and topography according to the perceptual decision made, and that activity preceding stimulus presentation could predict both later activity and the behavioural decision. Our results suggest that activity variations induced by internal noise of both sensory and cognitive origin are sufficient to drive discrimination judgments.
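    Global field power, one of the measures reported above, has a simple standard definition: the spatial standard deviation across electrodes at each time point. A minimal sketch, assuming a channels-by-samples array of average-referenced EEG (the array shape and sampling rate below are illustrative):

```python
import numpy as np

def global_field_power(eeg):
    """Global field power: the standard deviation across electrodes
    at each time point, for a (channels x samples) array.  On
    average-referenced data this equals the spatial RMS."""
    return eeg.std(axis=0)

# Hypothetical example: 64 channels, 1 s of data at 500 Hz
rng = np.random.default_rng(1)
epoch = rng.normal(size=(64, 500))
gfp = global_field_power(epoch)   # one value per time sample
```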

    Top-down influences of written text on perceived clarity of degraded speech

    An unresolved question is how the reported clarity of degraded speech is enhanced when listeners have prior knowledge of speech content. One account of this phenomenon proposes top-down modulation of early acoustic processing by higher-level linguistic knowledge. Alternative, strictly bottom-up accounts argue that acoustic information and higher-level knowledge are combined at a late decision stage without modulating early acoustic processing. Here we tested top-down and bottom-up accounts using written text to manipulate listeners’ knowledge of speech content. The effect of written text on the reported clarity of noise-vocoded speech was most pronounced when text was presented before (rather than after) speech (Experiment 1). Fine-grained manipulation of the onset asynchrony between text and speech revealed that this effect declined when text was presented more than 120 ms after speech onset (Experiment 2). Finally, the influence of written text was found to arise from phonological (rather than lexical) correspondence between text and speech (Experiment 3). These results suggest that prior knowledge effects are time-limited by the duration of auditory echoic memory for degraded speech, consistent with top-down modulation of early acoustic processing by linguistic knowledge.

    Neural oscillations track changes in speech rate shown by MEG adaptation and perceptual after-effects

    Typical speech rates in conversation or broadcast media are around 150 to 200 words per minute. Yet human listeners show an impressive degree of perceptual flexibility such that, with practice, they can understand speech presented at up to three times that rate (Dupoux & Green, 1997, JEP:HPP). However, exposure to time-compressed speech also leads to a perceptual after-effect: normal speech sounds unnaturally slow immediately after listening to time-compressed speech. Both effects can be readily experienced using software built into most podcast players, yet the underlying functional and neural mechanisms remain unspecified. In this work, we use behavioural and MEG experiments to explore the perceptual and neural processes that support speech-rate adaptation and after-effects, and we test whether and how these effects might arise from changes in the delta and theta oscillations in the superior temporal gyrus that track connected speech.

    In two behavioural experiments, we first quantified the magnitude of the perceptual after-effect for 14 native English speakers listening to feature podcasts from The Guardian (@guardianaudio). These experiments confirmed that: (1) participants report that speech at a natural rate sounds slower than normal after exposure to fast speech (50% time compression), and conversely that exposure to slowed speech (150% time expansion) leads listeners to report that natural speech sounds faster than normal; and (2) both after-effects depend on the duration of the adaptation period, with larger and longer-lasting after-effects observed after 60 seconds of exposure to time-compressed or expanded speech than after 20 seconds.

    We also explored the neural correlates of these perceptual adaptation and after-effects using MEG recordings from 16 native English listeners. During an initial 60-second period of natural speech, we observed cluster-corrected significant cerebro-acoustic coherence (cf. Peelle, Gross & Davis, 2013, Cerebral Cortex) between auditory MEG responses and the amplitude envelope of speech in the delta (0.1-3.2 Hz) and theta (4.7-8.2 Hz) ranges. During 40-second periods of adaptation to 60% time-compressed and 167% time-expanded speech, we saw significant increases (for 60% speech) and decreases (for 167% speech) in the peak frequency of delta, but not theta, entrainment. These effects build up over time, as shown by a significant time window (0-20 s vs 20-40 s) by time-compression/expansion (60% vs 167%) interaction on the magnitude (F(2,30) = 45.81, p < .001) and peak frequency (F(2,30) = 6.96, p < .01) of delta coherence. However, changes in the peak frequency of cerebro-acoustic coherence were smaller than the degree of compression or expansion applied to the speech, suggesting a limit on the flexibility with which neural oscillations can entrain to speech at different rates, despite speech remaining fully intelligible throughout.

    Although perceptual after-effects were pronounced in this group of listeners (confirmed by post-MEG behavioural data), these after-effects were not associated with any reliable change in the magnitude or frequency of cerebro-acoustic coherence. We are currently analysing multivariate temporal receptive fields (cf. Crosse et al., 2016, Frontiers Hum Neurosci) to determine whether differences in the timing of oscillatory entrainment are linked to perceptual adaptation or after-effects. These findings have implications for oscillatory accounts of speech perception and comprehension, which will be discussed.
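    Cerebro-acoustic coherence of the kind measured here can be sketched as the magnitude-squared coherence between a neural channel and the broadband amplitude envelope of the speech, averaged within each band of interest. The following is an illustrative reimplementation, not the authors' analysis pipeline; it assumes both signals have already been resampled to a common rate, and the segment length is a convenient choice rather than a reported parameter.

```python
import numpy as np
from scipy.signal import coherence, hilbert

BANDS = {"delta": (0.1, 3.2), "theta": (4.7, 8.2)}  # bands reported above (Hz)

def cerebro_acoustic_coherence(neural, audio, fs):
    """Magnitude-squared coherence between one neural channel and the
    speech amplitude envelope, averaged within each frequency band.
    Both inputs are 1-D arrays sampled at the same rate fs."""
    envelope = np.abs(hilbert(audio))  # broadband amplitude envelope
    # Long segments (~10 s) give the ~0.1 Hz resolution needed for delta
    f, cxy = coherence(neural, envelope, fs=fs, nperseg=int(10 * fs))
    return {name: float(cxy[(f >= lo) & (f <= hi)].mean())
            for name, (lo, hi) in BANDS.items()}
```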