
    Investigating the Neural Basis of Audiovisual Speech Perception with Intracranial Recordings in Humans

    Speech is inherently multisensory, containing auditory information from the voice and visual information from the mouth movements of the talker. Hearing the voice is usually sufficient to understand speech; however, in noisy environments or when audition is impaired due to aging or disability, seeing mouth movements greatly improves speech perception. Although behavioral studies have firmly established this perceptual benefit, it is still not clear how the brain processes visual information from mouth movements to improve speech perception. To clarify this issue, I studied neural activity recorded from the brain surface of human subjects using intracranial electrodes, a technique known as electrocorticography (ECoG). First, I studied responses to noisy speech in the auditory cortex, specifically in the superior temporal gyrus (STG). Previous studies identified the anterior parts of the STG as unisensory, responding only to auditory stimuli. The posterior parts of the STG, on the other hand, are known to be multisensory, responding to both auditory and visual stimuli, which makes them a key region for audiovisual speech perception. I examined how these different parts of the STG respond to clear versus noisy speech. I found that noisy speech decreased the amplitude and increased the across-trial variability of the response in the anterior STG. However, possibly due to its multisensory composition, the posterior STG was not as sensitive to auditory noise as the anterior STG and responded similarly to clear and noisy speech. I also found that these two response patterns in the STG were separated by a sharp boundary demarcated by the posterior-most portion of Heschl’s gyrus. Second, I studied responses to silent speech in the visual cortex. Previous studies demonstrated that the visual cortex shows response enhancement when the auditory component of speech is noisy or absent; however, it was not clear which regions of the visual cortex specifically show this enhancement, or whether it results from top-down modulation by a higher-order region. To test this, I first mapped the receptive fields of different regions in the visual cortex and then measured their responses to visual (silent) and audiovisual speech stimuli. I found that visual regions with central receptive fields show greater response enhancement to visual speech, possibly because these regions receive more visual information from mouth movements. I found similar response enhancement to visual speech in the frontal cortex, specifically in the inferior frontal gyrus, premotor cortex, and dorsolateral prefrontal cortex, which have been implicated in speechreading in previous studies. Finally, I showed that these frontal regions display strong functional connectivity with the visual regions that have central receptive fields during speech perception.
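
    The amplitude and across-trial variability measures described above can be illustrated with a short analysis sketch. The Python snippet below is a minimal, hypothetical example rather than the thesis's actual pipeline: it assumes high-gamma power has already been extracted per trial for one electrode, and it uses the coefficient of variation as the variability measure; both choices are assumptions on my part.

```python
import numpy as np

def response_stats(high_gamma, onset_idx, offset_idx):
    """Summarize evoked high-gamma responses for one electrode.

    high_gamma : array (n_trials, n_samples) of high-gamma power,
                 assumed already expressed relative to a pre-stimulus baseline.
    Returns the mean response amplitude and the across-trial variability
    (coefficient of variation of the single-trial amplitudes).
    """
    # Single-trial amplitude: mean power within the response window.
    trial_amp = high_gamma[:, onset_idx:offset_idx].mean(axis=1)
    mean_amp = trial_amp.mean()
    # Across-trial variability as the coefficient of variation.
    cv = trial_amp.std(ddof=1) / mean_amp
    return mean_amp, cv

# Hypothetical comparison of clear vs. noisy speech trials for one STG electrode.
rng = np.random.default_rng(0)
clear_trials = rng.normal(1.0, 0.2, size=(40, 500))  # placeholder data
noisy_trials = rng.normal(0.6, 0.3, size=(40, 500))  # placeholder data
for label, trials in [("clear", clear_trials), ("noisy", noisy_trials)]:
    amp, cv = response_stats(trials, onset_idx=100, offset_idx=400)
    print(f"{label}: amplitude={amp:.2f}, across-trial CV={cv:.2f}")
```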

    The Role of Speech Production System in Audiovisual Speech Perception

    Seeing the articulatory gestures of the speaker significantly enhances speech perception. Findings from recent neuroimaging studies suggest that activation of the speech motor system during lipreading enhances speech perception by tuning, in a top-down fashion, speech-sound processing in the superior aspects of the posterior temporal lobe. Anatomically, the superior-posterior temporal lobe areas receive connections from the auditory, visual, and speech motor cortical areas. Thus, it is possible that neuronal receptive fields are shaped during development to respond to speech-sound features that coincide with visual and motor speech cues, in contrast with the anterior/lateral temporal lobe areas that might process speech sounds predominantly based on acoustic cues. The superior-posterior temporal lobe areas have also been consistently associated with auditory spatial processing. Thus, the involvement of these areas in audiovisual speech perception might partly be explained by the spatial processing required when associating sounds, seen articulations, and one’s own motor movements. Tentatively, it is possible that the anterior “what” and posterior “where/how” auditory cortical processing pathways are parts of an interacting network, the instantaneous state of which determines what one ultimately perceives, as potentially reflected in the dynamics of oscillatory activity.

    Cortical mechanisms of seeing and hearing speech

    In face-to-face communication, speech is perceived through both the eyes and the ears: the talker's articulatory gestures are seen and the speech sounds are heard simultaneously. Whilst acoustic speech can often be understood without visual information, viewing articulatory gestures aids hearing substantially in noisy conditions. On the other hand, speech can be understood, to some extent, by solely viewing articulatory gestures (i.e., by speechreading). In this thesis, electroencephalography (EEG), magnetoencephalography (MEG) and functional magnetic resonance imaging (fMRI) were utilized to disclose the cortical mechanisms of seeing and hearing speech. One of the major challenges of modern cognitive neuroscience is to find out how the brain integrates inputs from different senses. In this thesis, the integration of seen and heard speech was investigated using EEG and MEG. Multisensory interactions were found in the sensory-specific cortices at early latencies and in the multisensory regions at late latencies. Viewing another person's actions activates regions belonging to the human mirror neuron system (MNS), which are also activated when subjects themselves perform actions. Possibly, the human MNS enables simulation of another person's actions, which might also be important for speech recognition. In this thesis, it was demonstrated with MEG that seeing speech modulates activity in the mouth region of the primary somatosensory cortex (SI), suggesting that the SI cortex is also involved in simulating another person's articulatory gestures during speechreading. The question of whether there are speech-specific mechanisms in the human brain has been under scientific debate for decades. In this thesis, evidence for a speech-specific neural substrate in the left posterior superior temporal sulcus (STS) was obtained using fMRI. Activity in this region was found to be greater when subjects heard acoustic sine wave speech stimuli as speech than when they heard the same stimuli as non-speech.
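
    A common way to quantify audiovisual interactions in evoked EEG/MEG responses is the additive-model comparison, in which the bimodal response is tested against the sum of the unimodal responses (AV vs. A + V). The sketch below illustrates that generic comparison; it is not necessarily the specific analysis used in this thesis, and the array names, latency window and statistical test are illustrative assumptions.

```python
from scipy import stats

def av_interaction(av, a, v, t0, t1):
    """Additive-model test: does AV differ from A + V in a latency window?

    av, a, v : arrays (n_subjects, n_samples) with evoked responses for the
               audiovisual, auditory-only and visual-only conditions
               (assumed baseline-corrected and time-aligned).
    t0, t1   : sample indices bounding the latency window of interest.
    """
    interaction = av - (a + v)                        # per subject, per sample
    window_mean = interaction[:, t0:t1].mean(axis=1)  # mean within the window
    t_stat, p = stats.ttest_1samp(window_mean, 0.0)   # deviation from additivity
    return window_mean.mean(), t_stat, p
```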

    Voice and speech perception in autism: a systematic review

    Autism spectrum disorders (ASD) are characterized by persistent impairments in social communication and interaction and by restricted, repetitive behavior. In the original description of autism by Kanner (1943), the presence of emotional impairments was already emphasized (self-absorbed, emotionally cold, distanced, and retracted). However, little research has focused on the auditory perception of vocal emotional cues; audiovisual comprehension has been explored far more commonly. Like faces, voices play an important role in the social interaction contexts in which individuals with ASD show impairments. The aim of the current systematic review was to integrate evidence from behavioral and neurobiological studies for a more comprehensive understanding of voice processing abnormalities in ASD. Among the different types of information that the human voice may provide, we hypothesize particular deficits in the processing of vocal affect information by individuals with ASD. The relationship between impairments in processing vocal stimuli and disrupted Theory of Mind in autism is discussed. Moreover, because ASD are characterized by deficits in social reciprocity, the abnormal oxytocin system in individuals with ASD is further discussed as a possible biological marker for abnormal vocal affect information processing and social interaction skills in the ASD population.

    The neurobiology of speech perception decline in aging

    Speech perception difficulties are common among older adults, yet the underlying neural mechanisms are still poorly understood. New empirical evidence suggesting that brain senescence may be an important contributor to these difficulties has challenged the traditional view that peripheral hearing loss is the main factor in their aetiology. Here we investigated the relationship between structural and functional brain senescence and speech perception skills in aging. Following audiometric evaluations, participants underwent MRI while performing a speech perception task at different intelligibility levels. As expected, speech perception declined with age, even after controlling for hearing sensitivity using an audiological measure (pure tone averages) and a bioacoustical measure (DPOAE recordings). Our results reveal that the core speech network, centered on the supratemporal cortex and ventral motor areas bilaterally, decreased in spatial extent in older adults. Importantly, our results also show that speech skills in aging are affected by changes in cortical thickness and in brain functioning. Age-independent intelligibility effects were found in several motor and premotor areas, including the left ventral premotor cortex and the right SMA. Age-dependent intelligibility effects were also found, mainly in sensorimotor cortical areas and in the left dorsal anterior insula. In this region, changes in BOLD signal modulated the relationship between age and speech perception skills, suggesting a role for this region in maintaining speech perception at older ages. These results provide important new insights into the neurobiology of speech perception in aging.
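
    The pure tone average (PTA) used above to control for hearing sensitivity is simply the mean audiometric threshold over a fixed set of test frequencies, commonly 0.5, 1, 2 and sometimes 4 kHz. Below is a minimal sketch; the study does not state its exact frequency set here, so the default frequencies and the example audiogram are assumptions.

```python
def pure_tone_average(thresholds_db_hl, freqs=(500, 1000, 2000, 4000)):
    """Mean audiometric threshold (dB HL) over the chosen test frequencies.

    thresholds_db_hl : dict mapping frequency in Hz to threshold in dB HL.
    """
    return sum(thresholds_db_hl[f] for f in freqs) / len(freqs)

# Hypothetical audiogram for one ear.
right_ear = {250: 15, 500: 20, 1000: 25, 2000: 30, 4000: 45, 8000: 60}
print(pure_tone_average(right_ear))  # (20 + 25 + 30 + 45) / 4 = 30.0 dB HL
```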

    On Experiencing Meaning: Irreducible Cognitive Phenomenology and Sinewave Speech

    Upon first hearing sinewave speech, all that can be discerned are beeps and whistles. But after hearing the original speech, the beeps and whistles sound like speech. The difference between these two episodes undoubtedly involves an alteration in phenomenal character. O’Callaghan (2011) argues that this alteration is non-sensory, but he leaves open the possibility of attributing it to some other source, e.g., cognition. I discuss whether the alteration in phenomenal character involved in sinewave speech provides evidence for cognitive phenomenology. I defend both the existence of cognitive phenomenology and the phenomenal contrast method, as each concerns the case presented here.
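
    Sinewave speech of the kind discussed here is conventionally synthesized by replacing the speech signal with a few time-varying sinusoids that track the center frequencies and amplitudes of the formants. The sketch below shows the general idea under simplifying assumptions: the formant tracks are placeholder ramps rather than values measured from a real utterance, and real stimuli would derive and interpolate the tracks from formant analysis of the original recording.

```python
import numpy as np

def sinewave_speech(formant_freqs, formant_amps, fs=16000):
    """Sum of sinusoids tracking formant center frequencies.

    formant_freqs : array (n_formants, n_samples) of frequencies in Hz.
    formant_amps  : array (n_formants, n_samples) of linear amplitudes.
    Tracks are assumed to already be at the audio sample rate.
    """
    # Integrate instantaneous frequency to get each sinusoid's phase.
    phase = 2 * np.pi * np.cumsum(formant_freqs, axis=1) / fs
    signal = (formant_amps * np.sin(phase)).sum(axis=0)  # sum the sinusoids
    return signal / np.abs(signal).max()                 # peak-normalize

# Placeholder tracks: three "formants" gliding over one second of audio.
fs = 16000
t = np.linspace(0.0, 1.0, fs)
freqs = np.vstack([500 + 200 * t, 1500 - 300 * t, 2500 + 100 * t])
amps = np.vstack([np.ones(fs), 0.5 * np.ones(fs), 0.25 * np.ones(fs)])
audio = sinewave_speech(freqs, amps, fs)
```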

    Specialization along the Left Superior Temporal Sulcus for Auditory Categorization

    The affinity and temporal course of functional fields in middle and posterior superior temporal cortex for the categorization of complex sounds were examined using functional magnetic resonance imaging (fMRI) and event-related potentials (ERPs) recorded simultaneously. Data were compared before and after subjects were trained to categorize a continuum of unfamiliar nonphonemic auditory patterns with speech-like properties (NP) and a continuum of familiar phonemic patterns (P). fMRI activation for NP increased after training in the left posterior superior temporal sulcus (pSTS). The ERP P2 response to NP also increased with training, and its scalp topography was consistent with left posterior superior temporal generators. In contrast, the left middle superior temporal sulcus (mSTS) showed fMRI activation only for P, and this response was not affected by training. The P2 response to P was also independent of training, and its estimated source was more anterior in left superior temporal cortex. The results are consistent with a role for the left pSTS in the short-term representation of relevant sound features that provide the basis for identifying newly acquired sound categories, whereas categorization of highly familiar phonemic patterns is mediated by long-term representations in the left mSTS. The results provide new insight regarding the function of the ventral and dorsal auditory streams.
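
    As a simple illustration of the ERP measure reported above, the P2 response is typically quantified as the peak (or mean) amplitude of the trial-averaged waveform within a post-stimulus window of roughly 150-250 ms. The sketch below is a generic example of that computation; the window, single-channel handling and units are assumptions rather than the study's parameters.

```python
import numpy as np

def p2_amplitude(epochs, times, window=(0.150, 0.250)):
    """Peak amplitude of the trial-averaged ERP within the P2 latency window.

    epochs : array (n_trials, n_samples) for one channel, in microvolts.
    times  : array (n_samples,) of sample times in seconds (0 = stimulus onset).
    """
    erp = epochs.mean(axis=0)                           # average across trials
    mask = (times >= window[0]) & (times <= window[1])  # P2 latency window
    return erp[mask].max()                              # positive-going peak
```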

    Categorical representations of phonemic vowels investigated with fMRI

    The present thesis investigates the sensitivity of the human auditory cortex (AC) to the contrast between prototype and nonprototype vowels as well as between phonemic and nonphonemic vowels. Activations to vowels were measured with functional magnetic resonance imaging (fMRI), which was also used to analyze the effect of categorical processing on modulations in AC and the adjacent inferior parietal lobule (IPL) observed during active listening tasks. A prominent theoretical view suggests that native phonemic vowels (i.e., phonemes) are represented in the human brain as categories organized around a best representative of the category (i.e., the phoneme prototype). This view predicts systematic differences in the neural representations and processing of phoneme prototypes, nonprototypes and nonphonemic vowels. In three separate studies, subjects were presented with vowel pairs and visual stimuli during demanding auditory and visual tasks. Study I compared activations to prototypical and nonprototypical vowels, whereas Study II focused on the contrast between phonemic and nonphonemic vowels. Study II also tested whether activations in IPL during a categorical vowel memory task depend on whether the task is performed on phonemic (easy to categorize) or nonphonemic (harder to categorize) vowels. Study III was designed to replicate the key findings of Studies I and II. Further, Study III compared activations to identical vowels presented during a number of different task conditions requiring analysis of the acoustical or categorical differences between the vowels. The results of this thesis are in line with the general theoretical view that phonemic vowels are represented in a categorical manner in the human brain. Studies I–III showed that information about categorical vowel representations is present in human AC during active listening tasks. Areas of IPL, in turn, were implicated in general operations on categorical representations rather than in the categorization of speech sounds as such. Further, the present results demonstrate that task-dependent activations in AC and adjacent IPL strongly depend on whether the task requires analysis of the acoustical or categorical features of the vowels. It is important to note that, in the present studies, surprisingly small differences in the characteristics of the vowel stimuli or the tasks performed on these vowels resulted in significant and widespread activation differences in AC and adjacent regions. As the key findings of Studies I and II were also quite successfully replicated in Study III, these results highlight the importance of carefully controlled experiments and replications in fMRI research.

    According to the prevailing theory, the representations of native phonemic vowels (i.e., phonemes) are categorical in nature: speech sounds are organized around the best exemplar of the category (i.e., the phoneme prototype). The theory predicts that the representations of phoneme prototypes, nonprototypes and nonphonemic vowels, and their processing in the brain, differ from one another. This thesis uses functional magnetic resonance imaging (fMRI) to examine whether activity in the human auditory cortex and adjacent regions differs during the processing of prototypical versus nonprototypical and phonemic versus nonphonemic vowels, and how active listening tasks affect these differences. In the three studies of the thesis, subjects were presented with vowel pairs and visual stimuli during listening and viewing tasks. In Study I, the vowel pairs consisted of prototypical and nonprototypical phonemic vowels. Study II, in turn, examined the difference between the processing of phonemic and nonphonemic vowels. Study II also investigated whether task-dependent activation in the auditory cortex and adjacent regions depends on whether the listening task is performed with easily categorized (phonemic) or harder-to-categorize (nonphonemic) vowels. The final study (III) replicated the main results of Studies I and II. Study III also examined how activation during a sound discrimination task differs depending on whether the vowels are discriminated on the basis of their acoustic or categorical properties. The results obtained in these studies support the assumption that the representations of phonemic vowels are categorical in nature. They show that information about vowel category representations is available in the auditory cortex during active listening tasks. In addition, activation in the inferior parietal lobule increased during tasks that required processing of categorical information; however, activation of these regions does not appear to be related to category formation (categorization of speech sounds) as such. Notably, in Studies I–III, seemingly small differences in the vowel stimuli and listening tasks led to substantial activation differences in the auditory cortex and nearby brain regions, underscoring the importance of carefully controlled experimental designs and, in particular, replication studies in fMRI research.

    Orienting asymmetries in dogs’ responses to different communicatory components of human speech

    It is well established that in human speech perception the left hemisphere (LH) of the brain is specialized for processing intelligible phonemic (segmental) content (e.g., [1–3]), whereas the right hemisphere (RH) is more sensitive to prosodic (suprasegmental) cues [4, 5]. Despite evidence that a range of mammal species show LH specialization when processing conspecific vocalizations [6], the presence of hemispheric biases in domesticated animals’ responses to the communicative components of human speech has never been investigated. Human speech is familiar and relevant to domestic dogs (Canis familiaris), who are known to perceive both segmental phonemic cues [7–10] and suprasegmental speaker-related [11, 12] and emotional [13] prosodic cues. Using the head-orienting paradigm, we presented dogs with manipulated speech and tones differing in segmental or suprasegmental content and recorded their orienting responses. We found that dogs showed a significant LH bias when presented with a familiar spoken command in which the salience of meaningful phonemic (segmental) cues was artificially increased, but a significant RH bias in response to commands in which the salience of intonational or speaker-related (suprasegmental) vocal cues was increased. Our results provide insights into mechanisms of interspecific vocal perception in a domesticated mammal and suggest that dogs may share ancestral or convergent hemispheric specializations for processing the different functional communicative components of speech with human listeners.
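
    In the head-orienting paradigm, a hemispheric bias is inferred from the direction of the first head turn when the stimulus is presented simultaneously from both sides: right turns are conventionally taken to index left-hemisphere engagement and left turns right-hemisphere engagement. The sketch below shows one way such a bias could be quantified and tested against chance; the counts and the use of a binomial test are illustrative assumptions, not the study's reported analysis.

```python
from scipy import stats

def orienting_bias(n_left, n_right):
    """Head-orienting laterality index and a two-sided binomial test.

    Returns an index in [-1, 1] (positive = more right turns, i.e. a
    left-hemisphere bias under the usual interpretation) and the p-value
    against a 50/50 null.
    """
    n = n_left + n_right
    index = (n_right - n_left) / n
    p = stats.binomtest(n_right, n, 0.5).pvalue
    return index, p

# Hypothetical counts for two stimulus manipulations.
print(orienting_bias(n_left=9, n_right=21))   # e.g. segmental cues enhanced
print(orienting_bias(n_left=22, n_right=8))   # e.g. suprasegmental cues enhanced
```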