85 research outputs found

    Uncertainty and attention in audiovisual speech perception

    This study deals with uncertainty and attention in audiovisual speech perception. Subjects were exposed to audiovisual stimuli under a bivalent independent variable: presentation was either blocked to one ear, or there was uncertainty about which ear the next stimulus would be presented to. The hypothesis that uncertainty would inhibit audiovisual integration was motivated by an earlier study (Öhrström et al., 2011), but its results were not confirmed in this experiment. However, audiovisual stimuli presented in the final parts of the session evoked less visual influence than those in the first parts. This negative correlation may have two explanations: (1) fatigue at the end of the session leaves fewer available attentional resources, which inhibits integration; (2) integration involves a late integration process in which basically auditory information is stored throughout the session. This may have implications for studies in audiovisual speech perception.
    Background: As has been known for a long time, the visual signal plays an additive role in speech comprehension, especially in noisy conditions (Sumby and Pollack, 1954; Erber, 1969). The seminal work by McGurk and MacDonald (1976) showed that visual information is an important contributor to speech perception even when the sound is not polluted by noise. In their study, a face articulating /gaga/ synchronized with an auditorily presented /baba/ was perceived as /dada/, i.e., a fusion of the two signals. In the reversed situation, auditory /gaga/ together with visual /baba/ was perceived as a serial combination of the two signals (e.g., /gaba/ or /gabga/). A later study showed that the visual signal influences the perception of front vowels in such a way that vowel openness is conveyed through the auditory channel while roundedness is conveyed through the visual signal.

    Mouth and facial informativeness norms for 2276 English words

    Mouth and facial movements are part and parcel of face-to-face communication. The primary way of assessing their role in speech perception has been by manipulating their presence (e.g., by blurring the area of a speaker's lips) or by looking at how informative different mouth patterns are for the corresponding phonemes (or visemes; e.g., /b/ is visually more salient than /g/). However, moving beyond the informativeness of single phonemes is challenging due to coarticulation and language variation (to name just a few factors). Here, we present mouth and facial informativeness (MaFI) for words, i.e., how visually informative words are based on their corresponding mouth and facial movements. MaFI was quantified for 2276 English words, varying in length, frequency, and age of acquisition, using the phonological distance between a word and participants' speechreading guesses. The results showed that the MaFI norms capture well the dynamic nature of mouth and facial movements per word, with words containing phonemes with roundness and frontness features, as well as visemes characterized by lower lip tuck, lip rounding, and lip closure, being visually more informative. We also showed that the more of these features a word contains, the more informative it is based on mouth and facial movements. Finally, we demonstrated that the MaFI norms generalize across different varieties of English. The norms are freely accessible via the Open Science Framework (https://osf.io/mna8j/) and can benefit any language researcher using audiovisual stimuli (e.g., to control for the effect of speech-linked mouth and facial movements).
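
    The scoring idea above (phonological distance between a word and speechreading guesses) can be illustrated with a normalized phoneme-level Levenshtein distance. This is only a minimal sketch of that kind of measure; the transcriptions, normalization, and aggregation are assumptions, not the authors' exact procedure.

```python
# Sketch: phoneme-level Levenshtein distance as a stand-in for the
# "phonological distance" used to quantify MaFI. Transcriptions,
# normalization, and sign convention are illustrative assumptions.

def levenshtein(a, b):
    """Edit distance between two phoneme sequences (lists of strings)."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def mafi_style_score(target, guesses):
    """Mean normalized distance between a target word and speechreading
    guesses; smaller values mean guesses were closer to the target,
    i.e., the word is more visually informative."""
    dists = [levenshtein(target, g) / max(len(target), len(g))
             for g in guesses]
    return sum(dists) / len(dists)

# Hypothetical ARPAbet-like transcriptions for the word "bat":
target = ["B", "AE", "T"]
guesses = [["B", "AE", "T"], ["P", "AE", "T"], ["M", "AE", "N"]]
print(mafi_style_score(target, guesses))  # smaller = closer guesses
```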

    Incongruent Visual Cues Affect the Perception of Mandarin Vowel But Not Tone

    Over recent decades, a large number of audiovisual speech studies have focused on the visual cues of consonants and vowels while neglecting those relating to lexical tones. In this study, we investigated whether incongruent audiovisual information interferes with the perception of lexical tones. We found that, for both Chinese and English speakers, incongruence between the auditory signal and visemic mouth shape (i.e., visual form information) significantly slowed reaction times and reduced the identification accuracy of vowels. However, incongruent lip movements (i.e., visual timing information) did not interfere with the perception of auditory lexical tone. We conclude that, in contrast to vowel perception, auditory tone perception seems relatively impervious to incongruent visual cues, at least under these restricted laboratory conditions. The salience of visual form and timing information is discussed in light of this finding.

    Effects of stimulus response compatibility on covert imitation of vowels

    When we observe someone else speaking, we tend to automatically activate the corresponding speech motor patterns; when listening, we therefore covertly imitate the observed speech. Simulation theories of speech perception propose that this covert imitation of speech motor patterns supports speech perception. Covert imitation has been studied with interference paradigms, including the stimulus–response compatibility (SRC) paradigm, which measures covert imitation by comparing articulation of a prompt following exposure to a congruent versus an incongruent distracter. Responses tend to be faster for congruent than for incongruent distracters, which is taken as evidence of covert imitation. However, covert imitation has thus far only been demonstrated for a select class of speech sounds, namely consonants, and it is unclear whether it extends to vowels. In two experiments, we aimed to demonstrate that covert imitation effects, as measured with the SRC paradigm, extend to vowels. We examined whether covert imitation occurs for vowels in a consonant–vowel–consonant context in visual, audio, and audiovisual modalities, presenting the prompt at four time points to examine how covert imitation varied over the distracter's duration. The results of both experiments clearly demonstrated covert imitation effects for vowels, thus supporting simulation theories of speech perception. Covert imitation was not affected by stimulus modality and was maximal at later time points.
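
    As a minimal sketch of how the congruency effect described above could be computed from trial-level data, consider the following; the field names, time points, and values are illustrative assumptions, not the authors' pipeline.

```python
# Sketch: computing a stimulus-response compatibility (SRC) effect
# from trial-level data. Fields and values are illustrative assumptions.
from statistics import mean

trials = [
    # distracter-prompt congruency, prompt onset "time point", RT in ms
    {"congruent": True,  "time_point": 1, "rt": 412.0},
    {"congruent": False, "time_point": 1, "rt": 438.0},
    {"congruent": True,  "time_point": 4, "rt": 395.0},
    {"congruent": False, "time_point": 4, "rt": 441.0},
    # ... more trials ...
]

def src_effect(trials, time_point):
    """Incongruent minus congruent mean RT at one prompt time point;
    positive values indicate faster responses to congruent distracters,
    the signature of covert imitation."""
    con = [t["rt"] for t in trials
           if t["congruent"] and t["time_point"] == time_point]
    inc = [t["rt"] for t in trials
           if not t["congruent"] and t["time_point"] == time_point]
    return mean(inc) - mean(con)

for tp in (1, 4):
    print(f"time point {tp}: SRC effect = {src_effect(trials, tp):.1f} ms")
```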

    Visual speaker gender affects vowel identification in Danish


    Vowel reduction and loss: challenges and perspectives

    This introduction gives an overview of a workshop on vowel reduction and loss held at SLE 2017 and of the resulting papers collected here. It also discusses the present state of research on vowel reduction and loss from a number of perspectives and outlines the main themes dealt with throughout this special issue.

    Sharp and round shapes of seen objects have distinct influences on vowel and consonant articulation

    Shape- and size-related sound symbolism phenomena assume that, for example, the vowel [i] and the consonant [t] are associated with sharp-shaped and small-sized objects, whereas [E] and [m] are associated with round and large objects. It has been proposed that these phenomena are mostly based on the involvement of articulatory processes in representing the shape and size properties of objects. For example, [i] might be associated with sharp and small objects because it is produced with a specific front-close configuration of the articulators. Nevertheless, very little work has examined whether these object properties indeed have an impact on speech sound vocalization. In the present study, participants were presented with a sharp- or round-shaped object in a small or large size. They were required to pronounce one of two meaningless speech units (e.g., [i] or [E]) according to the size or shape of the object. We investigated how a task-irrelevant object property (e.g., the shape when responses are made according to size) influences reaction times, accuracy, intensity, fundamental frequency, and the first and second formants of vocalizations. Size did not influence vocal responses, but shape did. Specifically, the vowel [i] and consonant [t] were vocalized relatively rapidly when the object was sharp-shaped, whereas [u] and [m] were vocalized relatively rapidly when the object was round-shaped. The study supports the view that shape-related sound symbolism phenomena might reflect a mapping of the perceived shape onto the corresponding articulatory gestures.
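
    A minimal sketch of how shape-sound congruency could be coded and its effect on vocal reaction times computed, using the pairings reported above ([i]/[t] with sharp shapes, [u]/[m] with round shapes); the trial fields and values are illustrative assumptions, not the authors' analysis.

```python
# Sketch: coding shape-vowel congruency and its effect on vocal onset
# time. Trial fields and RT values are illustrative assumptions.
from statistics import mean

# Sound-symbolic pairings from the results summarized above.
SHAPE_MATCH = {"sharp": {"i", "t"}, "round": {"u", "m"}}

trials = [
    {"shape": "sharp", "unit": "i", "rt": 540.0},
    {"shape": "sharp", "unit": "u", "rt": 588.0},
    {"shape": "round", "unit": "u", "rt": 545.0},
    {"shape": "round", "unit": "i", "rt": 590.0},
    # ... more trials ...
]

def congruency_effect(trials):
    """Incongruent minus congruent mean vocal RT (ms); positive values
    mean shape-congruent speech units were initiated faster."""
    con = [t["rt"] for t in trials
           if t["unit"] in SHAPE_MATCH[t["shape"]]]
    inc = [t["rt"] for t in trials
           if t["unit"] not in SHAPE_MATCH[t["shape"]]]
    return mean(inc) - mean(con)

print(f"congruency effect = {congruency_effect(trials):.1f} ms")
```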

    Phonetic Realisation and Phonemic Categorisation of the Final Reduced Corner Vowels in the Finnic Languages of Ingria

    Individual variability in sound change was explored at three stages of final vowel reduction and loss in the endangered Finnic varieties of Ingria (subdialects of Ingrian, Votic, and Ingrian Finnish). The correlation between the realisation of reduced vowels and their phonemic categorisation by speakers was studied. The results showed that if a vowel was pronounced in more than 70% of cases, its incipient loss was not yet perceived (apart from certain frequent elements), but after more than 70% loss, the vowel was no longer perceived. A 50/50 split between vowel and loss in production correlated with the same split in categorisation. At the beginning of a sound change, production is therefore more innovative, but after reanalysis, categorisation becomes more innovative and leads the change. The vowel a was the most innovative in terms of loss, u/o were the most conservative, and i was in the middle, while consonantal palatalisation was more salient than labialisation. These differences are grounded in acoustics, articulation, and perception.
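
    The production-categorisation relationship described above can be sketched as per-item proportions and their correlation; the data layout, values, and threshold handling below are illustrative assumptions, not the study's data.

```python
# Sketch: correlating per-item final-vowel production rates with
# categorisation rates, following the pattern summarized above.
# Values are illustrative assumptions, not the study's data.
from statistics import correlation  # Pearson's r; Python 3.10+

# Per item: share of tokens with the final vowel actually pronounced,
# and share of speakers still categorising the item as vowel-final.
production = [0.95, 0.80, 0.50, 0.25, 0.05]
categorisation = [1.00, 0.95, 0.50, 0.10, 0.00]

print(f"r = {correlation(production, categorisation):.2f}")

for p, c in zip(production, categorisation):
    # Reported pattern: above ~70% production, incipient loss is not
    # yet perceived; above ~70% loss (below ~30% production), the vowel
    # is no longer perceived; a 50/50 production split matches a 50/50
    # categorisation split.
    stage = ("loss not yet perceived" if p > 0.7
             else "vowel no longer perceived" if p < 0.3
             else "transitional (about 50/50)")
    print(f"produced {p:.0%}, categorised {c:.0%}: {stage}")
```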