
    Congruency effect between articulation and grasping in native English speakers

    Previous studies have shown congruency effects between specific speech articulations and manual grasping actions. For example, uttering the syllable [kɑ] facilitates power grip responses in terms of reaction time and response accuracy, and a similar association has been observed between the syllable [ti] and the precision grip. Because these congruency effects have to date been shown only for native speakers of Finnish, this study explored whether they generalize to native speakers of another language. The original experiments were therefore replicated with English-speaking participants (N=16). Several previous findings were reproduced, namely the association of the syllables [kɑ] and [ke] with the power grip and of [ti] and [te] with the precision grip. However, the association of the vowels [ɑ] and [i] with the power and precision grip, respectively, previously found for Finnish participants, was not significant for English speakers. This difference could be related to the ambiguity of English orthography and its variable pronunciation: for English speakers, seeing a given written vowel may activate several different phonological representations associated with that letter. If the congruency effects are based on interactions between specific phonological representations and grasp actions, this ambiguity might weaken the effects in the manner demonstrated here.
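    To make the analysis behind such reaction-time effects concrete, here is a minimal sketch, assuming a hypothetical trial table with subject, congruency, and rt_ms columns; the abstract does not describe the studies' actual pipeline at this level of detail.

```python
# Hypothetical congruency-effect analysis: per-participant mean reaction
# times for congruent vs. incongruent syllable-grip pairings, compared
# with a paired t-test. File and column names are illustrative.
import pandas as pd
from scipy import stats

trials = pd.read_csv("grip_trials.csv")  # assumed: subject, congruency, rt_ms

per_subj = (trials.groupby(["subject", "congruency"])["rt_ms"]
                  .mean()
                  .unstack("congruency"))

t, p = stats.ttest_rel(per_subj["congruent"], per_subj["incongruent"])
print(per_subj.mean())             # condition means across participants
print(f"t = {t:.2f}, p = {p:.4f}")
```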

    Sharp and round shapes of seen objects have distinct influences on vowel and consonant articulation

    The shape- and size-related sound symbolism phenomena hold that, for example, the vowel [i] and the consonant [t] are associated with sharp-shaped and small-sized objects, whereas [E] and [m] are associated with round and large objects. It has been proposed that these phenomena are mostly based on the involvement of articulatory processes in representing the shape and size properties of objects. For example, [i] might be associated with sharp and small objects because it is produced with a specific front-close configuration of the articulators. Nevertheless, very little work has examined whether these object properties indeed have an impact on speech sound vocalization. In the present study, participants were presented with a sharp- or round-shaped object in a small or large size. They were required to pronounce one of two meaningless speech units (e.g., [i] or [E]) according to the size or shape of the object. We investigated how a task-irrelevant object property (e.g., the shape when responses are made according to size) influences the reaction time, accuracy, intensity, fundamental frequency, and first and second formants of the vocalizations. Size did not influence vocal responses, but shape did. Specifically, the vowel [i] and consonant [t] were vocalized relatively rapidly when the object was sharp-shaped, whereas [u] and [m] were vocalized relatively rapidly when the object was round-shaped. The study supports the view that the shape-related sound symbolism phenomena might reflect a mapping of the perceived shape onto the corresponding articulatory gestures.

    Interaction between grasping and articulation: How vowel and consonant pronunciation influences precision and power grip responses

    Grasping and mouth movements have been proposed to be integrated anatomically, functionally, and evolutionarily. In line with this, we have shown that there is a systematic interaction between particular speech units and grip performance. For example, when the task requires pronouncing a speech unit simultaneously with a grasp response, the speech units [i] and [t] are associated with relatively rapid and accurate precision grip responses, while [ɑ] and [k] are associated with power grip responses. This study aimed to complete the picture of which vowels and consonants are associated with these grasp types. It validated our view that high-front vowels and alveolar consonants are associated with precision grip responses, while low and high-back vowels, as well as velar consonants and those whose articulation involves lowering of the tongue body, are associated with power grip responses. The paper also proposes that one reason small/large concepts are associated with specific speech sounds in sound-magnitude symbolism is that the articulation of these sounds is programmed within mechanisms overlapping with those of precision or power grasping.

    The Influence of Number Magnitude on Vocal Responses

    The study investigated whether number magnitude can influence vocal responses. Participants produced either a short or a long version of the vowel [&] (Experiment 1), or a high- or low-pitched version of that vowel (Experiment 2), according to the parity of a visually presented number. In addition to measuring the reaction times (RTs) of the vocal responses, we measured the intensity, the fundamental frequency (f0), and the first and second formants of the vocalizations. The RTs showed that long and high-pitched vocal responses were associated with large numbers, while short and low-pitched vocal responses were associated with small numbers. It was also found that high-pitched vocalizations were mapped onto odd numbers, while low-pitched vocalizations were mapped onto even numbers. Finally, large numbers increased f0 values. The study shows systematic interactions between the processes that represent number magnitude and those that produce vocal responses.
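    As an aside on method, acoustic measures like these can be extracted with the Praat-based parselmouth library; the sketch below is an assumed workflow with a placeholder file name, not the study's actual analysis code.

```python
# Sketch: extract mean f0, mean intensity, and midpoint F1/F2 from one
# recorded vocal response using parselmouth (a Python interface to Praat).
# The file name is a placeholder.
import parselmouth

snd = parselmouth.Sound("vocal_response.wav")

pitch = snd.to_pitch()
f0 = pitch.selected_array["frequency"]
mean_f0 = f0[f0 > 0].mean()              # average over voiced frames only

mean_db = snd.to_intensity().values.mean()

formants = snd.to_formant_burg()
mid = snd.duration / 2                   # vowel midpoint
f1 = formants.get_value_at_time(1, mid)
f2 = formants.get_value_at_time(2, mid)

print(f"f0 {mean_f0:.0f} Hz, intensity {mean_db:.1f} dB, "
      f"F1 {f1:.0f} Hz, F2 {f2:.0f} Hz")
```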

    Connections between articulations and grasping

    The idea that hand gestures and speech are connected is quite old, and some theories even suggest that language is primarily based on a manual communication system. In this thesis, I present four studies in which we examined the connections between articulatory gestures and manual grasps. The work builds on an earlier finding of systematic connections between specific articulatory gestures and grasp types: uttering a syllable such as [kɑ] can facilitate power grip responses, whereas uttering a syllable such as [ti] can facilitate precision grip responses. I refer to this phenomenon as the articulation-grip congruency effect. As in the original work, we used special power and precision grip devices that the participants held in their hand to perform responses. In Study I, we measured the response times and accuracy of grip responses and vocalisations to investigate whether the effect can also be observed in vocal responses, and to what extent it operates in action selection processes. In Study II, grip response times were measured to investigate whether the effect persists when the syllables are only heard or read silently. Study III investigated the influence of grasp planning and/or execution on the categorization of perceived syllables. In Study IV, we measured electrical brain activity while participants listened to syllables that were either congruent or incongruent with the precision or power grip, and investigated how performing the different grips affected auditory processing of the heard syllables.

    The results of Study I showed that, besides manual facilitation, the effect is also observed in vocal responses, both when a simultaneous grip is executed and when it is only prepared, meaning that overt execution is not needed for the effect. This suggests that the effect operates in action planning. In addition, the effect was observed even when the participants knew beforehand which response they should execute, suggesting that it is not based on action selection processes. Study II showed that the effect was also observed when the syllables were heard or read silently, supporting the view that articulatory simulation of a perceived syllable can activate the motor program of the grasp that is congruent with that syllable. Study III revealed that grip preparation can influence the categorization of perceived syllables: participants were biased to categorize noise-masked syllables as [ke] rather than [te] when they were prepared to execute the power grip, and vice versa when they were prepared to execute the precision grip. Finally, Study IV showed that grip performance also modulates early auditory processing of heard syllables. These results support the view that articulatory and hand motor representations form a partly shared network, in which activity in one domain can induce activity in the other. This is in line with earlier studies that have shown a more general linkage between mouth and manual processes, and it expands this notion of hand-mouth interaction by showing that the connections can also operate between very specific hand and articulatory gestures.

    Chinese Tones: Can You Listen With Your Eyes? The Influence of Visual Information on Auditory Perception of Chinese Tones

    Considering that more than half of the world's languages (60-70%) are so-called tone languages (Yip, 2002), and that tone is notoriously difficult for Westerners to learn, this dissertation focused on the perception of Mandarin Chinese tones by tone-naïve speakers. Since speech perception is more than just an auditory phenomenon, especially when the speaker's face is visible, the dissertation also studies the value of visual information (over and above that of acoustic information) in Mandarin tone perception for tone-naïve perceivers, in combination with contextual factors (such as speaking style) and individual factors (such as musical background). It thus assesses the relative strength of acoustic and visual information in tone perception and tone classification. In the first two empirical and exploratory studies, in Chapters 2 and 3, we set out to investigate to what extent tone-naïve perceivers are able to identify Mandarin Chinese tones in isolated words, whether they can benefit from seeing the speaker's face, and what the contributions are of a hyperarticulated speaking style and of their own musical experience. In Chapter 2 we investigated the effect of visual cues (comparing audio-only with audio-visual presentations) and speaking style (comparing a natural speaking style with a teaching style) on the perception of Mandarin tones by tone-naïve listeners, looking both at the relative strength of these two factors and at their possible interactions; Chapter 3 was concerned with the effects of the participants' musicality (combined with modality) on Mandarin tone perception. In both studies, a Mandarin Chinese tone identification experiment was conducted: native speakers of a non-tonal language were asked to distinguish Mandarin Chinese tones based on audio-only or audio-visual materials. To include variation, the experimental stimuli were recorded from four different speakers in imagined natural and teaching speaking scenarios. The proportion of correct responses (and average reaction times) of the participants were reported. The tone identification experiment in Chapter 2 showed that the video conditions (audio-visual natural and audio-visual teaching) resulted in overall higher accuracy in tone perception than the audio-only conditions (audio-only natural and audio-only teaching), but the audio-visual conditions yielded no better performance in terms of reaction time. Teaching style made no difference to the speed or accuracy of Mandarin tone perception (compared to a natural speaking style). In Chapter 3 we presented the same experimental materials and procedure, but now with musicians and non-musicians as participants. The Goldsmiths Musical Sophistication Index (Gold-MSI) was used to assess the participants' musical aptitude. The data showed that, overall, musicians outperformed non-musicians in the tone identification task in both audio-visual and audio-only conditions. Both groups identified tones more accurately in the audio-visual conditions than in the audio-only conditions.
These results provide further evidence that the availability of visual cues alongside auditory information is useful for people who have no knowledge of Mandarin Chinese tones when they need to learn to identify them. Of all the musical skills measured by the Gold-MSI, the amount of musical training was the only predictor of the accuracy of Mandarin tone perception. These findings suggest that learning to perceive Mandarin tones benefits from musical expertise, and that visual information can facilitate Mandarin tone identification, but mainly for tone-naïve non-musicians. Performance also differed by tone: musicality improved accuracy for every tone, but some tones were easier to identify than others. In particular, tone 3 (a low-falling-rising tone) proved the easiest to identify, while tone 4 (a high-falling tone) was the most difficult for all participants. The results of the first two experiments, presented in Chapters 2 and 3, showed that adding visual cues to clear auditory information facilitated tone identification for tone-naïve perceivers (accuracy was significantly higher in the audio-visual conditions than in the audio-only conditions). This visual facilitation was unaffected by the (hyperarticulated) speaking style and by the musical skill of the participants. Moreover, variation among speakers and tones affected how accurately tone-naïve perceivers identified Mandarin tones. In Chapter 4, we compared the relative contributions of auditory and visual information during Mandarin Chinese tone perception. More specifically, we aimed to answer two questions: first, whether there is audio-visual integration at the tone level (i.e., we explored perceptual fusion between auditory and visual information); second, how visual information affects tone perception for native speakers and non-native (tone-naïve) speakers. To do this, we constructed various congruent (e.g., an auditory tone 1 paired with a visual tone 1, written as AxVx) and incongruent (e.g., an auditory tone 1 paired with a visual tone 2, written as AxVy) auditory-visual materials and presented them to native speakers of Mandarin Chinese and speakers of non-tonal languages. Accuracy, defined as the percentage of correct identifications of a tone based on its auditory realization, was reported. We found that visual information did not significantly contribute to tone identification for native speakers of Mandarin Chinese. When there was a discrepancy between visual cues and acoustic information, both native and tone-naïve participants tended to rely more on the auditory input than on the visual cues. Unlike the native speakers of Mandarin Chinese, tone-naïve participants were significantly influenced by visual information during auditory-visual integration, and they identified tones more accurately in congruent than in incongruent stimuli. In line with our previous work, the tone confusion matrix showed that tone identification varied across individual tones, with tone 3 (the low-dipping tone) being the easiest to identify and tone 4 (the high-falling tone) the most difficult.
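The AxVx/AxVy design is easy to make concrete: crossing the four Mandarin tones factorially yields four congruent and twelve incongruent pairings per speaker. A hypothetical enumeration, not the dissertation's materials-preparation code:

```python
# Enumerate congruent (AxVx) and incongruent (AxVy) audio-visual tone pairs.
from itertools import product

TONES = [1, 2, 3, 4]
stimuli = [{"audio_tone": a,
            "visual_tone": v,
            "condition": "congruent" if a == v else "incongruent"}
           for a, v in product(TONES, repeat=2)]

print(sum(s["condition"] == "congruent" for s in stimuli),    # 4
      sum(s["condition"] == "incongruent" for s in stimuli))  # 12
```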
The results did not show evidence of auditory-visual integration among native participants, while visual information was helpful for tone-naïve participants. However, even for this group, visual information only marginally increased accuracy in the tone identification task, and this increase depended on the tone in question. Chapter 5 also zooms in on the relative strength of auditory and visual information for tone-naïve perceivers, but from the perspective of tone classification. In this chapter, we studied the acoustic and visual features of tones produced by native speakers of Mandarin Chinese, and constructed computational models based on acoustic features, visual features, and combined acoustic-visual features to automatically classify Mandarin tones. By studying both production and perception, this chapter also examined what perceivers pick up (perception) from what a speaker does (production, facial expression). More specifically, it set out to answer: (1) which acoustic and visual features of tones produced by native speakers can be used to automatically classify Mandarin tones; (2) whether the features used in tone production are similar to or different from those that have cue value for tone-naïve perceivers when they categorize tones; and (3) whether and how visual information (i.e., facial expression and facial pose) contributes to the classification of Mandarin tones over and above the information provided by the acoustic signal. To address these questions, we used the stimuli recorded for Chapter 2 and the response data collected for Chapter 3. Basic acoustic and visual features were extracted, and Random Forest classification was used to identify the most important acoustic and visual features for classifying the tones. The classifiers were trained on produced-tone classification (given a set of auditory and visual features, predict the produced tone) and on perceived-tone classification (given a set of features, predict the tone as identified by the participant). The results showed that acoustic features outperformed visual features for both the produced and the perceived tone. However, tone-naïve perceivers did revert to visual information in certain cases, namely when they gave wrong responses. So, visual information does not seem to play a significant role in native speakers' tone production, but tone-naïve perceivers do sometimes draw on visual information in their tone identification. These findings provide additional evidence that auditory information is more important than visual information in Mandarin tone perception and tone classification. Notably, visual features contributed to the participants' erroneous performance, suggesting that visual information actually misled tone-naïve perceivers in their tone identification task. To some extent, this is consistent with our claim that visual cues do influence tone perception. In addition, the ranking of auditory and visual features in tone perception showed that the factor perceiver (i.e., the participant) explained the largest amount of variance in the responses of our tone-naïve participants, indicating the importance of individual differences in tone perception.
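A minimal sketch of this classification setup, assuming scikit-learn's Random Forest implementation and a placeholder feature table (the summary names neither the toolkit nor the exact features):

```python
# Sketch: classify produced tones from acoustic + visual features with a
# Random Forest, then rank features by importance. Data layout is assumed.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

data = pd.read_csv("tone_features.csv")    # one row per spoken token
X = data.drop(columns=["produced_tone"])   # acoustic and visual features
y = data["produced_tone"]                  # tones 1-4

clf = RandomForestClassifier(n_estimators=500, random_state=0)
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())

clf.fit(X, y)
for name, imp in sorted(zip(X.columns, clf.feature_importances_),
                        key=lambda kv: kv[1], reverse=True)[:5]:
    print(f"{name}: {imp:.3f}")            # most informative features first
```

Training the same model with the participants' responses as the target instead of the produced tones would give the perceived-tone classifier described above.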
To sum up, perceivers who do not have tone in their language background tend to make use of visual cues from the speaker's face when perceiving unknown tones (here, Mandarin Chinese), in addition to the auditory information they clearly also use. However, auditory cues remain their primary source. A consistent finding across the studies is that variation among tones, speakers, and participants affects the accuracy of tone identification for tone-naïve speakers.

    Magnitude sound symbolism influences vowel production

    Segmental properties of speech can convey sound-symbolic meaning. This study presents two novel sound-meaning mappings using a choice reaction time paradigm in which participants had to quickly select one of two vocal response alternatives based on predefined categories of perceptual magnitude. The first study showed that a short distance between perceived objects facilitates the initiation of producing the vowel [i], while a long distance facilitates the production of [u] and [?]. Correspondingly, in the second study, vocal responses produced with [i] and [e] were initiated faster when the stimuli required short vocalizations, while responses produced with [u], [?], and [y] were faster when the stimuli required long vocalizations. Hence, similar sound-meaning mappings were observed for concepts of spatial and temporal length. This suggests that the different sound-magnitude effects can be generalized to a common processing of conceptual magnitude: a conceptual magnitude seems to be implicitly and systematically associated with the articulation of a specific vowel. The study also suggests that, in addition to vowel openness and backness, vowel roundness can associate particular vowels with large magnitudes.

    The development of audiovisual vowel processing in monolingual and bilingual infants: a cross-sectional and longitudinal study.

    The aim of this dissertation is to investigate to what extent infants acquiring one language (monolinguals) and infants acquiring two languages (bilinguals) share strategies during audiovisual speech processing. The dissertation focuses on typically developing Basque and Spanish monolingual and bilingual infants' processing of matching and mismatching audiovisual vowels at 4 and 8 months of age. Using an eye-tracker, the infants' attention to audiovisual match versus mismatch conditions, and to the speakers' eyes versus mouth, was measured in a cross-sectional and a longitudinal design. The cross-sectional data revealed that bilingual and monolingual infants exhibited similar audiovisual matching ability. Furthermore, they exhibited a similar looking pattern: at 4 months of age, monolinguals and bilinguals attended more to the speakers' eyes, whereas at 8 months of age they attended equally to the eyes and the mouth. Finally, the longitudinal data revealed that infants' attention to the eyes versus the mouth was correlated between 4 and 8 months of age, regardless of linguistic group. Taken together, the research demonstrated no clear difference in audiovisual vowel processing between monolingual and bilingual infants. Overall, the dissertation makes fundamental contributions to understanding the processes underlying language acquisition across linguistically diverse populations.
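    The longitudinal eyes-versus-mouth analysis reduces to a proportion-of-looking score per infant and age, correlated across ages. A minimal sketch, assuming a hypothetical long-format table of looking times:

```python
# Sketch: proportion of looking time to the eyes (vs. mouth) per infant at
# 4 and 8 months, correlated across ages. Column names are assumptions.
import pandas as pd
from scipy.stats import pearsonr

looks = pd.read_csv("aoi_looking_times.csv")   # infant, age_months, aoi, ms

wide = (looks.pivot_table(index=["infant", "age_months"],
                          columns="aoi", values="ms", aggfunc="sum")
             .assign(prop_eyes=lambda d: d["eyes"] / (d["eyes"] + d["mouth"]))
             ["prop_eyes"]
             .unstack("age_months")
             .dropna())                         # infants seen at both ages

r, p = pearsonr(wide[4], wide[8])
print(f"r = {r:.2f}, p = {p:.3f}")
```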
