    Alignment of speech and co-speech gesture in a constraint-based grammar

    This thesis concerns the form-meaning mapping of multimodal communicative actions consisting of speech signals and improvised co-speech gestures, produced spontaneously with the hand. The interaction between speech and speech-accompanying gestures has been standardly addressed from a cognitive perspective to establish the underlying cognitive mechanisms for the synchronous speech and gesture production, and also from a computational perspective to build computer systems that communicate through multiple modalities. Based on the findings of this previous research, we advance a new theory in which the mapping from the form of the combined speech-and-gesture signal to its meaning is analysed in a constraint-based multimodal grammar. We propose several construction rules about multimodal well-formedness that we motivate empirically from an extensive and detailed corpus study. In particular, the construction rules use the prosody, syntax and semantics of speech, the form and meaning of the gesture signal, as well as the temporal performance of the speech relative to the temporal performance of the gesture to constrain the derivation of a single multimodal syntax tree which in turn determines a meaning representation via standard mechanisms for semantic composition. Gestural form often underspecifies its meaning, and so the output of our grammar is underspecified logical formulae that support the range of possible interpretations of the multimodal act in its final context-of-use, given the current models of the semantics/pragmatics interface. It is standardly held in the gesture community that the co-expressivity of speech and gesture is determined on the basis of their temporal co-occurrence: that is, a gesture signal is semantically related to the speech signal that happened at the same time as the gesture. 
Whereas this is usually taken for granted, we propose a methodology of establishing in a systematic and domain-independent way which spoken element(s) gesture can be semantically related to, based on their form, so as to yield a meaning representation that supports the intended interpretation(s) in context. The ‘semantic’ alignment of speech and gesture is thus driven not by temporal co-occurrence alone, but also by the linguistic properties of the speech signal that the gesture overlaps with. In so doing, we contribute a fine-grained system for articulating the form-meaning mapping of multimodal actions that uses standard methods from linguistics. We show that just as language exhibits ambiguity in both form and meaning, so do multimodal actions: for instance, the integration of gesture is not restricted to a unique speech phrase but rather speech and gesture can be aligned in multiple multimodal syntax trees, thus yielding distinct meaning representations. These multiple mappings stem from the fact that the meaning as derived from gesture form is highly incomplete even in context. An overall challenge is thus to account for the range of possible interpretations of the multimodal action in context using standard methods from linguistics for syntactic derivation and semantic composition.

    Lexical and postlexical prominence in Tashlhiyt Berber and Moroccan Arabic

    Tashlhiyt Berber (Afro-Asiatic, Berber) and Moroccan Arabic (Afro-Asiatic, Semitic), two languages spoken in Morocco, have been in contact for over 1200 years. The influence of Berber languages on the lexicon and the segmental-phonological structure of Moroccan Arabic is well-documented, whereas possible similarities in the prosodic-phonological domain have not yet been addressed in detail. This thesis brings together evidence from production and perception to bear on the question whether Tashlhiyt Berber and Moroccan Arabic also exhibit convergence in the domain of phonological prominence. Experimental results are interpreted as showing that neither language has lexical prominence asymmetries in the form of lexical stress. This lack of stress in Moroccan Arabic is unlike the undisputed presence of lexical stress in most other varieties of Arabic, which in turn suggests that this aspect of the phonology of Moroccan Arabic has resulted from contact with (Tashlhiyt) Berber. A further, theoretical contribution is made with respect to the possible correspondence between lexical and postlexical prominence structure from a typological point of view. One of the tenets of the Autosegmental Metrical approach to intonation analysis holds that prominence-marking intonational events (pitch accents) associate with lexically stressed syllables. Exactly how prominence marking is achieved in languages that lack lexical stress is little-understood, and this thesis' discussion of postlexical prominence in Tashlhiyt Berber and Moroccan Arabic provides new insights that bear on this topic. A first set of production experiments investigates, for both languages, if there are acoustic correlates to what some researchers have considered to be lexically stressed syllables. It is shown that neither language exhibits consistent acoustic enhancement of presumed stressed syllables relative to unstressed syllables. 
The second set of production experiments reports on the prosodic characteristics of question word interrogatives in both languages. It is shown that question words are the locus of postlexical prominence-marking events that, however, do not exhibit association with a sub-lexical phonological unit. A final perception experiment serves the goal of showing how native speakers of Tashlhiyt Berber and Moroccan Arabic deal with the encoding of a postlexical prominence contrast that is parasitic on a lexical prominence contrast. This is achieved by means of a 'stress deafness' experiment, the results of which show that speakers of neither language can reliably encode a lexically-specified prominence difference. Results from all three types of experiment thus converge in suggesting that lexical prominence asymmetries are not specified in the phonology of either language.

    The Way You Hear It, the Way You Judge It: Moral Decision-making and Moral Reasoning in Accented Speech

    Previous studies have shown that people make different decisions after reading, and also after listening to, moral dilemmas in a foreign language (L2) than in a native language (L1). This effect is named the Moral Foreign Language Effect (MFLE). Emotion, which is considered to play a pivotal role in moral judgments, has also been found to interact closely with sounds. The current research aims to (1) investigate whether the sound of different languages (i.e. accents) can also trigger the MFLE in listeners’ moral decision-making and (2) examine the foreign accent effect on listeners’ moral reasoning patterns. Chinese ESL college students were recruited as listeners of Chinese-accented and English-accented speech of moral dilemmas in Mandarin and English. However, although the study revealed a potential foreign accent effect on moral reasoning patterns in native-accented Chinese and foreign-accented Chinese, contradicting our predictions, a foreign accent effect on moral decisions and moral reasoning patterns was not detected. Nor was higher proficiency in L2 found to be associated with the moral reasoning patterns employed in L2-sounding speech. Potential explanations of the results, future improvements, and research directions in moral psychology are also discussed.

    The Processing of Emotional Sentences by Young and Older Adults: A Visual World Eye-movement Study

    Carminati MN, Knoeferle P. The Processing of Emotional Sentences by Young and Older Adults: A Visual World Eye-movement Study. Presented at the Architectures and Mechanisms of Language and Processing (AMLaP), Riva del Garda, Italy

    Proceedings of the VIIth GSCP International Conference

    The 7th International Conference of the Gruppo di Studi sulla Comunicazione Parlata, dedicated to the memory of Claire Blanche-Benveniste, chose as its main theme Speech and Corpora. The wide international origin of the 235 authors from 21 countries and 95 institutions led to papers on many different languages. The 89 papers of this volume reflect the themes of the conference: spoken corpora compilation and annotation, together with the connected technological fields; the relation between prosody and pragmatics; speech pathologies; and different papers on phonetics, speech and linguistic analysis, pragmatics and sociolinguistics. Many papers are also dedicated to speech and second language studies. The online publication with FUP allows direct access to sound and video linked to papers (when downloaded).

    Spoken content retrieval: A survey of techniques and technologies

    Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition, and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight on how these fields are integrated to support research and development, thus addressing the core challenges of SCR.

    Advances in the neurocognition of music and language

    The impact of recent and long-term experience on access to word meanings: Evidence from large-scale internet-based experiments

    Many word forms map onto multiple meanings (e.g., “ace”). The current experiments explore the extent to which adults reshape the lexical–semantic representations of such words on the basis of experience, to increase the availability of more recently accessed meanings. A naturalistic web-based experiment in which primes were presented within a radio programme (Experiment 1; N = 1800) and a lab-based experiment (Experiment 2) show that when listeners have encountered one or two disambiguated instances of an ambiguous word, they then retrieve this primed meaning more often (compared with an unprimed control condition). This word-meaning priming lasts up to 40 min after exposure, but decays very rapidly during this interval. Experiments 3 and 4 explore longer term word-meaning priming by measuring the impact of more extended, naturalistic encounters with ambiguous words: recreational rowers (N = 213) retrieved rowing related meanings for words (e.g., “feather”) more often if they had rowed that day, despite a median delay of 8 hours. The rate of rowing-related interpretations also increased with additional years’ rowing experience. Taken together these experiments show that individuals’ overall meaning preferences reflect experience across a wide range of timescales from minutes to years. In addition, priming was not reduced by a change in speaker identity (Experiment 1), suggesting that the phenomenon occurs at a relatively abstract lexical–semantic level. The impact of experience was reduced for older adults (Experiments 1, 3, 4) suggesting that the lexical–semantic representations of younger listeners may be more malleable to current linguistic experience.

    Language In My Mouth: Linguistic Variation in the Nmbo Speech Community of Southern New Guinea

    This thesis is a mixed-methods investigation into the question of the sociolinguistics of linguistic diversity in Papua New Guinea. Social and cultural traits of New Guinean speech communities have been hypothesised as conducive to language differentiation and diversification (Laycock 1991, Thurston 1987, 1992, Foley 2000, Ross 2001); however, there have been few empirical studies to support these hypotheses. In this thesis I investigate linguistic micro-variations within a contemporary New Guinean speech community, with the goal of identifying socio-cultural pressures that affect language variation and change. The community under investigation is the Nmbo speech community located in the Morehead area of Southern New Guinea. It is a highly multilingual community in the middle of the Nambu branch dialect chain, and consists primarily of the three villages Govav, Bevdvn, and Arovwe. The ideologically licensed speakers of Nmbo are the Kerake tribe people, but due to the practice of marriage exogamy, a large portion of non-Kerake people speak Nmbo as an additional language learnt from their parents or spouse. This thesis embraces the complexities of the multilingual ecology by including data from Kerake women who have married out of the Nmbo villages into the neighbouring Nen language village of Bimadbn. The empirical investigations bring data from three directions. First are the qualitative descriptions based on my own ethnographic fieldwork supported by prior ethnographic descriptions. The picture to emerge is of an egalitarian multilingual speech community. The qualitative descriptions also provide basic facts about demographics and social structures of the community. Second is the linguistic description of the Nmbo language. Nmbo is an under-described language without substantial prior description, and this thesis contains a sketch grammar covering the basic aspects of Nmbo grammar. Finally there are three quantitative studies of variation. 
The vowel sociophonetic study and the word initial [h]-drop study are classic Labovian variationist studies that investigate patterns of variation across a sample of speakers. The former is based on elicited word list data, and the latter on naturalistic speech data. The third quantitative study takes a grammaticalisation approach to an emergent topic marker in a topicalising construction from a relative clause construction. This is the first thesis ever produced providing qualitative, descriptive, and quantitative data from a New Guinean speech community within a language ecology of vital indigenous multilingualism. The contributions of the thesis are twofold. Firstly, this thesis brings grammatical and sociolinguistic descriptions from an under-studied language. It is a socio-grammar (Nagy 2009) that considers language ecology, sociolinguistics, and grammatical description. Secondly, this thesis contributes empirical data on the sociolinguistics of small-scale speech communities. The classic sociolinguistic variable of gender is not found to be particularly significant in the variables studied, despite the community being highly gendered in other social domains. Village, however, shows some significance. As far as the three variables are concerned, Nmbo speakers show little community-internal variation and paint a picture of a tight-knit society of intimates (Trudgill 2011). The conclusion to the question of the sociolinguistics of diversification is that while there is some evidence of sociolinguistic differentiation within the Nmbo speech community, the most important social groups to orient against are the other sister language groups in the Morehead area. The nascent variation within the Nmbo speech community, combined with the ethnographic evidence of a cluster of dense and multiplex social networks, suggests that should the social need to differentiate between other Kerake arise, linguistic differentiation may occur rapidly.