
    The virtual guide: a direction giving embodied conversational agent

    We present the Virtual Guide, an embodied conversational agent that can give directions in a 3D virtual environment. We discuss how dialogue management, language generation, and the generation of appropriate gestures are carried out in our system. A toy sketch of this pipeline follows below.
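    To make the pipeline concrete, the sketch below wires the three stages the abstract names (dialogue management, language generation, gesture selection) into a toy Python loop. Every class, rule, and gesture label here is an illustrative assumption, not the authors' actual system or API.

    # Hypothetical three-stage pipeline: dialogue management -> language
    # generation -> gesture selection. Names and rules are illustrative only.
    from dataclasses import dataclass

    @dataclass
    class Turn:
        text: str          # generated route description
        gesture: str       # co-verbal gesture label, e.g. a pointing direction

    def manage_dialogue(user_utterance: str) -> str:
        """Map the user's request to a dialogue act (toy keyword matcher)."""
        return "request_route" if "how do i get" in user_utterance.lower() else "clarify"

    def generate_turn(act: str, route: list) -> Turn:
        """Realise the dialogue act as text plus an aligned deictic gesture."""
        if act == "request_route" and route:
            step = route[0]                  # verbalise the first route segment
            return Turn(f"First, go {step}.", gesture=f"point_{step}")
        return Turn("Where would you like to go?", gesture="open_palm")

    if __name__ == "__main__":
        turn = generate_turn(manage_dialogue("How do I get to the station?"),
                             ["left", "straight", "right"])
        print(turn.text, "|", turn.gesture)  # First, go left. | point_left

    In a full agent, the same pairing of a text span with a gesture label would be handed to an animation engine so that the gesture stroke lands on the word it accompanies.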

    Language is a complex adaptive system

    The ASLAN labex - Advanced Studies on Language Complexity - brings together a unique set of expertise and varied points of view on language. In this volume, we use three main sections showcasing diverse empirical work to illustrate how language within human interaction is a complex and adaptive system. The first section – epistemological views on complexity – pleads for epistemological plurality and an end to dichotomies, and proposes different ways to connect and translate between frameworks. The second section – complexity, pragmatics and discourse – focuses on discourse practices at different levels of description; semiotic systems other than language are mobilized, as are interlocutors’ perception, memory and understanding of culture. The third section – complexity, interaction, and multimodality – employs different disciplinary frameworks to weave between micro, meso, and macro levels of analysis. Our specific contributions include adding elements to, and extending the field of application of, the models proposed by others through new examples of emergence, interplay of heterogeneous elements, intrinsic diversity, feedback, novelty, self-organization, adaptation, multi-dimensionality, indeterminism, and collective control with distributed emergence. Finally, we argue for a change in vantage point regarding the search for linguistic universals.

    Gesture and Speech in Interaction - 4th edition (GESPIN 4)

    The fourth edition of Gesture and Speech in Interaction (GESPIN) was held in Nantes, France. With more than 40 papers, these proceedings show just what a flourishing field of enquiry gesture studies continues to be. The keynote speeches of the conference addressed three different aspects of multimodal interaction: gesture and grammar, gesture acquisition, and gesture and social interaction. In a talk entitled Qualities of event construal in speech and gesture: Aspect and tense, Alan Cienki presented an ongoing research project on narratives in French, German and Russian, a project that focuses especially on the verbal and gestural expression of grammatical tense and aspect in narratives in the three languages. Jean-Marc Colletta's talk, entitled Gesture and Language Development: towards a unified theoretical framework, described the joint acquisition and development of speech and early conventional and representational gestures. In Grammar, deixis, and multimodality between code-manifestation and code-integration or why Kendon's Continuum should be transformed into a gestural circle, Ellen Fricke proposed a revisited grammar of noun phrases that integrates gestures as part of the semiotic and typological codes of individual languages. From a pragmatic and cognitive perspective, Judith Holler explored the use of gaze and hand gestures as means of organizing turns at talk as well as establishing common ground in a presentation entitled On the pragmatics of multi-modal face-to-face communication: Gesture, speech and gaze in the coordination of mental states and social interaction. Among the talks and posters presented at the conference, the vast majority of topics related, quite naturally, to gesture and speech in interaction - understood both in terms of mapping of units in different semiotic modes and of the use of gesture and speech in social interaction. Several presentations explored the effects of impairments (such as diseases or the natural ageing process) on gesture and speech. The communicative relevance of gesture and speech and audience design in natural interactions, as well as in more controlled settings like television debates and reports, was another topic addressed during the conference. Some participants also presented research on first and second language learning, while others discussed the relationship between gesture and intonation. While most participants presented research on gesture and speech from an observer's perspective, be it in semiotics or pragmatics, some nevertheless focused on another important aspect: the cognitive processes involved in language production and perception. Last but not least, participants also presented talks and posters on the computational analysis of gestures, whether involving external devices (e.g. mocap, Kinect) or concerning the use of specially designed computer software for the post-treatment of gestural data. Importantly, new links were made between semiotics and mocap data.

    Directional adposition use in English, Swedish and Finnish

    Directional adpositions such as to the left of describe where a Figure is in relation to a Ground. English and Swedish directional adpositions refer to the location of a Figure in relation to a Ground, whether both are static or in motion. In contrast, the Finnish directional adpositions edellä (in front of) and jäljessä (behind) solely describe the location of a moving Figure in relation to a moving Ground (Nikanne, 2003). When using directional adpositions, a frame of reference must be assumed for interpreting their meaning. For example, the meaning of to the left of in English can be based on a relative (speaker- or listener-based) reference frame or an intrinsic (object-based) reference frame (Levinson, 1996). When a Figure and a Ground are both in motion, the Figure can be described as being behind or in front of the Ground even if neither has intrinsic features. As shown by Walker (in preparation), there are good reasons to assume that in the latter case a motion-based reference frame is involved. This means that if Finnish speakers use edellä (in front of) and jäljessä (behind) more frequently in situations where both the Figure and Ground are in motion, a difference in reference frame use between Finnish on the one hand and English and Swedish on the other could be expected. We asked native English, Swedish and Finnish speakers to select adpositions from a language-specific list to describe the location of a Figure relative to a Ground when both were shown to be moving on a computer screen. We were interested in any differences between Finnish, English and Swedish speakers. All languages showed a predominant use of directional spatial adpositions referring to the lexical concepts TO THE LEFT OF, TO THE RIGHT OF, ABOVE and BELOW. There were no differences between the languages in directional adposition use or reference frame use, including reference frame use based on motion. We conclude that despite differences in the grammars of the languages involved, and potential differences in reference frame system use, the three languages investigated encode Figure location in relation to Ground location in a similar way when both are in motion. Levinson, S. C. (1996). Frames of reference and Molyneux’s question: Crosslinguistic evidence. In P. Bloom, M. A. Peterson, L. Nadel & M. F. Garrett (Eds.), Language and Space (pp. 109-170). Cambridge, MA: MIT Press. Nikanne, U. (2003). How Finnish postpositions see the axis system. In E. van der Zee & J. Slack (Eds.), Representing direction in language and space. Oxford, UK: Oxford University Press. Walker, C. (in preparation). Motion encoding in language: the use of spatial locatives in a motion context. Unpublished doctoral dissertation, University of Lincoln, Lincoln, United Kingdom.

    A multimodal perspective on modality in the English language classroom

    This thesis is a study of an engage, study, activate (ESA) lesson for teaching modals of present deduction. The lesson is taken from a published English language teaching course book and is typical of the way modal forms are presented to teach epistemic modality in many commercially produced English language teaching course books. I argue that, for cognitive, social, linguistic and procedural reasons, the linguistic forms and structures presented in the lesson are not straightforwardly transferred to the activate stage of the lesson. Using insights from spoken language corpora, I carry out a comparative analysis with the modal forms presented in the course book. I then explore the notion of ‘context’ and, drawing on systemic functional grammar, discuss how modal forms function in discourse to realise interpersonal relations. Moving my research to the English language classroom, I collect ethnographic classroom data and, using social semiotic multimodality as an analytical framework, explore learner interaction to uncover the communicative resources learners use to express epistemic modality in a discussion activity from the same lesson. My analysis reveals that the modal structures in the course book differ to some extent from those found in spoken language corpora. It shows that the course book offers no instruction on the interpersonal dimension of modality, and thus on how speakers use signals of modality to position themselves interpersonally vis-à-vis their interlocutors. The data collected from the English language class reveal that during the lesson learners communicate modality through modes of communication such as eye gaze, gesture and posture, in addition to spoken language. Again drawing on systemic functional grammar, I explain how these modes have the potential to express interpersonal meaning, and thus highlight that meaning is communicated through modal ensembles. Based on these findings, I propose a number of teaching strategies to raise awareness of the interpersonal function of modality in multimodal discourse, and argue for the use of language corpora to better inform teaching materials on selections of modality.

    Integrating Gestures

    Gestures convey information about culture, discourse, thought, intentionality, emotion, intersubjectivity, cognition, and first and second language acquisition. Additionally, they are used by non-human primates to communicate with their peers and with humans. Consequently, the modern field of gesture studies has attracted researchers from a number of different disciplines, such as anthropology, cognitive science, communication, neuroscience, psycholinguistics, primatology, psychology, robotics, sociology and semiotics. This volume presents an overview of the depth and breadth of current research in gesture, with a focus on its interdisciplinary nature. The chapters are divided into six themes: the nature and functions of gesture, first language development and gesture, second language effects on gesture, gesture in the classroom and in problem solving, gesture aspects of discourse and interaction, and gestural analysis of music and dance.

    Structuring information through gesture and intonation

    Face-to-face communication is multimodal. In unscripted spoken discourse we can observe the interaction of several “semiotic layers”: modalities of information such as syntax, discourse structure, gesture, and intonation. We explore the role of gesture and intonation in structuring and aligning information in spoken discourse through a study of the co-occurrence of pitch accents and gestural apices. Metaphorical spatialization through gesture also plays a role in conveying the contextual relationships between the speaker, the government and other external forces in a naturally occurring political speech setting.
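    As an illustration of the kind of alignment analysis the abstract describes, the sketch below pairs time-stamped gestural apices with pitch-accent peaks using a fixed tolerance window. The timestamps, tier layout, and 275 ms window are assumptions for the example, not the study's actual annotations or code.

    # Illustrative sketch: test whether gestural apices co-occur with pitch
    # accents by matching each apex to the nearest accent within a tolerance
    # window. All timestamps are in seconds.
    def align_apices_to_accents(apices, accents, tol=0.275):
        """Return (apex, accent) pairs whose offset is within +/- tol seconds."""
        pairs = []
        for apex in apices:
            nearest = min(accents, key=lambda a: abs(a - apex), default=None)
            if nearest is not None and abs(nearest - apex) <= tol:
                pairs.append((apex, nearest))
        return pairs

    # Toy annotation tiers, e.g. as exported from ELAN or Praat:
    apices  = [0.42, 1.90, 3.10]        # gesture stroke apices
    accents = [0.45, 2.30, 3.05, 4.00]  # pitch-accent peaks
    pairs = align_apices_to_accents(apices, accents)
    print(f"{len(pairs)}/{len(apices)} apices fall near a pitch accent")  # 2/3

    Comparing the observed co-occurrence rate against a rate obtained from shuffled accent times would then indicate whether the alignment exceeds chance.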

    A Comprehensive Review of Data-Driven Co-Speech Gesture Generation

    Gestures that accompany speech are an essential part of natural and efficient embodied human communication. The automatic generation of such co-speech gestures is a long-standing problem in computer animation and is considered an enabling technology in film, games, virtual social spaces, and for interaction with social robots. The problem is made challenging by the idiosyncratic and non-periodic nature of human co-speech gesture motion, and by the great diversity of communicative functions that gestures encompass. Gesture generation has seen surging interest recently, owing to the emergence of more and larger datasets of human gesture motion, combined with strides in deep-learning-based generative models that benefit from the growing availability of data. This review article summarizes co-speech gesture generation research, with a particular focus on deep generative models. First, we articulate the theory describing human gesticulation and how it complements speech. Next, we briefly discuss rule-based and classical statistical gesture synthesis, before delving into deep learning approaches. We employ the choice of input modalities as an organizing principle, examining systems that generate gestures from audio, text, and non-linguistic input. We also chronicle the evolution of the related training datasets in terms of size, diversity, motion quality, and collection method. Finally, we identify key research challenges in gesture generation, including data availability and quality; producing human-like motion; grounding the gesture in the co-occurring speech, in interaction with other speakers, and in the environment; performing gesture evaluation; and integration of gesture synthesis into applications. We highlight recent approaches to tackling the various key challenges, as well as the limitations of these approaches, and point toward areas of future development. (Accepted for EUROGRAPHICS 2023.)
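    As a deliberately simplified instance of the model family such reviews survey, the sketch below maps a sequence of audio features to per-frame body poses with a recurrent network. The feature and pose dimensions are assumptions, and this is a deterministic regression baseline: the deep generative systems the review covers replace the linear decoder with probabilistic heads such as normalizing flows or diffusion models.

    # Minimal audio-to-gesture skeleton (illustrative, not from the review):
    # a GRU encodes per-frame audio features (e.g. MFCCs) and a linear layer
    # predicts per-frame joint rotations.
    import torch
    import torch.nn as nn

    class AudioToGesture(nn.Module):
        def __init__(self, n_audio_feats=26, n_joint_dims=45, hidden=256):
            super().__init__()
            self.encoder = nn.GRU(n_audio_feats, hidden, batch_first=True)
            self.decoder = nn.Linear(hidden, n_joint_dims)  # per-frame pose

        def forward(self, audio):            # audio: (batch, frames, feats)
            h, _ = self.encoder(audio)       # (batch, frames, hidden)
            return self.decoder(h)           # (batch, frames, joint dims)

    model = AudioToGesture()
    mfcc = torch.randn(2, 100, 26)           # 2 clips, 100 frames of features
    poses = model(mfcc)                      # shape: (2, 100, 45)
    print(poses.shape)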

    Multi-modal task instructions to robots by naive users

    This thesis presents a theoretical framework for the design of user-programmable robots. The objective of the work is to investigate multi-modal, unconstrained, natural instructions given to robots in order to design a learning robot. A corpus-centred approach is used to design an agent that can reason, learn and interact with a human in a natural, unconstrained way. The corpus-centred design approach is formalised and developed in detail. It requires the developer to record a human during interaction and analyse the recordings to find instruction primitives, which are then implemented in a robot. The focus of this work has been on how to combine speech and gesture using rules extracted from the analysis of a corpus. A multi-modal integration algorithm is presented that uses timing and semantics to group, match and unify gesture and language (see the sketch below). The algorithm always achieves correct pairings on the corpus and initiates questions to the user in cases of ambiguity or missing information. The domain of card games has been investigated because of the variety of its games, which are rich in rules and contain sequences. A further focus of the work is on the translation of rule-based instructions; most multi-modal interfaces to date have only considered sequential instructions. A combination of frame-based reasoning, a knowledge base organised as an ontology, and a problem-solver engine is used to store these rules. Understanding rule instructions, which contain conditional and imaginary situations, requires an agent with complex reasoning capabilities. A test system for the agent implementation is also described, along with tests that confirm the implementation by playing back the corpus. Furthermore, deployment test results with the implemented agent and human subjects are presented and discussed. The tests showed that the rate of errors caused by sentences not being covered by the grammar does not decrease at an acceptable rate when new grammar is introduced; this was particularly the case for complex verbal rule instructions, which can be expressed in a wide variety of ways.
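    The integration step might look like the sketch below, which pairs each deictic word with the gesture whose time span overlaps it and whose semantic type matches, falling back to a clarification question when the pairing is ambiguous. The data layout, threshold, and labels are illustrative assumptions, not the thesis implementation.

    # Illustrative timing-plus-semantics fusion (not the thesis code).
    def integrate(words, gestures, max_gap=0.3):
        """words and gestures: lists of (label, t_start, t_end, sem_type)."""
        pairings, questions = [], []
        for token, ws, we, sem in words:
            candidates = [g for g in gestures
                          if g[3] == sem                   # semantic match
                          and g[1] - max_gap <= we         # temporal overlap,
                          and g[2] + max_gap >= ws]        # within a small gap
            if len(candidates) == 1:
                pairings.append((token, candidates[0][0]))
            else:                            # none or several: ask the user
                questions.append(f"Which object did you mean by '{token}'?")
        return pairings, questions

    words    = [("this", 0.2, 0.4, "deictic"), ("there", 1.1, 1.3, "deictic")]
    gestures = [("point@card_3", 0.1, 0.6, "deictic"),
                ("point@pile_A", 1.0, 1.5, "deictic")]
    print(integrate(words, gestures))
    # ([('this', 'point@card_3'), ('there', 'point@pile_A')], [])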