1,434 research outputs found

    Augmenting Situated Spoken Language Interaction with Listener Gaze

    Get PDF
    Collaborative task solving in a shared environment requires referential success. Human speakers follow the listener’s behavior in order to monitor language comprehension (Clark, 1996). Furthermore, a natural language generation (NLG) system can exploit listener gaze to realize an effective interaction strategy by responding to it with verbal feedback in virtual environments (Garoufi, Staudte, Koller, & Crocker, 2016). We augment situated spoken language interaction with listener gaze and investigate its role in human-human and human-machine interactions. Firstly, we evaluate its impact on prediction of reference resolution using a mulitimodal corpus collection from virtual environments. Secondly, we explore if and how a human speaker uses listener gaze in an indoor guidance task, while spontaneously referring to real-world objects in a real environment. Thirdly, we consider an object identification task for assembly under system instruction. We developed a multimodal interactive system and two NLG systems that integrate listener gaze in the generation mechanisms. The NLG system “Feedback” reacts to gaze with verbal feedback, either underspecified or contrastive. The NLG system “Installments” uses gaze to incrementally refer to an object in the form of installments. Our results showed that gaze features improved the accuracy of automatic prediction of reference resolution. Further, we found that human speakers are very good at producing referring expressions, and showing listener gaze did not improve performance, but elicited more negative feedback. In contrast, we showed that an NLG system that exploits listener gaze benefits the listener’s understanding. Specifically, combining a short, ambiguous instruction with con- trastive feedback resulted in faster interactions compared to underspecified feedback, and even outperformed following long, unambiguous instructions. Moreover, alternating the underspecified and contrastive responses in an interleaved manner led to better engagement with the system and an effcient information uptake, and resulted in equally good performance. Somewhat surprisingly, when gaze was incorporated more indirectly in the generation procedure and used to trigger installments, the non-interactive approach that outputs an instruction all at once was more effective. However, if the spatial expression was mentioned first, referring in gaze-driven installments was as efficient as following an exhaustive instruction. In sum, we provide a proof of concept that listener gaze can effectively be used in situated human-machine interaction. An assistance system using gaze cues is more attentive and adapts to listener behavior to ensure communicative success

    Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation

    Get PDF
    This paper surveys the current state of the art in Natural Language Generation (NLG), defined as the task of generating text or speech from non-linguistic input. A survey of NLG is timely in view of the changes that the field has undergone over the past decade or so, especially in relation to new (usually data-driven) methods, as well as new applications of NLG technology. This survey therefore aims to (a) give an up-to-date synthesis of research on the core tasks in NLG and the architectures adopted in which such tasks are organised; (b) highlight a number of relatively recent research topics that have arisen partly as a result of growing synergies between NLG and other areas of artificial intelligence; (c) draw attention to the challenges in NLG evaluation, relating them to similar challenges faced in other areas of Natural Language Processing, with an emphasis on different evaluation methods and the relationships between them.Comment: Published in Journal of AI Research (JAIR), volume 61, pp 75-170. 118 pages, 8 figures, 1 tabl

    The brain is a prediction machine that cares about good and bad - Any implications for neuropragmatics?

    Get PDF
    Experimental pragmatics asks how people construct contextualized meaning in communication. So what does it mean for this field to add neuroas a prefix to its name? After analyzing the options for any subfield of cognitive science, I argue that neuropragmatics can and occasionally should go beyond the instrumental use of EEG or fMRI and beyond mapping classic theoretical distinctions onto Brodmann areas. In particular, if experimental pragmatics ‘goes neuro’, it should take into account that the brain evolved as a control system that helps its bearer negotiate a highly complex, rapidly changing and often not so friendly environment. In this context, the ability to predict current unknowns, and to rapidly tell good from bad, are essential ingredients of processing. Using insights from non-linguistic areas of cognitive neuroscience as well as from EEG research on utterance comprehension, I argue that for a balanced development of experimental pragmatics, these two characteristics of the brain cannot be ignored

    MOG 2007:Workshop on Multimodal Output Generation: CTIT Proceedings

    Get PDF
    This volume brings together presents a wide variety of work offering different perspectives on multimodal generation. Two different strands of work can be distinguished: half of the gathered papers present current work on embodied conversational agents (ECA’s), while the other half presents current work on multimedia applications. Two general research questions are shared by all: what output modalities are most suitable in which situation, and how should different output modalities be combined

    A Framework For Abstracting, Designing And Building Tangible Gesture Interactive Systems

    Get PDF
    This thesis discusses tangible gesture interaction, a novel paradigm for interacting with computer that blends concepts from the more popular fields of tangible interaction and gesture interaction. Taking advantage of the human innate abilities to manipulate physical objects and to communicate through gestures, tangible gesture interaction is particularly interesting for interacting in smart environments, bringing the interaction with computer beyond the screen, back to the real world. Since tangible gesture interaction is a relatively new field of research, this thesis presents a conceptual framework that aims at supporting future work in this field. The Tangible Gesture Interaction Framework provides support on three levels. First, it helps reflecting from a theoretical point of view on the different types of tangible gestures that can be designed, physically, through a taxonomy based on three components (move, hold and touch) and additional attributes, and semantically, through a taxonomy of the semantic constructs that can be used to associate meaning to tangible gestures. Second, it helps conceiving new tangible gesture interactive systems and designing new interactions based on gestures with objects, through dedicated guidelines for tangible gesture definition and common practices for different application domains. Third, it helps building new tangible gesture interactive systems supporting the choice between four different technological approaches (embedded and embodied, wearable, environmental or hybrid) and providing general guidance for the different approaches. As an application of this framework, this thesis presents also seven tangible gesture interactive systems for three different application domains, i.e., interacting with the In-Vehicle Infotainment System (IVIS) of the car, the emotional and interpersonal communication, and the interaction in a smart home. For the first application domain, four different systems that use gestures on the steering wheel as interaction means with the IVIS have been designed, developed and evaluated. For the second application domain, an anthropomorphic lamp able to recognize gestures that humans typically perform for interpersonal communication has been conceived and developed. A second system, based on smart t-shirts, recognizes when two people hug and reward the gesture with an exchange of digital information. Finally, a smart watch for recognizing gestures performed with objects held in the hand in the context of the smart home has been investigated. The analysis of existing systems found in literature and of the system developed during this thesis shows that the framework has a good descriptive and evaluative power. The applications developed during this thesis show that the proposed framework has also a good generative power.Questa tesi discute l’interazione gestuale tangibile, un nuovo paradigma per interagire con il computer che unisce i principi dei più comuni campi di studio dell’interazione tangibile e dell’interazione gestuale. Sfruttando le abilità innate dell’uomo di manipolare oggetti fisici e di comunicare con i gesti, l’interazione gestuale tangibile si rivela particolarmente interessante per interagire negli ambienti intelligenti, riportando l’attenzione sul nostro mondo reale, al di là dello schermo dei computer o degli smartphone. Poiché l’interazione gestuale tangibile è un campo di studio relativamente recente, questa tesi presenta un framework (quadro teorico) che ha lo scopo di assistere lavori futuri in questo campo. Il Framework per l’Interazione Gestuale Tangibile fornisce supporto su tre livelli. Per prima cosa, aiuta a riflettere da un punto di vista teorico sui diversi tipi di gesti tangibili che possono essere eseguiti fisicamente, grazie a una tassonomia basata su tre componenti (muovere, tenere, toccare) e attributi addizionali, e che possono essere concepiti semanticamente, grazie a una tassonomia di tutti i costrutti semantici che permettono di associare dei significati ai gesti tangibili. In secondo luogo, il framework proposto aiuta a concepire nuovi sistemi interattivi basati su gesti tangibili e a ideare nuove interazioni basate su gesti con gli oggetti, attraverso linee guida per la definizione di gesti tangibili e una selezione delle migliore pratiche per i differenti campi di applicazione. Infine, il framework aiuta a implementare nuovi sistemi interattivi basati su gesti tangibili, permettendo di scegliere tra quattro differenti approcci tecnologici (incarnato e integrato negli oggetti, indossabile, distribuito nell’ambiente, o ibrido) e fornendo una guida generale per la scelta tra questi differenti approcci. Come applicazione di questo framework, questa tesi presenta anche sette sistemi interattivi basati su gesti tangibili, realizzati per tre differenti campi di applicazione: l’interazione con i sistemi di infotainment degli autoveicoli, la comunicazione interpersonale delle emozioni, e l’interazione nella casa intelligente. Per il primo campo di applicazione, sono stati progettati, sviluppati e testati quattro differenti sistemi che usano gesti tangibili effettuati sul volante come modalità di interazione con il sistema di infotainment. Per il secondo campo di applicazione, è stata concepita e sviluppata una lampada antropomorfica in grado di riconoscere i gesti tipici dell’interazione interpersonale. Per lo stesso campo di applicazione, un secondo sistema, basato su una maglietta intelligente, riconosce quando due persone si abbracciano e ricompensa questo gesto con uno scambio di informazioni digitali. Infine, per l’interazione nella casa intelligente, è stata investigata la realizzazione di uno smart watch per il riconoscimento di gesti eseguiti con oggetti tenuti nella mano. L’analisi dei sistemi interattivi esistenti basati su gesti tangibili permette di dimostrare che il framework ha un buon potere descrittivo e valutativo. Le applicazioni sviluppate durante la tesi mostrano che il framework proposto ha anche un valido potere generativo

    References to graphical objects in interactive multimodel queries

    Get PDF
    This thesis describes a computational model for interpreting natural language expressions in an interactive multimodal query system integrating both natural language text and graphic displays. The primary concern of the model is to interpret expressions that might involve graphical attributes, and expressions whose referents could be objects on the screen.Graphical objects on the screen are used to visualise entities in the application domain and their attributes (in short, domain entities and domain attributes). This is why graphical objects are treated as descriptions of those domain entities/attributes in the literature. However, graphical objects and their attributes are visible during the interaction, and are thus known by the participants of the interaction. Therefore, they themselves should be part of the mutual knowledge of the interaction.This poses some interesting problems in language processing. As part of the mutual knowledge, graphical attributes could be used in expressions, and graphical objects could be referred to by expressions. In consequence, there could be ambiguities about whether an attribute in an expression belongs to a graphical object or to a domain entity. There could also be ambiguities about whether the referent of an expression is a graphical object or a domain entity.The main contributions of this thesis consist of analysing the above ambiguities, de¬ signing, implementing and testing a computational model and a demonstration system for resolving these ambiguities. Firstly, a structure and corresponding terminology are set up, so these ambiguities can be clarified as ambiguities derived from referring to different databases, the screen or the application domain (source ambiguities). Secondly, a meaning representation language is designed which explicitly represents the information about which database an attribute/entity comes from. Several linguistic regularities inside and among referring expressions are described so that they can be used as heuristics in the ambiguity resolution. Thirdly, a computational model based on constraint satisfaction is constructed to resolve simultaneously some reference ambiguities and source ambiguities. Then, a demonstration system integrating natural language text and graphics is implemented, whose core is the computational model.This thesis ends with an evaluation of the computational model. It provides some concrete evidence about the advantages and disadvantages of the above approach

    Bootstrapping Information from Corpora in a Cross-Linguistic Perspective

    Get PDF
    The achievements of Romance language corpus-driven studies deserve more attention from the scientific community at the world level for both their quantity and quality. This book contains papers given at the 3rd International LABLITA Workshop in Corpus Linguistics (Italian Department, University of Florence, June 4th-5th 2008 ), and it aims at integrating new ideas and results derived from Romance language corpora in the framework of the overall achievements of Corpus Linguistics. The volume contains the contribution of a leading scholar of Corpus Linguistics (Douglas Biber), and a set of articles presented to Biber by notable European researchers and those from other countries. Papers report on long-term studies ranging from Italian to Spanish, French, Brazilian Portuguese and Japanese
    corecore