1,053 research outputs found

    Do (and say) as I say: Linguistic adaptation in human-computer dialogs

    Get PDF
    © Theodora Koulouri, Stanislao Lauria, and Robert D. Macredie. This article has been made available through the Brunel Open Access Publishing Fund.
    There is strong research evidence showing that people naturally align to each other’s vocabulary, sentence structure, and acoustic features in dialog, yet little is known about how the alignment mechanism operates in the interaction between users and computer systems, let alone how it may be exploited to improve the efficiency of the interaction. This article provides an account of lexical alignment in human–computer dialogs, based on empirical data collected in a simulated human–computer interaction scenario. The results indicate that alignment is present, resulting in the gradual reduction and stabilization of the vocabulary-in-use, and that it is also reciprocal. Further, the results suggest that when system and user errors occur, the development of alignment is temporarily disrupted and users tend to introduce novel words to the dialog. The results also indicate that alignment in human–computer interaction may have a strong strategic component and is used as a resource to compensate for less optimal (visually impoverished) interaction conditions. Moreover, lower alignment is associated with less successful interaction, as measured by user perceptions. The article distills the results of the study into design recommendations for human–computer dialog systems and uses them to outline a model of dialog management that supports and exploits alignment through mechanisms for in-use adaptation of the system’s grammar and lexicon.
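
    A turn-by-turn overlap measure makes the reported vocabulary convergence and reciprocity concrete. The Python sketch below is illustrative only and is not the article’s metric; the speaker labels, stopword list, and toy dialog are assumptions introduced here.

    # Minimal sketch of a lexical-alignment measure: the proportion of content
    # words in each turn that were already used by the *other* interlocutor.
    # Illustrative metric, not the one used in the article.
    from typing import List, Tuple

    STOPWORDS = {"the", "a", "an", "to", "of", "and", "is", "it", "please"}

    def content_words(utterance: str) -> set:
        """Lowercase, tokenize on whitespace, and drop a small stopword list."""
        return {w.strip(".,?!").lower() for w in utterance.split()} - STOPWORDS

    def turnwise_alignment(dialog: List[Tuple[str, str]]) -> List[float]:
        """For each turn, the fraction of its content words previously used
        by the other speaker. dialog is a list of (speaker, utterance) pairs."""
        seen = {"user": set(), "system": set()}
        scores = []
        for speaker, utterance in dialog:
            words = content_words(utterance)
            other = "system" if speaker == "user" else "user"
            scores.append(len(words & seen[other]) / len(words) if words else 0.0)
            seen[speaker] |= words
        return scores

    if __name__ == "__main__":
        toy_dialog = [
            ("user", "Go to the kitchen and grab the mug"),
            ("system", "Moving to the kitchen to grab the mug"),
            ("user", "Now bring the mug to the living room"),
        ]
        print(turnwise_alignment(toy_dialog))  # rises as vocabularies converge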

    Predicting Causes of Reformulation in Intelligent Assistants

    Full text link
    Intelligent assistants (IAs) such as Siri and Cortana conversationally interact with users and execute a wide range of actions (e.g., searching the Web, setting alarms, and chatting). IAs support these actions by combining components such as automatic speech recognition, natural language understanding, and language generation. However, the complexity of these components makes it difficult for developers to determine which component caused an error. To address this, we focus on reformulation, a useful signal of user dissatisfaction, and propose a method to predict the causes of reformulation. We evaluate the method using the user logs of a commercial IA. The experimental results demonstrate that features designed to detect the error of a specific component improve the performance of reformulation cause detection.
    Comment: 11 pages, 2 figures, accepted as a long paper for SIGDIAL 201
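
    As a rough illustration of the cause-prediction setup described above, component-specific features can feed a standard multiclass classifier. The feature names, values, and cause labels below are hypothetical placeholders, not the paper’s feature set or data.

    # Rough sketch of reformulation-cause prediction as multiclass classification
    # over component-specific features. All names and values are hypothetical.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Each row describes one interaction that was followed by a reformulation:
    # [asr_confidence, nlu_intent_score, edit_distance_to_reformulation, latency_s]
    X = np.array([
        [0.35, 0.90, 0.80, 1.2],   # low ASR confidence, large rewrite
        [0.95, 0.30, 0.40, 1.0],   # confident ASR, weak intent match
        [0.92, 0.88, 0.10, 4.5],   # both components fine, but slow response
    ])
    y = ["asr_error", "nlu_error", "no_error"]  # hypothetical cause labels

    clf = LogisticRegression(max_iter=1000).fit(X, y)
    print(clf.predict([[0.40, 0.85, 0.75, 1.1]]))  # predicted cause for a new log entry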

    Sympathy Begins with a Smile, Intelligence Begins with a Word: Use of Multimodal Features in Spoken Human-Robot Interaction

    Full text link
    Recognition of social signals, from human facial expressions or prosody of speech, is a popular research topic in human-robot interaction studies. There is also a long line of research in the spoken dialogue community that investigates user satisfaction in relation to dialogue characteristics. However, very little research relates a combination of multimodal social signals and language features detected during spoken face-to-face human-robot interaction to the resulting user perception of a robot. In this paper, we show how different emotional facial expressions of human users, in combination with prosodic characteristics of human speech and features of human-robot dialogue, correlate with users’ impressions of the robot after a conversation. We find that happiness in the user’s recognised facial expression strongly correlates with likeability of a robot, while dialogue-related features (such as number of human turns or number of sentences per robot utterance) correlate with perceiving a robot as intelligent. In addition, we show that facial expression, emotional features, and prosody are better predictors of human ratings related to perceived robot likeability and anthropomorphism, while linguistic and non-linguistic features more often predict perceived robot intelligence and interpretability. As such, these characteristics may in future be used as an online reward signal for in-situ Reinforcement Learning based adaptive human-robot dialogue systems.
    Comment: Robo-NLP workshop at ACL 2017. 9 pages, 5 figures, 6 tables
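
    A minimal sketch of the kind of feature-to-rating correlation analysis described above, assuming per-conversation feature aggregates and post-interaction ratings; all names and values are invented for illustration.

    # Pearson correlation between per-conversation features and user ratings.
    # Feature names and all values are invented placeholders.
    from scipy.stats import pearsonr

    features = {
        "happiness_in_face": [0.6, 0.2, 0.8, 0.4, 0.7],   # mean expression score
        "human_turns":       [12,  20,   9,  15,  11],    # count per conversation
    }
    likeability = [4.5, 2.8, 4.9, 3.5, 4.2]  # post-conversation rating, 1-5 scale

    for name, values in features.items():
        r, p = pearsonr(values, likeability)
        print(f"{name}: r = {r:+.2f}, p = {p:.3f}")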

    Speakers Raise their Hands and Head during Self-Repairs in Dyadic Conversations

    Get PDF
    People often encounter difficulties in building shared understanding during everyday conversation. The most common symptoms of these difficulties are self-repairs, when a speaker restarts, edits, or amends their utterance mid-turn. Previous work has focused on the verbal signals of self-repair, i.e., speech disfluencies (filled pauses, truncated words and phrases, word substitutions or reformulations), and computational tools now exist that can automatically detect these verbal phenomena. However, face-to-face conversation also exploits rich non-verbal resources, and previous research suggests that self-repairs are associated with distinct hand movement patterns. This paper extends those results by exploring head and hand movements of both speakers and listeners using two motion parameters: height (vertical position) and 3D velocity. The results show that speech sequences containing self-repairs are distinguishable from fluent ones: speakers raise their hands and head more (and move more rapidly) during self-repairs. We obtain these results by analysing data from a corpus of 13 unscripted dialogues, and we discuss how these findings could support the creation of improved cognitive artificial systems for natural human-machine and human-robot interaction.
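
    The two motion parameters are straightforward to derive from tracked 3D joint positions sampled at a fixed frame rate. The sketch below is an illustration under that assumption, not the corpus’s actual processing pipeline.

    # Deriving vertical position and 3D speed from fixed-rate joint positions.
    # Illustrative only; coordinate convention (y = vertical) is an assumption.
    import numpy as np

    def motion_parameters(positions: np.ndarray, fps: float = 30.0):
        """positions: (n_frames, 3) array of x, y, z coordinates for one joint.
        Returns per-frame height and frame-to-frame 3D speed in units/second."""
        height = positions[:, 1]
        speed = np.linalg.norm(np.diff(positions, axis=0), axis=1) * fps
        return height, speed

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        hand = np.cumsum(rng.normal(0, 0.005, size=(90, 3)), axis=0)  # ~3 s of drift
        h, v = motion_parameters(hand)
        print(f"mean height: {h.mean():.3f} m, mean speed: {v.mean():.3f} m/s")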

    Computational Models of Miscommunication Phenomena

    Get PDF
    Miscommunication phenomena such as repair in dialogue are important indicators of the quality of communication. Automatic detection is therefore a key step toward tools that can characterize communication quality and thus help in applications from call center management to mental health monitoring. However, most existing computational linguistic approaches to these phenomena are unsuitable for general use in this way, and particularly for analyzing human–human dialogue: Although models of other-repair are common in human-computer dialogue systems, they tend to focus on specific phenomena (e.g., repair initiation by systems), missing the range of repair and repair initiation forms used by humans; and while self-repair models for speech recognition and understanding are advanced, they tend to focus on removing “disfluent” material that is in fact important for full understanding of the discourse contribution, and/or rely on domain-specific knowledge. We explain the requirements for more satisfactory models, including incrementality of processing and robustness to sparsity. We then describe models for self- and other-repair detection that meet these requirements (for the former, an adaptation of an existing repair model; for the latter, an adaptation of standard techniques) and investigate how they perform on datasets from a range of dialogue genres and domains, with promising results.
    Funding: EPSRC, grant number EP/10383/1; Future and Emerging Technologies (FET), grant number 611733; German Research Foundation (DFG), grant number SCHL 845/5-1; Swedish Research Council (VR), grant numbers 2016-0116, 2014-3
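
    For intuition about what an incremental self-repair detector consumes and emits, here is a toy word-by-word sketch that flags immediate repeats and a few edit terms; it is far simpler than the models described and evaluated in the article.

    # Toy incremental detector that flags candidate self-repairs word by word:
    # immediate repeats and a small set of edit terms. Illustration only.
    EDIT_TERMS = {"uh", "um", "sorry", "no", "rather"}

    def incremental_repair_flags(words):
        """Yield (index, word, is_repair_candidate) as each word arrives."""
        prev = None
        for i, word in enumerate(words):
            w = word.lower().strip(",.")
            yield i, word, (w == prev) or (w in EDIT_TERMS)
            prev = w

    if __name__ == "__main__":
        utterance = "take the the red uh the blue box".split()
        for i, word, flagged in incremental_repair_flags(utterance):
            print(i, word, "<- candidate repair" if flagged else "")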

    Towards Objective Evaluation of Socially-Situated Conversational Robots: Assessing Human-Likeness through Multimodal User Behaviors

    Full text link
    This paper tackles the challenging task of evaluating socially situated conversational robots and presents a novel objective evaluation approach that relies on multimodal user behaviors. In this study, our main focus is on assessing the human-likeness of the robot as the primary evaluation metric. While previous research often relied on subjective evaluations from users, our approach aims to evaluate the robot's human-likeness indirectly, based on observable user behaviors, thus enhancing objectivity and reproducibility. To begin, we created an annotated dataset of human-likeness scores, utilizing user behaviors found in an attentive listening dialogue corpus. We then conducted an analysis to determine the correlation between multimodal user behaviors and human-likeness scores, demonstrating the feasibility of our proposed behavior-based evaluation method.
    Comment: Accepted at the 25th ACM International Conference on Multimodal Interaction (ICMI '23), Late-Breaking Results
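
    One way to operationalize behavior-based evaluation is to regress annotated human-likeness scores on aggregated user behaviors. The sketch below assumes invented behavior features, counts, and scores; it is not the study’s data or model.

    # Predicting annotated human-likeness scores from aggregated user behaviors.
    # Features and all values are invented placeholders.
    import numpy as np
    from sklearn.linear_model import Ridge

    # Rows: one listening session; columns: backchannels/min, nods/min, laughs/min
    X = np.array([[6.0, 3.5, 0.8],
                  [2.1, 1.0, 0.1],
                  [5.2, 4.0, 0.5],
                  [3.3, 2.2, 0.2]])
    y = np.array([4.6, 2.3, 4.1, 3.0])  # annotated human-likeness scores (1-5)

    model = Ridge(alpha=1.0).fit(X, y)
    print(model.predict([[4.5, 3.0, 0.4]]))  # estimated human-likeness for a new session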
