4 research outputs found

    Better Driving and Recall When In-car Information Presentation Uses Situationally-Aware Incremental Speech Output Generation

    Kennington C, Kousidis S, Baumann T, Buschmeier H, Kopp S, Schlangen D. Better Driving and Recall When In-car Information Presentation Uses Situationally-Aware Incremental Speech Output Generation. In: AutomotiveUI 2014: Proceedings of the 6th International Conference on Automotive User Interfaces and Interactive Vehicular Applications. Seattle, Washington, USA; 2014: 7:1-7:7.
    It is established that driver distraction results from sharing cognitive resources between the primary task (driving) and any secondary task. In a conversation, a human passenger who is aware of the driving conditions can choose to interrupt his speech in situations that potentially require more attention from the driver, but in-car information systems typically do not exhibit such sensitivity. We have designed and tested such a system in a driving simulation environment. Unlike other systems, our system delivers information via speech (calendar entries with scheduled meetings) but is able to react to signals from the environment, interrupting when the driver needs to be fully attentive to the driving task and subsequently resuming its delivery. Distraction is measured by a secondary short-term memory task. In both tasks, drivers perform significantly worse when the system does not adapt its speech, while they perform as well as in control conditions (no concurrent task) when the system intelligently interrupts and resumes.

    An architecture for fluid real-time conversational agents: Integrating incremental output generation and input processing

    Kopp S, van Welbergen H, Yaghoubzadeh R, Buschmeier H. An architecture for fluid real-time conversational agents: Integrating incremental output generation and input processing. Journal on Multimodal User Interfaces. 2014;8:97-108.
    Embodied conversational agents still do not achieve the fluidity and smoothness of natural conversational interaction. One main reason is that current systems often respond with high latency and in inflexible ways. We argue that to overcome these problems, real-time conversational agents need to be based on an underlying architecture that provides two essential features for fast and fluent behavior adaptation: a close bi-directional coordination between input processing and output generation, and incrementality of processing at both stages. We propose an architectural framework for conversational agents, the Artificial Social Agent Platform (ASAP), that provides these two ingredients for fluid real-time conversation. The overall architectural concept is described, along with specific means of specifying incremental behavior in BML and technical implementations of different modules. We show how phenomena of fluid real-time conversation, like adapting to user feedback or smooth turn-keeping, can be realized with ASAP, and we describe in detail an example real-time interaction with the implemented system.

    Combining Incremental Language Generation and Incremental Speech Synthesis for Adaptive Information Presentation

    Buschmeier H, Baumann T, Dosch B, Kopp S, Schlangen D. Combining Incremental Language Generation and Incremental Speech Synthesis for Adaptive Information Presentation. In: Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue. Seoul, South Korea; 2012: 295-303.
    Participants in a conversation are normally receptive to their surroundings and their interlocutors, even while they are speaking, and can, if necessary, adapt their ongoing utterance. Typical dialogue systems are not receptive and cannot adapt while uttering. We present combinable components for incremental natural language generation and incremental speech synthesis and demonstrate the flexibility they can achieve with an example system that adapts to a listener's acoustic understanding problems by pausing, repeating and possibly rephrasing problematic parts of an utterance. In an evaluation, this system was rated as significantly more natural than two systems representing the current state of the art that either ignore the interrupting event or just pause; it also has a lower response time. Video of talk available here: http://www.superlectures.com/sigdial2012/lecture.php?lang=en&id=1