    Prosodic modules for speech recognition and understanding in VERBMOBIL

    Within VERBMOBIL, a large project on spoken language research in Germany, two modules for detecting and recognizing prosodic events have been developed. One module operates on speech signal parameters and the word hypothesis graph, whereas the other module, designed for a novel, highly interactive architecture, only uses speech signal parameters as its input. Phrase boundaries, sentence modality, and accents are detected. In spontaneous dialogs, recognition rates reach up to 82.5% for accents and up to 91.7% for phrase boundaries.

    Fully generated scripted dialogue for embodied agents

    This paper presents the NECA approach to the generation of dialogues between Embodied Conversational Agents (ECAs). This approach consists of the automated construction of an abstract script for an entire dialogue (cast in terms of dialogue acts), which is incrementally enhanced by a series of modules and finally "performed" by means of text, speech and body language by a cast of ECAs. The approach makes it possible to automatically produce a large variety of highly expressive dialogues, some of whose essential properties are under the control of a user. The paper discusses the advantages and disadvantages of NECA's approach to Fully Generated Scripted Dialogue (FGSD), and explains the main techniques used in the two demonstrators that were built. The paper can be read as a survey of issues and techniques in the construction of ECAs, focusing on the generation of behaviour (i.e., focusing on information presentation) rather than on interpretation.
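    The pipeline the abstract describes — an abstract script of dialogue acts, incrementally enriched by a chain of modules and finally rendered — can be sketched as follows. This is a minimal illustration of the architecture, not the actual NECA API; all names and annotation layers here are hypothetical.

    ```python
    # Hypothetical sketch of a scripted-dialogue pipeline in the NECA style:
    # an abstract script (a list of dialogue acts) passes through a chain of
    # enhancement modules, each adding a layer of annotation, before a final
    # "performance" step renders the enriched acts.
    from dataclasses import dataclass, field


    @dataclass
    class DialogueAct:
        speaker: str
        act_type: str                 # e.g. "greet", "inform"
        content: str
        features: dict = field(default_factory=dict)


    def add_emotion(script):
        # one module might annotate each act with an affective state
        for act in script:
            act.features["emotion"] = "friendly" if act.act_type == "greet" else "neutral"
        return script


    def add_gesture(script):
        # a later module could attach body-language cues based on earlier layers
        for act in script:
            if act.features.get("emotion") == "friendly":
                act.features["gesture"] = "smile"
        return script


    def perform(script):
        # final step: render the enriched script as surface strings
        return [f"{a.speaker} {a.features}: {a.content}" for a in script]


    MODULES = [add_emotion, add_gesture]


    def run_pipeline(script):
        for module in MODULES:
            script = module(script)
        return perform(script)
    ```

    Keeping each module as a pure script-to-script transformation is what lets the dialogue be enhanced incrementally: new layers (prosody, timing, camera direction) can be appended to the chain without changing existing modules.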

    Continuous Interaction with a Virtual Human

    Attentive Speaking and Active Listening require that a Virtual Human be capable of simultaneous perception/interpretation and production of communicative behavior. A Virtual Human should be able to signal its attitude and attention while it is listening to its interaction partner, and be able to attend to its interaction partner while it is speaking – and modify its communicative behavior on the fly based on what it perceives from its partner. This report presents the results of a four-week summer project that was part of eNTERFACE’10. The project resulted in progress on several aspects of continuous interaction, such as scheduling and interrupting multimodal behavior, automatic classification of listener responses, generation of response-eliciting behavior, and models for appropriate reactions to listener responses. A pilot user study was conducted with ten participants. In addition, the project yielded a number of deliverables that are released for public access.

    Incremental Syllable-Context Phonetic Vocoding

    Better Driving and Recall When In-car Information Presentation Uses Situationally-Aware Incremental Speech Output Generation

    Kennington C, Kousidis S, Baumann T, Buschmeier H, Kopp S, Schlangen D. Better Driving and Recall When In-car Information Presentation Uses Situationally-Aware Incremental Speech Output Generation. In: AutomotiveUI 2014: Proceedings of the 6th International Conference on Automotive User Interfaces and Interactive Vehicular Applications. Seattle, Washington, USA; 2014: 7:1-7:7.

    It is established that driver distraction results from sharing cognitive resources between the primary task (driving) and any secondary task. When holding a conversation, a human passenger who is aware of the driving conditions can choose to interrupt his speech in situations potentially requiring more attention from the driver, but in-car information systems typically do not exhibit such sensitivity. We have designed and tested such a system in a driving simulation environment. Unlike other systems, our system delivers information via speech (calendar entries with scheduled meetings) but is able to react to signals from the environment, interrupting when the driver needs to be fully attentive to the driving task and subsequently resuming its delivery. Distraction is measured by a secondary short-term memory task. In both tasks, drivers perform significantly worse when the system does not adapt its speech, while they perform as well as in control conditions (no concurrent task) when the system intelligently interrupts and resumes.
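    The interrupt-and-resume behavior described above can be sketched as a small state machine: the message is delivered in phrase-sized installments, a driving-situation signal is polled before each installment, and the resume point survives any interruption. This is an illustrative sketch under assumed names, not the authors' implementation.

    ```python
    # Illustrative sketch of interruptible incremental speech delivery:
    # output is produced chunk by chunk; before each chunk the system checks
    # a driving-load signal, withholds output while the driver is busy, and
    # later resumes from exactly where it stopped.
    class IncrementalDelivery:
        def __init__(self, chunks):
            self.chunks = list(chunks)   # phrase-sized installments of the message
            self.position = 0            # resume point; survives interruptions

        def step(self, driver_needs_attention):
            """Deliver the next chunk, or stay silent if the driver is busy."""
            if driver_needs_attention or self.position >= len(self.chunks):
                return None              # interrupted (or finished): say nothing
            chunk = self.chunks[self.position]
            self.position += 1
            return chunk


    delivery = IncrementalDelivery(["Meeting with Anna", "at 3 pm", "in room 2."])
    # simulated driving-load signal per time step: busy during the second step
    signals = [False, True, False, False]
    uttered = [delivery.step(busy) for busy in signals]
    # uttered is ["Meeting with Anna", None, "at 3 pm", "in room 2."]
    ```

    Keeping the resume position separate from the load signal is the key design point: the environment decides *when* speech is allowed, while the delivery object alone tracks *where* in the message it is.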