    Towards Computer Understanding of Human Interactions

    People meet in order to interact: disseminating information, making decisions, and creating new ideas. Automatic analysis of meetings is therefore important from two points of view: extracting the information they contain and understanding human interaction processes. Based on this view, this article presents an approach in which relevant information content of a meeting is identified from a variety of audio and visual sensor inputs and statistical models of interacting people. We present a framework for computer observation and understanding of interacting people, and discuss particular tasks within this framework, issues in the meeting context, and particular algorithms that we have adopted. We also comment on current developments and the future challenges in automatic meeting analysis.
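    As a minimal, hypothetical sketch of what "statistical models of interacting people" can look like in practice (not the system described above), the snippet below labels meeting segments with group actions such as discussion, monologue, or presentation using a small hidden Markov model over fused audio-visual features. The states, features, and probabilities are illustrative assumptions.

```python
import numpy as np

STATES = ["discussion", "monologue", "presentation"]

# Hypothetical transition matrix: group actions tend to persist over time.
A = np.array([[0.90, 0.05, 0.05],
              [0.10, 0.85, 0.05],
              [0.05, 0.05, 0.90]])
pi = np.array([1 / 3, 1 / 3, 1 / 3])  # uniform initial distribution

def emission_prob(obs):
    """Toy emission likelihoods for a fused observation
    (number of active speakers, mean visual motion); purely illustrative."""
    n_speakers, motion = obs
    return np.array([
        0.8 if n_speakers >= 2 else 0.2,                    # discussion
        0.8 if n_speakers == 1 and motion < 0.5 else 0.2,   # monologue
        0.8 if n_speakers == 1 and motion >= 0.5 else 0.2,  # presentation
    ])

def viterbi(observations):
    """Most likely sequence of group actions for a meeting."""
    delta = np.log(pi) + np.log(emission_prob(observations[0]))
    backpointers = []
    for obs in observations[1:]:
        scores = delta[:, None] + np.log(A)   # scores[i, j]: from state i to j
        backpointers.append(scores.argmax(axis=0))
        delta = scores.max(axis=0) + np.log(emission_prob(obs))
    path = [int(delta.argmax())]
    for ptr in reversed(backpointers):
        path.append(int(ptr[path[-1]]))
    return [STATES[s] for s in reversed(path)]

# A short meeting summarised as (active speakers, visual motion) per segment.
print(viterbi([(2, 0.3), (2, 0.4), (1, 0.2), (1, 0.8)]))
```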

    Towards Simulating Humans in Augmented Multi-party Interaction

    Human-computer interaction requires modeling of the user. A user profile typically contains preferences, interests, characteristics, and interaction behavior. However, in multimodal interaction with a smart environment the user also displays characteristics that show how he or she, not necessarily consciously, provides the environment with useful verbal and nonverbal input and feedback. Especially in ambient intelligence environments we encounter situations where the environment supports interaction between the environment, smart objects (e.g., mobile robots, smart furniture), and human participants. It is therefore useful for the profile to contain a physical representation of the user obtained by multimodal capturing techniques. We discuss the modeling and simulation of interacting participants in the European AMI research project.
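    As a minimal sketch (not the AMI project's actual data model), a user profile of the kind described above could pair classical preference data with a physical representation captured multimodally; every field name below is an assumption made for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class PhysicalRepresentation:
    """Appearance and behaviour captured by cameras and microphones."""
    body_model: str = "skeleton"                 # e.g. joint positions over time
    gaze_targets: list[str] = field(default_factory=list)
    gesture_labels: list[str] = field(default_factory=list)
    voice_profile: str = ""                      # speaker-identification model id

@dataclass
class UserProfile:
    """Classical profile data plus a multimodally captured physical model."""
    name: str
    preferences: dict[str, str] = field(default_factory=dict)
    interests: list[str] = field(default_factory=list)
    interaction_behavior: list[str] = field(default_factory=list)
    physical: PhysicalRepresentation = field(default_factory=PhysicalRepresentation)

# Example: a participant whose nonverbal behaviour is logged during a meeting.
profile = UserProfile(
    name="participant-1",
    preferences={"display": "shared whiteboard"},
    interests=["project planning"],
    interaction_behavior=["frequent interrupter"],
)
profile.physical.gesture_labels.append("pointing-at-screen")
print(profile)
```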

    Mixed reality participants in smart meeting rooms and smart home environments

    Human–computer interaction requires modeling of the user. A user profile typically contains preferences, interests, characteristics, and interaction behavior. However, in multimodal interaction with a smart environment the user also displays characteristics that show how he or she, not necessarily consciously, provides the environment with useful verbal and nonverbal input and feedback. Especially in ambient intelligence environments we encounter situations where the environment supports interaction between the environment, smart objects (e.g., mobile robots, smart furniture), and human participants. It is therefore useful for the profile to contain a physical representation of the user obtained by multimodal capturing techniques. We discuss the modeling and simulation of interacting participants in a virtual meeting room, we discuss how remote meeting participants can take part in meeting activities, and we offer some observations on translating research results to smart home environments.

    Human and Virtual Agents Interacting in the Virtuality Continuum


    Content-based access to spoken audio

    The amount of archived audio material in digital form is increasing rapidly, as advantage is taken of the growth in available storage and processing power. Computational resources are becoming less of a bottleneck to digitally record and archive vast amounts of spoken material, both television and radio broadcasts and individual conversations. However, listening to this ever-growing amount of spoken audio sequentially is too slow, and the bottleneck will become the development of effective ways to access content in these voluminous archives. The provision of accurate and efficient computer-mediated content access is a challenging task, because spoken audio combines information from multiple levels (phonetic, acoustic, syntactic, semantic, and discourse). Most systems that assist humans in accessing spoken audio content have approached the problem by performing automatic speech recognition, followed by text-based information access. These systems have addressed diverse tasks including indexing and retrieving voicemail messages, searching broadcast news, and extracting information from recordings of meetings and lectures. Spoken audio content is far richer than what a simple textual transcription can capture, as it carries additional cues that disclose the intended meaning and the speaker’s emotional state. However, the text transcription alone still provides a great deal of useful information in applications. This article describes approaches to content-based access to spoken audio with a qualitative and tutorial emphasis. We describe how the analysis, retrieval, and delivery phases contribute to making spoken audio content more accessible, and we outline a number of outstanding research issues. We also discuss the main application domains and try to identify important issues for future developments. The structure of the article is based on the general system architecture for content-based access depicted in Figure 1. Although the tasks within each processing stage may appear unconnected, the interdependencies and the sequence in which they take place vary.
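    A minimal sketch of the "automatic speech recognition followed by text-based information access" pattern mentioned above, not any particular system: transcripts are assumed to arrive from an external recogniser as plain text, and retrieval is a toy inverted index keyed by recording identifiers.

```python
from collections import defaultdict

def build_index(transcripts: dict[str, str]) -> dict[str, set[str]]:
    """Map each word to the set of recordings whose transcript contains it."""
    index: dict[str, set[str]] = defaultdict(set)
    for rec_id, text in transcripts.items():
        for word in text.lower().split():
            index[word].add(rec_id)
    return index

def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return recordings whose transcripts contain every query word."""
    words = query.lower().split()
    if not words:
        return set()
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return hits

# Hypothetical ASR output for a voicemail, a broadcast, and a lecture.
transcripts = {
    "voicemail-07": "please call me back about the budget meeting",
    "news-2004-03-12": "the broadcast covered the election results",
    "lecture-14": "today we discuss speech recognition and retrieval",
}
index = build_index(transcripts)
print(search(index, "budget meeting"))   # expected: {'voicemail-07'}
```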

    Activity Report 2004


    Human-human multi-threaded spoken dialogs in the presence of driving

    The problem addressed in this research is that engineers looking for interface designs do not have enough data about the interaction between multi-threaded dialogs and manual-visual tasks. Our goal was to investigate this interaction. We proposed to analyze how humans handle multi-threaded dialogs while engaged in a manual-visual task. More specifically, we looked at the interaction between performance on two spoken tasks and driving. The novelty of this dissertation is in its focus on the intersection between a manual-visual task and a multi-threaded speech communication between two humans. We proposed an experiment setup suitable for investigating multi-threaded spoken dialogs while subjects are involved in a manual-visual task. In our experiments, one participant drove a simulated vehicle while talking with another participant located in a different room. The participants communicated using headphones and microphones. Both participants performed an ongoing task, which was interrupted by an interrupting task. Both tasks, the ongoing task and the interrupting task, were done using speech. We collected corpora of annotated data from our experiments and analyzed the data to verify the suitability of the proposed experiment setup. We found that, as expected, driving and our spoken tasks influenced each other. We also found that the timing of interruption influenced the spoken tasks. Unexpectedly, the data indicate that the ongoing task was more influenced by driving than the interrupting task. On the other hand, the interrupting task influenced driving more than the ongoing task. This suggests that the multiple resource model [1] does not capture the complexity of the interactions between the manual-visual and spoken tasks. We propose that perceived urgency or perceived task difficulty plays a role in how the tasks influence each other.