400 research outputs found

    Natural interaction with a virtual guide in a virtual environment: A multimodal dialogue system

    This paper describes the Virtual Guide, a multimodal dialogue system represented by an embodied conversational agent that helps users find their way in a virtual environment while adapting its affective linguistic style to that of the user. We discuss the modular architecture of the system and describe the entire loop from multimodal input analysis to multimodal output generation. We also describe how the Virtual Guide detects the politeness level of the user's utterances in real time during the dialogue and aligns its own language to that of the user, using different politeness strategies. Finally, we report on our first user tests and discuss some potential extensions to improve the system.

    What and where : an empirical investigation of pointing gestures and descriptions in multimodal referring actions

    Pointing gestures are pervasive in human referring actions, and are often combined with spoken descriptions. Combining gesture and speech naturally to refer to objects is an essential task in multimodal NLG systems. However, the way gesture and speech should be combined in a referring act remains an open question. In particular, it is not clear whether, in planning a pointing gesture in conjunction with a description, an NLG system should seek to minimise the redundancy between them, e.g. by letting the pointing gesture indicate locative information, with other, non-locative properties of a referent included in the description. This question has a bearing on whether the gestural and spoken parts of referring acts are planned separately or arise from a common underlying computational mechanism. This paper investigates this question empirically, using machine-learning techniques on a new corpus of dialogues involving multimodal references to objects. Our results indicate that human pointing strategies interact with descriptive strategies. In particular, pointing gestures are strongly associated with the use of locative features in referring expressions.

    Developing Intelligent MultiMedia applications


    Gestures in Machine Interaction

    Unencumbered gesture interaction (VGI) describes the use of unrestricted gestures in machine interaction. The development of such technology will enable users to interact with machines and virtual environments by performing actions like grasping, pinching or waving, without the need for peripherals. Advances in image processing and pattern recognition make such interaction viable and, in some applications, more practical than current keyboard, mouse and touch-screen interaction. VGI is emerging as a popular topic amongst human-computer interaction (HCI), computer vision and gesture research, and is developing into a topic with the potential to significantly impact the future of computer interaction, robot control and gaming. This thesis investigates whether an ergonomic model of VGI can be developed and implemented on consumer devices, by considering some of the barriers currently preventing such a model of VGI from being widely adopted. This research aims to address the development of freehand gesture interfaces and accompanying syntax. Without detailed consideration of the evolution of this field, the development of unergonomic, inefficient interfaces that place undue strain on users becomes more likely. In the course of this thesis some novel design and methodological assertions are made. The Gesture in Machine Interaction (GiMI) syntax model and the Gesture-Face Layer (GFL), developed in the course of this research, have been designed to facilitate ergonomic gesture interaction. The GiMI is an interface syntax model designed to enable cursor control, browser navigation commands and steering control for remote robots or vehicles. By applying state-of-the-art image processing that facilitates three-dimensional (3D) recognition of human action, this research investigates how interface syntax can incorporate the broadest range of human actions. By advancing our understanding of ergonomic gesture syntax, this research aims to help future developers evaluate the efficiency of gesture interfaces, lexicons and syntax.

    Auto clustering for unsupervised learning of atomic gesture components using minimum description length

    We present an approach to automatically segment and label a continuous observation sequence of hand gestures for fully unsupervised model acquisition. The method is based on the assumption that gestures can be viewed as repetitive sequences of atomic components, similar to phonemes in speech, governed by a high-level structure controlling the temporal sequence. We show that the generating process for the atomic components can be described in gesture space by a mixture of Gaussians, with each mixture component tied to one atomic behaviour. Mixture components are determined using a standard EM approach, while the number of components is determined using an information criterion, the Minimum Description Length (MDL).
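    The EM-plus-MDL recipe above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it fits 1-D Gaussian mixtures of increasing size with EM and keeps the size that minimises an MDL score (negative log-likelihood plus a parameter-count penalty); the quantile initialisation and the parameter count are assumptions made for brevity.

    ```python
    import numpy as np

    def em_gmm_1d(x, k, iters=100, eps=1e-6):
        """Fit a 1-D Gaussian mixture with k components via EM.
        Deterministic initialisation: means placed at data quantiles."""
        n = len(x)
        mu = np.quantile(x, (np.arange(k) + 0.5) / k)
        var = np.full(k, np.var(x) + eps)
        w = np.full(k, 1.0 / k)
        for _ in range(iters):
            # E-step: per-point log density under each component
            logp = (np.log(w) - 0.5 * np.log(2 * np.pi * var)
                    - (x[:, None] - mu) ** 2 / (2 * var))
            r = np.exp(logp - logp.max(axis=1, keepdims=True))
            r /= r.sum(axis=1, keepdims=True)          # responsibilities
            # M-step: re-estimate weights, means, variances
            nk = r.sum(axis=0) + eps
            w, mu = nk / n, (r * x[:, None]).sum(axis=0) / nk
            var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk + eps
        return w, mu, var

    def log_likelihood(x, w, mu, var):
        """Total log-likelihood of the data under the fitted mixture."""
        logp = (np.log(w) - 0.5 * np.log(2 * np.pi * var)
                - (x[:, None] - mu) ** 2 / (2 * var))
        m = logp.max(axis=1, keepdims=True)            # log-sum-exp trick
        return float((m.ravel() + np.log(np.exp(logp - m).sum(axis=1))).sum())

    def mdl_select(x, k_max=5):
        """Pick the k minimising -log L + (n_params / 2) * log n."""
        n = len(x)
        best_k, best_score = 1, np.inf
        for k in range(1, k_max + 1):
            ll = log_likelihood(x, *em_gmm_1d(x, k))
            n_params = 3 * k - 1     # (k-1) weights + k means + k variances
            score = -ll + 0.5 * n_params * np.log(n)
            if score < best_score:
                best_k, best_score = k, score
        return best_k
    ```

    On data drawn from two well-separated Gaussians, the penalty term stops EM's ever-increasing likelihood from favouring spurious extra components, so `mdl_select` settles on two.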

    MAGIC: Manipulating Avatars and Gestures to Improve Remote Collaboration

    Remote collaborative work has become pervasive in many settings, from engineering to medical professions. Users are immersed in virtual environments and communicate through life-sized avatars that enable face-to-face collaboration. Within this context, users often collaboratively view and interact with virtual 3D models, for example, to assist in designing new devices such as customized prosthetics, vehicles, or buildings. However, discussing shared 3D content face-to-face has various challenges, such as ambiguities, occlusions, and different viewpoints, all of which decrease mutual awareness, leading to decreased task performance and increased errors. To address this challenge, we introduce MAGIC, a novel approach for understanding pointing gestures in a face-to-face shared 3D space, improving mutual understanding and awareness. Our approach distorts the remote user's gestures to correctly reflect them in the local user's reference space when face-to-face. We introduce a novel metric called pointing agreement to measure what two users perceive in common when using pointing gestures in a shared 3D space. Results from a user study suggest that MAGIC significantly improves pointing agreement in face-to-face collaboration settings, improving co-presence and awareness of interactions performed in the shared space. We believe that MAGIC improves remote collaboration by enabling simpler communication mechanisms and better mutual awareness. Comment: Presented at IEEE VR 202
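    The abstract does not define the pointing-agreement metric precisely. As a purely hypothetical illustration of the idea, agreement between two users could be measured as the fraction of paired pointing acts whose rays resolve to the same object in the shared scene; the ray-sphere picking rule and all names below are assumptions, not MAGIC's actual formulation.

    ```python
    import numpy as np

    def pick_object(origin, direction, centers, radius=0.2):
        """Index of the nearest sphere a pointing ray hits, or -1 for a miss."""
        d = direction / np.linalg.norm(direction)
        best_t, best_i = np.inf, -1
        for i, c in enumerate(centers):
            oc = origin - c
            b = np.dot(oc, d)
            disc = b * b - (np.dot(oc, oc) - radius ** 2)
            if disc >= 0:
                t = -b - np.sqrt(disc)   # distance to nearest intersection
                if 0 <= t < best_t:
                    best_t, best_i = t, i
        return best_i

    def pointing_agreement(rays_a, rays_b, centers):
        """Fraction of paired pointing acts that select the same object.
        rays_a/rays_b: lists of (origin, direction) pairs, one per trial."""
        same = [pick_object(oa, da, centers) == pick_object(ob, db, centers)
                for (oa, da), (ob, db) in zip(rays_a, rays_b)]
        return sum(same) / len(same)
    ```

    Under such a definition, distorting the remote user's gestures into the local reference space (as MAGIC does) would raise agreement by making both rays land on the same target more often.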

    Systematic literature review of hand gestures used in human computer interaction interfaces

    Gestures, widely accepted as humans' natural mode of interaction with their surroundings, have been considered for use in human-computer interfaces since the early 1980s. They have been explored and implemented, with a range of success and maturity levels, in a variety of fields, facilitated by a multitude of technologies. Underpinning gesture theory, however, focuses on gestures performed simultaneously with speech, and the majority of gesture-based interfaces are supported by other modes of interaction. This article reports the results of a systematic review undertaken to identify characteristics of touchless/in-air hand gestures used in interaction interfaces. 148 articles reporting on gesture-based interaction interfaces were reviewed, identified through searching engineering and science databases (Engineering Village, ProQuest, Science Direct, Scopus and Web of Science). The goal of the review was to map the field of gesture-based interfaces, investigate patterns in gesture use, and identify common combinations of gestures for different combinations of applications and technologies. From the review, the community appears disparate, with little evidence of building upon prior work, and no fundamental framework of gesture-based interaction is evident. However, the findings can help inform future developments and provide valuable information about the benefits and drawbacks of different approaches. It was further found that the nature and appropriateness of gestures was not a primary factor in gesture elicitation when designing gesture-based systems, and that ease of technology implementation often took precedence.

    Application-driven visual computing towards industry 4.0 2018

    245 p. This thesis collects contributions in three fields: 1. Interactive Virtual Agents (IVAs): autonomous, modular, scalable, ubiquitous and engaging to the user; these IVAs can interact with users in a natural way. 2. Immersive VR/AR Environments: VR in production planning, product design, process simulation, testing and verification. The Virtual Operator shows how VR and co-bots can work together in a safe environment; in the Augmented Operator, AR presents relevant information to the worker in a non-intrusive way. 3. Interactive Management of 3D Models: online management and visualisation of multimedia CAD models, via automatic conversion of CAD models to the Web. Web3D technology enables visualisation of, and interaction with, these models on low-powered mobile devices. In addition, these contributions have made it possible to analyse the challenges posed by Industry 4.0. The thesis has contributed a proof of concept for some of those challenges: in human factors, simulation, visualisation and model integration.

    An empirical investigation of gaze selection in mid-air gestural 3D manipulation

    In this work, we investigate gaze selection in the context of mid-air hand gestural manipulation of 3D rigid bodies on monoscopic displays. We present the results of a user study with 12 participants in which we compared the performance of Gaze, a raycasting technique (2D Cursor) and a virtual hand technique (3D Cursor) for selecting objects in two 3D mid-air interaction tasks. We also compared selection confirmation times for Gaze when selection is followed by manipulation and when it is not. Our results show that gaze selection is faster than, and preferred over, 2D and 3D mid-air-controlled cursors, and is particularly well suited to tasks in which users constantly switch between several objects during manipulation. Further, selection confirmation times are longer when selection is followed by manipulation than when it is not.