
    An Evaluation of an Augmented Reality Multimodal Interface Using Speech and Paddle Gestures

    This paper discusses an evaluation of an augmented reality (AR) multimodal interface that uses combined speech and paddle gestures for interaction with virtual objects in the real world. We briefly describe our AR multimodal interface architecture and multimodal fusion strategies that are based on the combination of time-based and domain semantics. Then, we present the results from a user study comparing multimodal input with gesture input alone. The results show that a combination of speech and paddle gestures improves the efficiency of user interaction. Finally, we describe some design recommendations for developing other multimodal AR interfaces.

    Natural interaction with a virtual guide in a virtual environment: A multimodal dialogue system

    This paper describes the Virtual Guide, a multimodal dialogue system represented by an embodied conversational agent that can help users to find their way in a virtual environment, while adapting its affective linguistic style to that of the user. We discuss the modular architecture of the system, and describe the entire loop from multimodal input analysis to multimodal output generation. We also describe how the Virtual Guide detects the level of politeness of the user’s utterances in real time during the dialogue and aligns its own language to that of the user, using different politeness strategies. Finally, we report on our first user tests, and discuss some potential extensions to improve the system.

    Context-aware gestural interaction in the smart environments of the ubiquitous computing era

    A thesis submitted to the University of Bedfordshire in partial fulfilment of the requirements for the degree of Doctor of Philosophy. Technology is becoming pervasive and the current interfaces are not adequate for interaction with the smart environments of the ubiquitous computing era. Recently, researchers have started to address this issue by introducing the concept of the natural user interface, which is mainly based on gestural interactions. Many issues are still open in this emerging domain and, in particular, there is a lack of common guidelines for the coherent implementation of gestural interfaces. This research investigates gestural interactions between humans and smart environments. It proposes a novel framework for the high-level organization of the context information. The framework is conceived to support a novel approach that uses functional gestures to reduce gesture ambiguity and the number of gestures in taxonomies, and to improve usability. In order to validate this framework, a proof-of-concept prototype has been developed by implementing a novel method for the view-invariant recognition of deictic and dynamic gestures. Tests have been conducted to assess the gesture recognition accuracy and the usability of the interfaces developed following the proposed framework. The results show that the method provides optimal gesture recognition from very different viewpoints, whilst the usability tests have yielded high scores. Further investigation of the context information has been performed, tackling the problem of user status. User status is understood here as human activity, and a technique based on an innovative application of electromyography is proposed. The tests show that the proposed technique achieves good activity recognition accuracy. The context is also treated as system status. In ubiquitous computing, the system can adopt different paradigms: wearable, environmental and pervasive.
    A novel paradigm, called the synergistic paradigm, is presented, combining the advantages of the wearable and environmental paradigms. Moreover, it augments the interaction possibilities of the user and ensures better gesture recognition accuracy than the other paradigms.

    Proceedings of the international conference on cooperative multimodal communication CMC/95, Eindhoven, May 24-26, 1995:proceedings


    A Voice and Pointing Gesture Interaction System for Supporting Human Spontaneous Decisions in Autonomous Cars

    Autonomous cars are expected to improve road safety, traffic and mobility. It is projected that in the next 20-30 years fully autonomous vehicles will be on the market. Advances in the research and development of this technology will allow humans to disengage from the driving task, which will become the responsibility of the vehicle intelligence. In this scenario, new vehicle interior designs are proposed, enabling more flexible human-vehicle interactions inside them. In addition, as some important stakeholders propose, control elements such as the steering wheel and the accelerator and brake pedals may no longer be needed. However, this disengagement from control is one of the main issues affecting user acceptance of the technology. Users do not seem to be comfortable with the idea of giving all the decision power to the vehicle. In addition, there can be location-awareness situations where the user makes a spontaneous decision and requires some type of vehicle control. Such is the case of stopping at a particular point of interest or taking a detour from the pre-calculated autonomous route of the car. Vehicle manufacturers maintain the steering wheel as a control element, allowing the driver to take over the vehicle if needed or wanted. This constrains the previously mentioned human-vehicle interaction flexibility. Thus, there is an unsolved dilemma between giving users enough control over the autonomous vehicle and route so they can make spontaneous decisions, and providing interaction flexibility inside the car. This dissertation proposes the use of a voice and pointing gesture human-vehicle interaction system to solve this dilemma. Voice and pointing gestures have been identified as natural interaction techniques to guide and command mobile robots, potentially providing the needed user control over the car. At the same time, they can be executed anywhere inside the vehicle, enabling interaction flexibility.
    The objective of this dissertation is to provide a strategy to support this system. For this, a method based on the intersections of pointing rays is developed to compute the point of interest (POI) that the user is pointing to. Simulation results show that this POI computation method outperforms the traditional ray-casting-based method by 76.5% in cluttered environments and by 36.25% in combined cluttered and non-cluttered scenarios. The whole system is developed and demonstrated using a robotics simulator framework. The simulations show how voice and pointing commands performed by the user update the predefined autonomous path, based on the recognized command semantics. In addition, a dialog feedback strategy is proposed to resolve conflicting situations such as ambiguity in the POI identification. This additional step is able to resolve all the previously mentioned POI computation inaccuracies. It also allows the user to confirm, correct or reject the performed commands in case the system misunderstands them.
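
    The abstract above does not say how the pointing rays are intersected. As a minimal sketch of one common way to do it, and not the dissertation's actual method, the Python below estimates a POI as the least-squares point closest to several pointing rays on a 2D ground plane. The function name `nearest_point_to_rays`, the 2D simplification, and the treatment of each ray as an infinite line are all assumptions made for illustration.

```python
import math


def nearest_point_to_rays(rays):
    """Return the (x, y) point minimising the summed squared distance
    to a set of 2D lines.

    `rays` is a list of ((ox, oy), (dx, dy)) origin/direction pairs.
    Assumption: each ray is treated as an infinite line, so a point
    "behind" the origin is not penalised.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (ox, oy), (dx, dy) in rays:
        norm = math.hypot(dx, dy)
        if norm == 0.0:
            raise ValueError("ray direction must be non-zero")
        ux, uy = dx / norm, dy / norm
        # Projector onto the line's normal: M = I - u u^T
        m11 = 1.0 - ux * ux
        m12 = -ux * uy
        m22 = 1.0 - uy * uy
        # Accumulate the normal equations (sum M) p = sum (M o)
        a11 += m11
        a12 += m12
        a22 += m22
        b1 += m11 * ox + m12 * oy
        b2 += m12 * ox + m22 * oy
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("rays are parallel; no unique closest point")
    # Solve the symmetric 2x2 system with Cramer's rule
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a12 * b1) / det
    return x, y


# Two pointing samples taken from different positions, aimed at the same spot
poi = nearest_point_to_rays([((0.0, 0.0), (1.0, 1.0)),
                             ((2.0, 0.0), (-1.0, 1.0))])
```

    With noisy pointing samples the rays rarely meet exactly, which is why a least-squares estimate is used here instead of an exact intersection.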

    Recognition of Emotions using Energy Based Bimodal Information Fusion and Correlation

    Multi-sensor information fusion is a rapidly developing research area which forms the backbone of numerous essential technologies such as intelligent robotic control, sensor networks, video and image processing and many more. In this paper, we develop a novel technique to analyze and correlate human emotions expressed in voice tone and facial expression. Audio and video streams are captured to populate bimodal audio and video data sets, from which the emotions expressed in voice tone and facial expression, respectively, are sensed. An energy-based mapping is applied to overcome the inherent heterogeneity of the recorded bimodal signal. The fusion process uses the sampled and mapped energy signals of both modalities' data streams and then recognizes the overall emotional component using a Support Vector Machine (SVM) classifier with an accuracy of 93.06%.

    Attention-controlled acquisition of a qualitative scene model for mobile robots

    Haasch A. Attention-controlled acquisition of a qualitative scene model for mobile robots. Bielefeld (Germany): Bielefeld University; 2007. Robots that support humans in dangerous environments, e.g., in manufacturing facilities, have been established for decades. Now, a new generation of service robots is the focus of current research and about to be introduced. These intelligent service robots are intended to support humans in everyday life. For such robots to be accepted, and to achieve comfortable human-robot interaction with non-expert users, it is imperative to provide interaction interfaces that humans are accustomed to from human-human communication. Consequently, intuitive modalities like gestures or spontaneous speech are needed to teach the robot previously unknown objects and locations. Then, the robot can be entrusted with tasks like fetch-and-carry orders even without extensive training of the user. In this context, this dissertation introduces the multimodal Object Attention System, which offers a flexible integration of common interaction modalities in combination with state-of-the-art image and speech processing techniques from other research projects. To prove the feasibility of the approach, the presented Object Attention System has been successfully integrated into different robotic hardware, in particular the mobile robot BIRON and the anthropomorphic robot BARTHOC of the Applied Computer Science Group at Bielefeld University. In conclusion, the aim of this work, to acquire a qualitative Scene Model by means of a modular component offering object attention mechanisms, has been successfully achieved, as demonstrated on numerous occasions such as reviews for the EU Integrated Project COGNIRON or demos.

    An extensible architecture for robust multimodal human-robot communication

    Human safety and effective human-robot communication are the main concerns in HRI applications. In order to achieve such goals, a system should be very robust, allowing little chance of misunderstanding the user's commands. Moreover, the system should permit natural interaction, reducing the time and effort needed to achieve tasks. The main purpose of this work is to develop a general framework for flexible and multimodal human-robot communication. The proposed architecture should be easy to modify and expand by adding or modifying input channels and changing the multimodal fusion strategies. In this paper, we introduce our general approach and provide a case study with two modalities (gesture and speech).