
    A Comparison of Visualisation Methods for Disambiguating Verbal Requests in Human-Robot Interaction

    Picking up objects requested by a human user is a common task in human-robot interaction. When multiple objects match the user's verbal description, the robot needs to clarify which object the user is referring to before executing the action. Previous research has focused on perceiving the user's multimodal behaviour to complement verbal commands or on minimising the number of follow-up questions to reduce task time. In this paper, we propose a system for reference disambiguation based on visualisation and compare three methods to disambiguate natural language instructions. In a controlled experiment with a YuMi robot, we investigated real-time augmentations of the workspace in three conditions -- mixed reality, augmented reality, and a monitor as the baseline -- using objective measures such as time and accuracy, and subjective measures such as engagement, immersion, and display interference. Significant differences were found in accuracy and engagement between the conditions, but no differences were found in task time. Despite the higher error rates in the mixed reality condition, participants found that modality more engaging than the other two, but overall showed a preference for the augmented reality condition over the monitor and mixed reality conditions.

    Designing Disambiguation Techniques for Pointing in the Physical World

    Several ways of selecting physical objects exist, including touching and pointing at them. Allowing the user to interact at a distance by pointing at physical objects can be challenging when the environment contains a large number of interactive physical objects, possibly occluded by other everyday items. Previous pointing techniques highlighted the need for disambiguation techniques. Addressing this challenge, this paper contributes a design space that organizes, along groups and axes, a set of options for designers to (1) describe, (2) classify, and (3) design disambiguation techniques. First, we have not yet found techniques in the literature that our design space could not describe. Second, each technique follows a different path along the axes of our design space. Third, the design space allows several new paths/solutions to be defined that have not yet been explored. We illustrate this generative power with an example of such a designed technique, Physical Pointing Roll (P2Roll).

    Multimodal fusion : gesture and speech input in augmented reality environment

    Augmented Reality (AR) allows users to interact with virtual and physical objects simultaneously, since it combines the real world with the virtual world seamlessly. However, most AR interfaces apply conventional Virtual Reality (VR) interaction techniques without modification. In this paper we explore multimodal fusion for AR with speech and hand gesture input. Multimodal fusion enables users to interact with computers through various input modalities such as speech, gesture, and eye gaze. As a first stage in proposing multimodal interaction, the input modalities must be selected before they are integrated into an interface. The paper reviews related work to recap the multimodal approaches that have recently become one of the research trends in AR, covering existing multimodal work for both VR and AR. In AR, multimodal interaction is considered a solution for improving the interaction between virtual and physical entities. It is an ideal interaction technique for AR applications, since AR supports interactions in the real and virtual worlds in real time. This paper describes recent studies in AR development that employ gesture and speech inputs. It looks into multimodal fusion and its developments, followed by the conclusion. This paper provides a guideline on how to integrate gesture and speech inputs through multimodal fusion in an AR environment.

    Evaluation of AI-Supported Input Methods in Augmented Reality Environment

    Augmented Reality (AR) solutions provide tools that could improve applications in the medical and industrial fields. Augmentation can provide additional information in training, visualization, and work scenarios to increase efficiency, reliability, and safety, while improving communication with other devices and systems on the network. Unfortunately, tasks in these fields often require both hands to execute, reducing the variety of input methods suitable for controlling AR applications. People with physical disabilities that prevent them from using their hands are also negatively affected when using these devices. The goal of this work is to provide novel hands-free interfacing methods, using AR technology in association with AI support approaches, to produce an improved human-computer interaction solution.

    An Evaluation of an Augmented Reality Multimodal Interface Using Speech and Paddle Gestures

    This paper discusses an evaluation of an augmented reality (AR) multimodal interface that uses combined speech and paddle gestures for interaction with virtual objects in the real world. We briefly describe our AR multimodal interface architecture and multimodal fusion strategies that are based on the combination of time-based and domain semantics. Then, we present the results from a user study comparing multimodal input with gesture input alone. The results show that a combination of speech and paddle gestures improves the efficiency of user interaction. Finally, we describe some design recommendations for developing other multimodal AR interfaces.

    Using triangulation to identify word senses

    Word sense disambiguation is the task of determining which sense of a word is intended from its context. Previous methods have found the lack of training data and the restrictiveness of dictionaries' choices of senses to be major stumbling blocks. A robust novel algorithm is presented that uses multiple dictionaries, the Internet, clustering and triangulation to attempt to discern the most useful senses of a given word and learn how they can be disambiguated. The algorithm is explained, and some promising sample results are given.

    Gaze modulated disambiguation technique for gesture control in 3D virtual objects selection

    Inputs with multimodal information provide more natural ways to interact with virtual 3D environments. An emerging technique that integrates gaze-modulated pointing with mid-air gesture control enables fast target acquisition and rich control expressions. The performance of this technique relies on eye tracking accuracy, which is not yet comparable with that of traditional pointing techniques (e.g., the mouse). This causes problems when fine-grained interactions are required, such as selecting in a dense virtual scene where proximity and occlusion are prone to occur. This paper proposes a coarse-to-fine solution that compensates for the degradation introduced by eye tracking inaccuracy, using a gaze cone to detect ambiguity and then a gaze probe for decluttering. It is tested in a comparative experiment involving 12 participants and 3240 runs. The results show that the proposed technique enhanced selection accuracy and user experience, but its efficiency can still be improved. This study contributes to providing a robust multimodal interface design supported by both eye tracking and mid-air gesture control.

    Cross-Dimensional Gestural Interaction Techniques for Hybrid Immersive Environments

    We present a set of interaction techniques for a hybrid user interface that integrates existing 2D and 3D visualization and interaction devices. Our approach is built around one- and two-handed gestures that support the seamless transition of data between co-located 2D and 3D contexts. Our testbed environment combines a 2D multi-user, multi-touch projection surface with 3D head-tracked, see-through, head-worn displays and 3D tracked gloves to form a multi-display augmented reality. We also address some of the ways in which we can interact with private data in a collaborative, heterogeneous workspace.

    Semantic web and augmented reality for searching people, events and points of interest within of a university campus

    Advances in technology have made mobile devices widely popular. Modern mobile devices include built-in sensors, cameras, compasses, and enhanced storage and processing capabilities, which developers can use to create applications with new or enhanced functionality. In this context we present a mobile application for searching for places, people and events within a university campus. In our work we leverage the semantic web and augmented reality to provide an application with a high degree of query expressiveness and an enhanced user experience. In addition, we validate our approach with a use case example that shows the complete search process. Sociedad Argentina de Informática e Investigación Operativa (SADIO)