783 research outputs found

    Gestures in human-robot interaction

    Gestures consist of movements of body parts and are a means of communication that conveys information or intentions to an observer. They can therefore be used effectively in human-robot interaction, and more generally in human-machine interaction, as a way for a robot or a machine to infer meaning. For people to use gestures intuitively and to understand gestures performed by robots, it is necessary to define mappings between gestures and their associated meanings, that is, a gesture vocabulary. A human gesture vocabulary defines which gestures a group of people would intuitively use to convey information, while a robot gesture vocabulary shows which robot gestures are deemed fitting for a particular meaning. Effective and intuitive use of these vocabularies depends on gesture recognition, that is, the classification of body motion into discrete gesture classes using pattern recognition and machine learning. This thesis addresses both research areas, presenting the development of gesture vocabularies as well as gesture recognition techniques, with a focus on hand and arm gestures. Attentional models for humanoid robots were developed as a prerequisite for intuitive human-robot interaction and a precursor to gesture recognition. A method for defining gesture vocabularies for humans and robots, based on user observations and surveys, is explained, and experimental results are presented. As a result of the robot gesture vocabulary experiment, an evolutionary approach for refining robot gestures is introduced, based on interactive genetic algorithms. A robust and well-performing gesture recognition algorithm based on dynamic time warping has been developed. Most importantly, it employs one-shot learning, meaning that it can be trained using a low number of training samples and employed in real-life scenarios, lowering the effect of environmental constraints and gesture features. Finally, an approach for learning the relation between self-motion and pointing gestures is presented.
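To make the idea of recognition by dynamic time warping with very few training samples concrete, the following is a minimal sketch and not the thesis's algorithm: it stores a single template trajectory per gesture class and labels a new trajectory with the class of the nearest template under the textbook DTW distance. The function names, the 2-D hand positions, and the gesture labels are all hypothetical and chosen only for illustration.

```python
import math


def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two point sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance between frames
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b stays on the same frame
                cost[i][j - 1],      # b advances, a stays on the same frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def classify(sample, templates):
    """Return the label of the template closest to `sample` under DTW."""
    return min(templates, key=lambda label: dtw_distance(sample, templates[label]))


# One hypothetical template per class (the "one-shot" training set).
templates = {
    "swipe_right": [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0)],
    "raise_hand": [(0.0, 0.0), (0.0, 0.5), (0.0, 1.0)],
}

# A slower, slightly noisy rightward movement with more frames than the template.
sample = [(0.0, 0.0), (0.2, 0.05), (0.4, 0.0), (0.7, 0.0), (1.0, 0.05)]
print(classify(sample, templates))
```

Because DTW aligns sequences of different lengths, the sample does not need to be resampled to the template's frame count, which is one reason the technique suits small training sets.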

    DeepDynamicHand: A Deep Neural Architecture for Labeling Hand Manipulation Strategies in Video Sources Exploiting Temporal Information

    Humans are capable of complex manipulation interactions with the environment, relying on the intrinsic adaptability and compliance of their hands. Recently, soft robotic manipulation has attempted to reproduce this behavior through the design of deformable yet robust end-effectors. To this end, the investigation of human behavior has become crucial for informing the development of robotic hands that can exploit environmental constraints as humans do. Among the tools robotics can leverage to achieve this objective, deep learning has emerged as a promising approach for studying neuroscientific observations and then implementing them on the artificial side. However, current approaches tend to neglect the dynamic nature of hand pose recognition problems, which limits their effectiveness in identifying the sequences of manipulation primitives that underpin action generation, e.g., during purposeful interaction with the environment. In this work, we propose a vision-based supervised Hand Pose Recognition method which, for the first time, takes temporal information into account to identify meaningful sequences of actions in grasping and manipulation tasks. More specifically, we apply Deep Neural Networks to automatically learn features from hand posture images, which are frames extracted from videos of grasping and manipulation tasks involving objects and external environmental constraints. For training purposes, videos are divided into intervals, each associated with a specific action by a human supervisor. The proposed algorithm combines a Convolutional Neural Network, which detects the hand within each video frame, with a Recurrent Neural Network, which predicts the hand action in the current frame while taking into consideration the history of actions performed in the previous frames. Experimental validation was performed on two datasets of dynamic hand-centric strategies, in which subjects regularly interact with objects and the environment. The proposed architecture achieved very good classification accuracy on both datasets, reaching up to 94% and outperforming state-of-the-art techniques. The outcomes of this study can be applied to robotics, e.g., for planning and control of soft anthropomorphic manipulators.

    Sublimate: State-Changing Virtual and Physical Rendering to Augment Interaction with Shape Displays

    Recent research in 3D user interfaces pushes towards immersive graphics and actuated shape displays. Our work explores the hybrid of these directions, and we introduce sublimation and deposition as metaphors for the transitions between physical and virtual states. We discuss how digital models, handles and controls can be interacted with as virtual 3D graphics or dynamic physical shapes, and how user interfaces can rapidly and fluidly switch between those representations. To explore this space, we developed two systems that integrate actuated shape displays and augmented reality (AR) for co-located physical shapes and 3D graphics. Our spatial optical see-through display provides a single user with head-tracked stereoscopic augmentation, whereas our handheld devices enable multi-user interaction through video see-through AR. We describe interaction techniques and applications that explore 3D interaction for these new modalities. We conclude by discussing the results from a user study that show how freehand interaction with physical shape displays and co-located graphics can outperform wand-based interaction with virtual 3D graphics. Funding: National Science Foundation (U.S.) (Graduate Research Fellowship Grant 1122374).

    Comparison of interaction modalities for mobile indoor robot guidance: direct physical interaction, person following, and pointing control

    © 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
    Three advanced natural interaction modalities for mobile robot guidance in an indoor environment were developed and compared using two tasks and quantitative metrics to measure performance and workload. The first interaction modality is based on direct physical interaction, requiring the human user to push the robot in order to displace it. The second and third interaction modalities exploit 3-D vision-based human-skeleton tracking, allowing the user to guide the robot either by walking in front of it or by pointing toward a desired location. In the first task, the participants were asked to guide the robot between different rooms in a simulated physical apartment, which required rough movement of the robot through designated areas. The second task evaluated robot guidance in the same environment through a set of waypoints, which required accurate movements. The three interaction modalities were implemented on a generic differential drive mobile platform equipped with a pan-tilt system and a Kinect camera. Task completion time and accuracy were used as metrics to assess the users’ performance, while the NASA-TLX questionnaire was used to evaluate the users’ workload. A study with 24 participants indicated that the choice of interaction modality had a significant effect on completion time (F(2,61)=84.874, p<0.001), accuracy (F(2,29)=4.937, p=0.016), and workload (F(2,68)=11.948, p<0.001). Direct physical interaction required less time, was more accurate, and imposed less workload than the two contactless interaction modalities. Between the two contactless interaction modalities, the person-following interaction modality was systematically better than the pointing-control one: the participants completed the tasks faster and with less workload. Peer reviewed. Postprint (author's final draft).
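The abstract does not say how a pointing gesture is turned into a target location. One common approach, shown here only as a hedged sketch, is to extend the ray from the tracked elbow joint through the hand joint until it meets the floor plane. The function name, the choice of elbow and hand joints, and the coordinate frame (metres, z axis pointing up, floor at z = 0) are assumptions for illustration, not details taken from the paper.

```python
def pointing_target_on_floor(elbow, hand, floor_z=0.0):
    """Intersect the elbow->hand ray with the horizontal plane z = floor_z.

    `elbow` and `hand` are (x, y, z) joint positions in an assumed frame with
    z pointing up. Returns the (x, y) floor point, or None when the arm does
    not point downward toward the floor.
    """
    dx, dy, dz = (h - e for h, e in zip(hand, elbow))
    if dz >= 0:
        return None  # level or upward pointing never reaches the floor
    t = (floor_z - hand[2]) / dz  # ray parameter measured from the hand
    if t < 0:
        return None  # hand is already below the plane
    return (hand[0] + t * dx, hand[1] + t * dy)


# Hypothetical joints: elbow at 1.5 m, hand at 1.0 m and 0.5 m further forward.
target = pointing_target_on_floor((0.0, 0.0, 1.5), (0.5, 0.0, 1.0))
print(target)
```

In a real system the joint positions would come from the skeleton tracker and would usually be filtered over several frames, since small angular noise in the arm direction translates into large displacement of the floor point.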

    Automatic Video-based Analysis of Human Motion


    Laban Movement Analysis Using a Bayesian Model and Perspective Projections

    Human body movements are meant to move one or more body parts to a specific location along a certain trajectory. A person observing the movement might be able to recognize it through the spatial pathway alone. Kendon (2004) holds the view that, willingly or not, humans in co-presence continuously inform one another about their intentions