1,485 research outputs found

    Multi-Modal Human-Machine Communication for Instructing Robot Grasping Tasks

    A major challenge for the realization of intelligent robots is to supply them with cognitive abilities that allow ordinary users to program them easily and intuitively. One way of such programming is teaching work tasks by interactive demonstration. To make this effective and convenient for the user, the machine must be capable of establishing a common focus of attention and be able to use and integrate spoken instructions, visual perceptions, and non-verbal cues such as gestural commands. We report progress in building a hybrid architecture that combines statistical methods, neural networks, and finite state machines into an integrated system for instructing grasping tasks by man-machine interaction. The system combines the GRAVIS robot for visual attention and gestural instruction with an intelligent interface for speech recognition and linguistic interpretation, and a modality fusion module, to allow multi-modal task-oriented man-machine communication with respect to dextrous robot manipulation of objects. Comment: 7 pages, 8 figures

    Down-Sampling coupled to Elastic Kernel Machines for Efficient Recognition of Isolated Gestures

    In the field of gestural action recognition, many studies have focused on dimensionality reduction along the spatial axis, to reduce both the variability of gestural sequences expressed in the reduced space and the computational complexity of their processing. Notably, very few of these methods have explicitly addressed dimensionality reduction along the time axis. This is, however, a major issue for the use of elastic distances, which are characterized by a quadratic complexity. To partially fill this apparent gap, we present in this paper an approach based on temporal down-sampling combined with elastic kernel machine learning. We show experimentally, on two data sets that are widely referenced in the domain of human gesture recognition and very different in terms of quality of motion capture, that it is possible to significantly reduce the number of skeleton frames while maintaining a good recognition rate. The method gives satisfactory results at a level currently reached by state-of-the-art methods on these data sets. The reduction in computational complexity makes this approach eligible for real-time applications. Comment: ICPR 2014, International Conference on Pattern Recognition, Stockholm, Sweden (2014)
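The combination described above, down-sampling a skeleton sequence along the time axis before applying an elastic (DTW-type) distance inside a Gaussian kernel, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the frame step and the kernel bandwidth `sigma` are arbitrary example parameters.

```python
import math

def downsample(frames, step):
    """Keep every `step`-th skeleton frame; since DTW cost is quadratic
    in sequence length, a step of k cuts the cost by roughly k*k."""
    return frames[::step]

def dtw(a, b):
    """Classic dynamic time warping distance between two sequences of
    equal-dimension feature vectors (lists of floats)."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    return cost[n][m]

def elastic_kernel(a, b, sigma=1.0):
    """Gaussian kernel on the DTW distance, usable in a kernel machine
    such as an SVM (illustrative bandwidth)."""
    d = dtw(a, b)
    return math.exp(-(d * d) / (2.0 * sigma * sigma))
```

A full classifier would precompute this kernel between all training sequences; the down-sampling step is what makes that tractable for real-time use.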

    Procedural-Reasoning Architecture for Applied Behavior Analysis-based Instructions

    Autism Spectrum Disorder (ASD) is a complex developmental disability affecting as many as 1 in every 88 children. While there is no known cure for ASD, there are behavioral and developmental interventions, with demonstrated efficacy, that have become the predominant treatments for improving social, adaptive, and behavioral functions in children. Applied Behavior Analysis (ABA)-based early childhood interventions are evidence-based, efficacious therapies for autism that are widely recognized as effective approaches to remediation of the symptoms of ASD. They are, however, labor intensive and consequently often inaccessible at the recommended levels. Recent advancements in socially assistive robotics and applications of virtual intelligent agents have shown that children with ASD accept intelligent agents as effective and often preferred substitutes for human therapists. This research is nascent and highly experimental, with no unifying, interdisciplinary, and integral approach to the development of intelligent-agent-based therapies, especially in the area of behavioral interventions. Motivated by the absence of such a unifying framework, we developed a conceptual procedural-reasoning agent architecture (PRA-ABA) that, we propose, could serve as a foundation for ABA-based assistive technologies involving virtual, mixed, or embodied agents, including robots. This architecture and the related research presented in this dissertation encompass two main areas: (a) a knowledge representation and computational model of the behavioral aspects of ABA as applicable to autism intervention practices, and (b) an abstract architecture for multi-modal, agent-mediated implementation of these practices

    I see how you feel: Recipients obtain additional information from speakers’ gestures about pain

    Objective: Despite the need for effective pain communication, pain is difficult to verbalise. Co-speech gestures frequently add information about pain that is not contained in the accompanying speech. We explored whether recipients can obtain additional information from gestures about the pain that is being described. Methods: Participants (n = 135) viewed clips of pain descriptions under one of four conditions: 1) Speech Only; 2) Speech and Gesture; 3) Speech, Gesture and Face; and 4) Speech, Gesture and Face plus Instruction (short presentation explaining the pain information that gestures can depict). Participants provided free-text descriptions of the pain that had been described. Responses were scored for the amount of information obtained from the original clips. Findings: Participants in the Instruction condition obtained the most information, while those in the Speech Only condition obtained the least (all comparisons p<.001). Conclusions: Gestures produced during pain descriptions provide additional information about pain that recipients are able to pick up without detriment to their uptake of spoken information. Practice implications: Healthcare professionals may benefit from instruction in gestures to enhance uptake of information about patients’ pain experiences

    Getting the Upper Hand: Natural Gesture Interfaces Improve Instructional Efficiency on a Conceptual Computer Lesson

    As gesture-based interactions with computer interfaces become more technologically feasible for educational and training systems, it is important to consider what interactions are best for the learner. Computer interactions should not interfere with learning nor increase the mental effort of completing the lesson. The purpose of the current set of studies was to determine whether natural gesture-based interactions, or instruction of those gestures, help the learner in a computer lesson by increasing learning and reducing mental effort. First, two studies were conducted to determine what gestures were considered natural by participants. Then, those gestures were implemented in an experiment to compare type of gesture and type of gesture instruction on learning conceptual information from a computer lesson. The goal of these studies was to determine the instructional efficiency – that is, the extent of learning taking into account the amount of mental effort – of implementing gesture-based interactions in a conceptual computer lesson. To test whether the type of gesture interaction affects conceptual learning in a computer lesson, the gesture-based interactions were either naturally- or arbitrarily-mapped to the learning material on the fundamentals of optics. The optics lesson presented conceptual information about reflection and refraction, and participants used the gesture-based interactions during the lesson to manipulate on-screen lenses and mirrors in a beam of light. The beam of light refracted/reflected at the angle corresponding with type of lens/mirror. The natural gesture-based interactions were those that mimicked the physical movement used to manipulate the lenses and mirrors in the optics lesson, while the arbitrary gestures were those that did not match the movement of the lens or mirror being manipulated. 
The natural gestures implemented in the computer lesson were determined from Study 1, in which participants performed gestures they considered natural for a set of actions, and were rated in Study 2 as most closely resembling the physical interaction they represent. The arbitrary gestures were rated by participants as most arbitrary for each computer action in Study 2. To test whether the effect of novel gesture-based interactions depends on how they are taught, the way the gestures were instructed was varied in the main experiment by using either video- or text-based tutorials. Results of the experiment showed that natural gesture-based interactions were better for learning than arbitrary gestures, and that instruction of the gestures largely did not affect learning or the amount of mental effort felt during the task. To further investigate the factors affecting instructional efficiency in using gesture-based interactions for a computer lesson, individual differences of the learner were taken into account. Results indicated that the instructional efficiency of the gestures and their instruction depended on an individual's spatial ability, such that arbitrary gesture interactions taught with a text-based tutorial were particularly inefficient for those with lower spatial ability. These findings are explained in the context of Embodied Cognition and Cognitive Load Theory, and guidelines are provided for instructional design of computer lessons using natural user interfaces. The theoretical frameworks of Embodied Cognition and Cognitive Load Theory were used to explain why gesture-based interactions and their instructions impacted the instructional efficiency of these factors in a computer lesson. 
Gesture-based interactions that are natural (i.e., mimic the physical interaction by corresponding to the learning material) were more instructionally efficient than arbitrary gestures because natural gestures may help schema development of conceptual information through physical enactment of the learning material. Furthermore, natural gestures resulted in lower cognitive load than arbitrary gestures, because arbitrary gestures that do not match the learning material may increase the working memory processing not associated with the learning material during the lesson. Additionally, the way in which the gesture-based interactions were taught was varied by either instructing the gestures with video- or text-based tutorials, and it was hypothesized that video-based tutorials would be a better way to instruct gesture-based interactions because the videos may help the learner to visualize the interactions and create a more easily recalled sensorimotor representation for the gestures; however, this hypothesis was not supported and there was not strong evidence that video-based tutorials were more instructionally efficient than text-based instructions. The results of the current set of studies can be applied to educational and training systems that incorporate a gesture-based interface. The finding that more natural gestures are better for learning efficiency, cognitive load, and a variety of usability factors should encourage instructional designers and researchers to keep the user in mind when developing gesture-based interactions

    Human gesture classification by brute-force machine learning for exergaming in physiotherapy

    In this paper, a novel approach to human gesture classification on skeletal data is proposed for the application of exergaming in physiotherapy. Unlike existing methods, we propose to use a general classifier such as Random Forests to recognize dynamic gestures. The temporal dimension is handled afterwards by majority voting in a sliding window over the consecutive predictions of the classifier. The gestures can have partially similar postures, such that the classifier decides on the dissimilar postures. This brute-force classification strategy is permissible because dynamic human gestures exhibit sufficiently dissimilar postures. Online continuous human gesture recognition can classify dynamic gestures at an early stage, which is a crucial advantage when controlling a game by automatic gesture recognition. Ground truth can also be obtained easily, since all postures in a gesture get the same label, without any discretization into consecutive postures. This way, new gestures can be added easily, which is advantageous in adaptive game development. We evaluate our strategy by leave-one-subject-out cross-validation on a self-captured stealth-game gesture dataset and the publicly available Microsoft Research Cambridge-12 Kinect (MSRC-12) dataset. On the first dataset we achieve an excellent accuracy rate of 96.72%. Furthermore, we show that Random Forests perform better than Support Vector Machines. On the second dataset we achieve an accuracy rate of 98.37%, which is on average 3.57% better than existing methods
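The temporal smoothing step described above, majority voting in a sliding window over consecutive per-frame predictions, can be sketched independently of the underlying classifier (a Random Forest in the paper). The window size here is an illustrative parameter, and the per-frame labels would come from any per-posture classifier.

```python
from collections import Counter, deque

def majority_vote_stream(frame_predictions, window=5):
    """Smooth a stream of per-frame gesture labels by majority voting
    over a sliding window of the most recent predictions. Emitting a
    label per incoming frame is what enables early, online decisions."""
    recent = deque(maxlen=window)  # oldest prediction drops out automatically
    smoothed = []
    for label in frame_predictions:
        recent.append(label)
        winner, _ = Counter(recent).most_common(1)[0]
        smoothed.append(winner)
    return smoothed
```

Because each gesture's frames all share one label in the ground truth, the same voting scheme works unchanged when new gestures are added, which matches the adaptive-game-development argument in the abstract.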

    Skeleton based action recognition using translation-scale invariant image mapping and multi-scale deep cnn

    This paper presents an image-classification-based approach to the skeleton-based video action recognition problem. First, a dataset-independent translation-scale invariant image mapping method is proposed, which transforms the skeleton videos into colour images, named skeleton-images. Second, a multi-scale deep convolutional neural network (CNN) architecture is proposed which can be built and fine-tuned on powerful pre-trained CNNs, e.g., AlexNet, VGGNet, or ResNet. Even though the skeleton-images are very different from natural images, the fine-tuning strategy still works well. Finally, we show that our method also works well on 2D skeleton video data. We achieve state-of-the-art results on the popular benchmark datasets, e.g., NTU RGB+D, UTD-MHAD, MSRC-12, and G3D. In particular, on the large and challenging NTU RGB+D, UTD-MHAD, and MSRC-12 datasets, our method outperforms other methods by a large margin, which demonstrates the efficacy of the proposed method
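The core mapping idea, normalizing joint coordinates over a whole sequence and quantizing them into colour channels so the result is invariant to global translation and scale, might be sketched as below. Treating the (x, y, z) coordinates as the (R, G, B) channels and the exact min-max normalization are assumptions for illustration, not the paper's precise recipe.

```python
def skeleton_to_image(frames):
    """Map a skeleton sequence (frames x joints x 3 coords) to an RGB
    'skeleton-image': rows = joints, columns = frames, channels = x/y/z.
    Each channel is min-max normalized over the entire sequence, which
    removes global translation and scale before quantizing to 0..255."""
    n_joints = len(frames[0])
    image = [[[0, 0, 0] for _ in frames] for _ in range(n_joints)]
    for c in range(3):  # one colour channel per coordinate axis
        vals = [joint[c] for frame in frames for joint in frame]
        lo, hi = min(vals), max(vals)
        span = (hi - lo) or 1.0  # guard against a constant channel
        for t, frame in enumerate(frames):
            for j, joint in enumerate(frame):
                image[j][t][c] = int(round(255 * (joint[c] - lo) / span))
    return image
```

The resulting fixed-layout image can then be resized and fed to a pre-trained CNN for fine-tuning, exactly as one would with a natural image.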