7 research outputs found

    A Visual Language for Robot Control and Programming: A Human-Interface Study

    Full text link
    We describe an interaction paradigm for controlling a robot using hand gestures. In particular, we are interested in the control of an underwater robot by an on-site human operator. In this context, vision-based control is very attractive, and we propose a robot control and programming mechanism based on visual symbols. A human operator presents engineered visual targets to the robotic system, which recognizes and interprets them. This paper describes the approach and proposes a specific gesture language called "RoboChat". RoboChat allows an operator to control a robot, and even to express complex programming concepts, using a sequence of visually presented symbols encoded into fiducial markers. We evaluate the efficiency and robustness of this symbolic communication scheme by comparing it to traditional gesture-based interaction involving a remote human operator.
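    To make the symbolic-programming idea concrete, here is a minimal sketch of how a stream of recognized fiducial-marker IDs might be translated into robot commands. The marker IDs, token names, and the REPEAT/END constructs below are hypothetical illustrations, not the actual RoboChat vocabulary or encoding.

```python
# Sketch of a RoboChat-style symbolic interpreter.
# Marker IDs, tokens, and the command set are hypothetical stand-ins;
# the paper's actual vocabulary and fiducial encoding are not reproduced.

# Hypothetical mapping from recognized fiducial-marker IDs to language tokens.
TOKEN_TABLE = {
    1: "FORWARD",
    2: "TURN_LEFT",
    3: "TURN_RIGHT",
    4: "REPEAT",   # programming construct: repeat the previous command
    5: "END",
}

def interpret(marker_ids):
    """Translate a sequence of detected marker IDs into a command program."""
    program = []
    for mid in marker_ids:
        token = TOKEN_TABLE.get(mid)
        if token is None:
            continue  # ignore unrecognized markers
        if token == "END":
            break
        if token == "REPEAT" and program:
            program.append(program[-1])  # simple looping construct
        else:
            program.append(token)
    return program

# Example: markers shown to the camera in sequence.
print(interpret([1, 2, 4, 3, 5]))
# -> ['FORWARD', 'TURN_LEFT', 'TURN_LEFT', 'TURN_RIGHT']
```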

    Sign language perception research for improving automatic sign language recognition

    Full text link

    Hand Gesture Recognition within a Linguistics-Based Framework

    No full text
    An approach to recognizing human hand gestures from a monocular temporal sequence of images is presented. Of particular concern is the representation and recognition of hand movements used in single-handed American Sign Language (ASL). The approach exploits previous linguistic analysis of manual languages that decomposes dynamic gestures into their static and dynamic components. The first level of decomposition is in terms of three sets of primitives: hand shape, location, and movement. Further levels of decomposition involve the lexical and sentence levels and are part of our plan for future work. We propose, and subsequently demonstrate, that given a monocular gesture sequence, kinematic features can be recovered from the apparent motion that provide distinctive signatures for 14 primitive movements of ASL. The approach has been implemented in software and evaluated on a database of 592 gesture sequences, with an overall recognition rate of 86.00% for fully automated processing and 97.13% for manually initialized processing.
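    The following sketch illustrates the idea of kinematic signatures recovered from apparent motion, assuming a 2-D hand-centroid trajectory has already been extracted from the image sequence. The feature set and the two templates are illustrative assumptions, not the paper's actual descriptors for the 14 ASL movement primitives.

```python
import numpy as np

# Sketch: recover simple kinematic features from a 2-D hand-centroid
# trajectory and match them against movement-primitive templates.
# Features and templates are illustrative stand-ins, not the paper's
# actual descriptors for the 14 ASL primitives.

def kinematic_features(traj):
    """traj: (T, 2) array of hand positions over time."""
    v = np.diff(traj, axis=0)               # frame-to-frame velocity
    speed = np.linalg.norm(v, axis=1)       # apparent speed
    heading = np.arctan2(v[:, 1], v[:, 0])  # direction of motion
    turn = np.abs(np.diff(np.unwrap(heading)))  # per-frame turning rate
    return np.array([speed.mean(), turn.mean()])

# Hypothetical templates: straight motion turns little, circular motion a lot.
TEMPLATES = {
    "straight": np.array([5.3, 0.00]),
    "circular": np.array([5.3, 0.10]),
}

def classify(traj):
    f = kinematic_features(traj)
    return min(TEMPLATES, key=lambda k: np.linalg.norm(f - TEMPLATES[k]))

# Example: a roughly circular trajectory of radius 50.
t = np.linspace(0, 2 * np.pi, 60)
circle = np.stack([50 * np.cos(t), 50 * np.sin(t)], axis=1)
print(classify(circle))  # -> 'circular'
```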

    Automatic processing of LSF videos: modeling and exploiting phonological constraints on movement

    Get PDF
    In the field of natural language processing, the treatment of sign language utterances occupies a special place. Because of the specific features of French Sign Language (LSF), such as the simultaneity of several parameters, the important role of facial expression, the heavy use of iconic gestural units, and the use of signing space to structure utterances, new processing methods must be adapted to this language. We first present a tracking method based on a particle filter, which estimates at any time the position of the signer's head, hands, elbows, and torso in a single-view video. This method has been adapted to LSF to make it more robust to occlusions, to hands leaving the frame, and to inversions of the signer's hands. Next, the analysis of motion-capture data leads to a categorization of the movements frequently used in sign production. We propose a parametric model for each movement category, which we use to search for signs in a video starting from a filmed example of the sign. Finally, these motion models are reused in two applications: the first assists a user in creating sign pictures; the second is dedicated to computer-aided segmentation of a video into signs.
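    Here is a minimal sketch of one particle-filter step of the kind used for such tracking, reduced to a single 2-D point (one hand) with a hypothetical observation model. The actual method tracks head, hands, elbows, and torso jointly and adds the LSF-specific robustness to occlusion and hand inversion described above; every parameter below is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# One predict/update/resample step of a particle filter tracking a
# single 2-D point (e.g., one hand). The Gaussian likelihood is a
# hypothetical stand-in for the image-based observation model.

N = 500
particles = rng.normal([160.0, 120.0], 20.0, size=(N, 2))  # initial guess
weights = np.full(N, 1.0 / N)

def likelihood(particles, observation, sigma=15.0):
    """Hypothetical observation model: closeness to a detected hand blob."""
    d2 = np.sum((particles - observation) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def step(particles, weights, observation):
    # 1. Predict: diffuse particles with a random-walk motion model.
    particles = particles + rng.normal(0.0, 5.0, size=particles.shape)
    # 2. Update: reweight particles by the observation likelihood.
    weights = weights * likelihood(particles, observation)
    weights /= weights.sum()
    # 3. Resample: draw particles in proportion to their weights.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))

obs = np.array([170.0, 115.0])       # detected hand position this frame
particles, weights = step(particles, weights, obs)
print(particles.mean(axis=0))        # state estimate, close to the observation
```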

    Self-adaptive structure semi-supervised methods for streamed emblematic gestures

    Get PDF
    Although many researchers are trying to raise the level of machine intelligence, there is still a long way to go before machines approach human intelligence. Scientists and engineers continuously try to make modern technology, such as smartphones and robots, smarter. Humans communicate with each other using voice and gestures; gestures are therefore essential for transferring information to a partner. To reach a higher level of intelligence, a machine should learn from and react to human gestures, which means learning from continuously streamed gestures. This task faces serious challenges, since processing streamed data suffers from several problems: besides being unlabelled, the stream is long, and "concept drift" and "concept evolution" are its main difficulties. Stream data have several further problems worth mentioning: they change dynamically, are presented only once, arrive at high speed, and are non-linearly distributed. In addition to these general problems of data streams, gestures bring problems of their own; for example, different techniques are required to handle the variety of gesture types. Available methods solve some of these problems individually, whereas we present a technique that solves them together.

    Unlabelled data may carry additional information that describes the labelled data more precisely, so semi-supervised learning is used to handle labelled and unlabelled data together. However, the data size increases continuously, which makes training classifiers hard. We therefore integrate incremental learning with semi-supervised learning, which enables the model to update itself on new data without needing the old data. We additionally integrate incremental class learning within the semi-supervised learning, since new concepts are highly likely to appear in streamed gestures. Moreover, the system should be able to distinguish among different concepts and to identify random movements. We therefore integrate novelty detection to distinguish gestures that belong to known concepts from those that belong to unknown ones. Extreme value theory is used for this purpose; it removes the need for additional labelled data to set the novelty threshold and has several other useful properties. Clustering algorithms are used to distinguish among different new concepts and to identify random movements. Furthermore, the system should update itself only on trustworthy assignments, since updating the classifier on a wrongly assigned gesture degrades its performance. We therefore propose confidence measures for the assigned labels.

    We propose six types of semi-supervised algorithms that rely on different techniques to handle different types of gestures. The proposed classifiers are based on the Parzen window classifier, the support vector machine, a neural network (extreme learning machine), the polynomial classifier, the Mahalanobis classifier, and the nearest class mean classifier. All of these classifiers are equipped with the features mentioned above. Additionally, we propose a wrapper method that uses one of the proposed classifiers, or an ensemble of them, to autonomously issue labels for new concepts and to update the classifiers on newly incoming information, depending on whether it belongs to known classes or to new classes. It can recognize the different novel concepts and also identify random movements.

    To evaluate the system, we acquired gesture data with nine different gesture classes, each representing a different command to the machine (e.g. come, go). The data were collected using the Microsoft Kinect sensor and contain 2878 gestures performed by ten volunteers. Different sets of features are computed and used in the evaluation; real, synthetic, and public data are also used to support the evaluation process. All the features, incremental learning, incremental class learning, and novelty detection are evaluated individually, and the outputs of the classifiers are compared with the original classifier or with benchmark classifiers. The results show the high performance of the proposed algorithms.
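    As an illustration of one building block, here is a minimal sketch of a nearest-class-mean classifier updated incrementally (no stored history of old samples) with a simple distance-ratio confidence gate for accepting self-assigned labels. The features, threshold, and data are hypothetical; the full system further adds EVT-based novelty detection, clustering of new concepts, and the other five classifier variants described above.

```python
import numpy as np

# Sketch: incrementally updated nearest-class-mean classifier with a
# distance-ratio confidence gate for semi-supervised self-training.
# All values here are hypothetical illustrations of the idea.

class IncrementalNCM:
    def __init__(self, confidence=0.5):
        self.means, self.counts = {}, {}
        self.confidence = confidence  # hypothetical acceptance threshold

    def update(self, x, label):
        """Running-mean update: no old samples need to be stored."""
        if label not in self.means:
            self.means[label], self.counts[label] = np.array(x, float), 1
        else:
            self.counts[label] += 1
            self.means[label] += (x - self.means[label]) / self.counts[label]

    def predict(self, x):
        """Return (label, confidence); confidence = 1 - d_best / d_second."""
        dists = {k: np.linalg.norm(x - m) for k, m in self.means.items()}
        ranked = sorted(dists, key=dists.get)
        if len(ranked) == 1:
            return ranked[0], 1.0
        conf = 1.0 - dists[ranked[0]] / (dists[ranked[1]] + 1e-12)
        return ranked[0], conf

    def observe(self, x):
        """Semi-supervised step: self-label only when confident enough."""
        label, conf = self.predict(x)
        if conf >= self.confidence:
            self.update(x, label)  # trusted assignment: adapt the model
        return label, conf

clf = IncrementalNCM()
clf.update(np.array([0.0, 0.0]), "come")    # labelled seed samples
clf.update(np.array([10.0, 10.0]), "go")
print(clf.observe(np.array([0.5, -0.2])))   # confidently 'come'; model updated
```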