
    Gesture recognition of sign language alphabet with a convolutional neural network using a magnetic positioning system

    Gesture recognition is a fundamental step towards efficient communication for the deaf through the automated translation of sign language. This work proposes the use of a high-precision magnetic positioning system for 3D position and orientation tracking of the fingers and the palm of the hand. The gesture is reconstructed by the MagIK (magnetic and inverse kinematics) method and then processed by a deep learning classification model trained to recognize the gestures of the sign language alphabet. Results confirm the limits of vision-based systems and show that the proposed method, based on hand-skeleton reconstruction, has good generalization properties. The proposed system, which combines sensor-based gesture acquisition with deep learning techniques for gesture recognition, achieves 100% signer-independent classification accuracy after a few hours of training, using transfer learning on the well-known ResNet CNN architecture. The proposed training method can be applied to other sensor-based gesture tracking systems and other applications, regardless of the specific data acquisition technology.
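Transfer learning of this kind typically freezes a pretrained backbone and retrains only a small classification head on the new gesture data. A minimal sketch of that pattern, with a random projection standing in for the pretrained backbone and synthetic vectors standing in for hand-skeleton features (not the paper's MagIK data):

```python
import numpy as np

rng = np.random.default_rng(0)
# synthetic stand-ins for hand-skeleton feature vectors, 3 gesture classes
means = 2.0 * np.eye(3, 8)                      # one distinct direction per class
X = np.vstack([rng.normal(m, 0.3, size=(20, 8)) for m in means])
y = np.repeat(np.arange(3), 20)
onehot = np.eye(3)[y]

W_backbone = rng.normal(size=(8, 16))           # "pretrained" backbone, kept frozen
feats = np.maximum(X @ W_backbone, 0.0)         # frozen ReLU features

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

W_head = np.zeros((16, 3))                      # only this small head is trained
for _ in range(300):
    p = softmax(feats @ W_head)
    W_head -= 0.5 * feats.T @ (p - onehot) / len(y)   # cross-entropy gradient step

accuracy = float((softmax(feats @ W_head).argmax(axis=1) == y).mean())
```

Because only the head's parameters are updated, a few hundred cheap gradient steps suffice, which is the same economy the abstract reports for ResNet fine-tuning.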

    Deep Dynamic Neural Networks for Multimodal Gesture Segmentation and Recognition

    This paper describes a novel method called Deep Dynamic Neural Networks (DDNN) for multimodal gesture recognition. A semi-supervised hierarchical dynamic framework based on a Hidden Markov Model (HMM) is proposed for simultaneous gesture segmentation and recognition, where skeleton joint information, depth, and RGB images are the multimodal input observations. Unlike most traditional approaches that rely on the construction of complex handcrafted features, our approach learns high-level spatiotemporal representations using deep neural networks suited to the input modality: a Gaussian-Bernoulli Deep Belief Network (DBN) to handle skeletal dynamics, and a 3D Convolutional Neural Network (3DCNN) to manage and fuse batches of depth and RGB images. This is achieved through the modeling and learning of the emission probabilities of the HMM required to infer the gesture sequence. This purely data-driven approach achieves a Jaccard index score of 0.81 in the ChaLearn LAP gesture spotting challenge. The performance is on par with a variety of state-of-the-art hand-tuned feature-based approaches and other learning-based methods, therefore opening the door to the use of deep learning techniques to further explore multimodal time series data.
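In such hybrid models the neural networks supply per-frame emission probabilities, and the HMM decodes the most likely hidden gesture-state sequence, e.g. with the Viterbi algorithm. A minimal decoding sketch, with hand-made probabilities standing in for network outputs:

```python
import numpy as np

# per-frame "emission" probabilities, as a classifier would produce (frames x states)
emissions = np.array([[0.9, 0.1],
                      [0.8, 0.2],
                      [0.1, 0.9]])
A = np.array([[0.8, 0.2],        # transition matrix: states tend to persist
              [0.2, 0.8]])
pi = np.array([0.5, 0.5])        # initial state distribution

def viterbi(emissions, A, pi):
    T, S = emissions.shape
    delta = pi * emissions[0]                # best path probability per state
    back = np.zeros((T, S), dtype=int)       # backpointers for path recovery
    for t in range(1, T):
        scores = delta[:, None] * A          # scores[i, j]: best path ending i -> j
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) * emissions[t]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]

print(viterbi(emissions, A, pi))   # most likely gesture-state sequence: [0, 0, 1]
```

The sticky transition matrix smooths over noisy per-frame predictions, which is what makes joint segmentation and recognition possible in one pass.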

    Towards a high accuracy wearable hand gesture recognition system using EIT

    This paper presents a high-accuracy hand gesture recognition system based on electrical impedance tomography (EIT). The system interfaces with the forearm through a wrist wrap with embedded electrodes. It measures in real time the inner conductivity distribution caused by bone and muscle movement in the forearm and passes the data to a deep learning neural network for gesture recognition. The system has an EIT bandwidth of 500 kHz and a measured sensitivity in excess of 6.4 Ω per frame. Nineteen hand gestures are designed for recognition, and with the proposed round-robin sub-grouping method, an accuracy of over 98% is achieved.
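"Round-robin" classification is commonly implemented as one-vs-one voting between every pair of classes; the paper's sub-grouping scheme is not detailed here, so the following is only a toy illustration of the pairwise-voting idea, using nearest-centroid binary deciders on synthetic features (not EIT data):

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
# synthetic 2-D stand-ins for EIT feature vectors, 3 gesture classes
means = {0: np.array([0.0, 0.0]), 1: np.array([5.0, 0.0]), 2: np.array([0.0, 5.0])}
train = {c: rng.normal(m, 0.2, size=(10, 2)) for c, m in means.items()}
centroids = {c: pts.mean(axis=0) for c, pts in train.items()}

def predict(x):
    """Round-robin: every pair of classes casts a vote via its own binary decision."""
    votes = {c: 0 for c in centroids}
    for a, b in combinations(sorted(centroids), 2):
        da = np.linalg.norm(x - centroids[a])
        db = np.linalg.norm(x - centroids[b])
        votes[a if da < db else b] += 1
    return max(votes, key=votes.get)
```

Splitting one 19-way problem into many easier binary problems is a plausible reason such a scheme helps accuracy on a large gesture vocabulary.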

    ModDrop: adaptive multi-modal gesture recognition

    We present a method for gesture detection and localisation based on multi-scale and multi-modal deep learning. Each visual modality captures spatial information at a particular spatial scale (such as motion of the upper body or a hand), and the whole system operates at three temporal scales. Key to our technique is a training strategy which exploits: i) careful initialization of individual modalities; and ii) gradual fusion involving random dropping of separate channels (dubbed ModDrop) for learning cross-modality correlations while preserving the uniqueness of each modality-specific representation. We present experiments on the ChaLearn 2014 Looking at People Challenge gesture recognition track, in which we placed first out of 17 teams. Fusing multiple modalities at several spatial and temporal scales leads to a significant increase in recognition rates, allowing the model to compensate for errors of the individual classifiers as well as noise in the separate channels. Furthermore, the proposed ModDrop training technique ensures the classifier is robust to missing signals in one or several channels, producing meaningful predictions from any number of available modalities. In addition, we demonstrate the applicability of the proposed fusion scheme to modalities of arbitrary nature by experiments on the same dataset augmented with audio. (14 pages, 7 figures.)
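The channel-dropping idea can be sketched as zeroing out an entire modality's feature vector with some probability during fused training, so the network learns not to depend on any single channel. A toy version of that mechanism (not the authors' implementation):

```python
import numpy as np

def moddrop(modalities, p_drop, rng):
    """Randomly zero whole modality channels during training (ModDrop-style)."""
    out = []
    for feats in modalities:
        if rng.random() < p_drop:
            out.append(np.zeros_like(feats))   # channel dropped: model must cope
        else:
            out.append(feats)
    return np.concatenate(out)                 # simple fusion by concatenation

rng = np.random.default_rng(0)
skeleton = np.array([1.0, 2.0])
depth = np.array([3.0, 4.0])
audio = np.array([5.0])

fused_all = moddrop([skeleton, depth, audio], p_drop=0.0, rng=rng)   # nothing dropped
fused_none = moddrop([skeleton, depth, audio], p_drop=1.0, rng=rng)  # all dropped
```

At inference time a genuinely missing sensor looks exactly like a dropped channel did during training, which is what gives the fused model its robustness.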

    Agile gesture recognition for capacitive sensing devices: adapting on-the-job

    Automated hand gesture recognition has been a focus of the AI community for decades. Traditionally, work in this domain revolved largely around scenarios assuming the availability of a flow of images of the user's hands. This has partly been due to the prevalence of camera-based devices and the wide availability of image data. However, there is growing demand for gesture recognition technology that can be implemented on low-power devices using limited sensor data instead of high-dimensional inputs like hand images. In this work, we demonstrate a hand gesture recognition system and method that uses signals from capacitive sensors embedded into the etee hand controller. The controller generates real-time signals from each of the wearer's five fingers. We use a machine learning technique to analyse the time series signals and identify three features that can represent the five fingers within 500 ms. The analysis is composed of a two-stage training strategy: dimension reduction through principal component analysis and classification with K-nearest neighbours. Remarkably, we found that this combination showed a level of performance comparable to more advanced methods such as a supervised variational autoencoder. The base system can also be equipped with the capability to learn from occasional errors by providing it with an additional adaptive error correction mechanism. The results showed that the error corrector improves the classification performance of the base system without compromising its efficiency. The system requires no more than 1 ms of computing time per input sample, and is smaller than deep neural networks, demonstrating the feasibility of agile gesture recognition systems based on this technology.
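The two-stage pipeline described above (PCA for dimension reduction, then nearest-neighbour classification) can be sketched as follows, with synthetic vectors standing in for capacitive time-series features (not etee data):

```python
import numpy as np

rng = np.random.default_rng(3)
# synthetic 6-D stand-ins for capacitive sensor features, 2 gesture classes
X0 = rng.normal(0.0, 0.5, size=(15, 6))
X1 = rng.normal(4.0, 0.5, size=(15, 6))
X = np.vstack([X0, X1])
y = np.array([0] * 15 + [1] * 15)

# stage 1: PCA down to 2 components via SVD of the centred data
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
Z = (X - mean) @ Vt[:2].T

def knn_predict(query, k=3):
    """Stage 2: k-nearest-neighbour vote in the reduced space."""
    zq = (query - mean) @ Vt[:2].T
    nearest = np.argsort(np.linalg.norm(Z - zq, axis=1))[:k]
    return int(np.bincount(y[nearest]).argmax())
```

Both stages are just a matrix multiply plus a distance sort at inference time, consistent with the sub-millisecond per-sample budget the abstract reports.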

    Efficient hand gesture recognition for human-robot interaction

    In this paper, we present an efficient and reliable deep-learning approach that allows users to communicate with robots via hand gesture recognition. Contrary to other works which use external devices such as gloves [1] or joysticks [2] to tele-operate robots, the proposed approach uses only visual information to recognize the user's instructions, which are encoded in a set of pre-defined hand gestures. In particular, the method consists of two modules which work sequentially to extract 2D hand landmarks, i.e., joint positions, and to predict the hand gesture based on a temporal representation of them. The approach has been validated on a recent state-of-the-art dataset, where it outperformed other methods that use multiple pre-processing steps such as optical flow and semantic segmentation. Our method achieves an accuracy of 87.5% and runs at 10 frames per second. Finally, we conducted real-life experiments with our IVO robot to validate the framework during the interaction process.
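The two-module structure (per-frame landmark extraction, then classification over a temporal window of landmarks) can be sketched with synthetic trajectories; the template matching below is a simple stand-in for the paper's second-stage network:

```python
import numpy as np

rng = np.random.default_rng(2)
T, J = 5, 21   # frames per temporal window, landmarks per hand (21 is common)

def window_feature(seq):
    """seq: (T, J, 2) landmark trajectory -> flat temporal representation."""
    return seq.reshape(-1)

# two synthetic gesture trajectories: one drifts in x, the other in y
base = rng.normal(size=(J, 2))                           # a static hand pose
wave = np.stack([base + [t * 0.1, 0.0] for t in range(T)])
push = np.stack([base + [0.0, t * 0.1] for t in range(T)])
templates = {"wave": window_feature(wave), "push": window_feature(push)}

def classify(seq):
    """Match the window's temporal representation to the nearest gesture template."""
    f = window_feature(seq)
    return min(templates, key=lambda name: np.linalg.norm(f - templates[name]))
```

Stacking landmark frames into one temporal feature is what lets the classifier see motion, not just a single hand pose.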

    Highly-Optimized Radar-Based Gesture Recognition System with Depthwise Expansion Module

    The increasing integration of technology in our daily lives demands the development of more convenient human–computer interaction (HCI) methods. Most of the current hand-based HCI strategies exhibit various limitations, e.g., sensitivity to variable lighting conditions and limitations on the operating environment. Further, the deployment of such systems is often not performed in resource-constrained contexts. Inspired by the MobileNetV1 deep learning network, this paper presents a novel hand gesture recognition system based on frequency-modulated continuous wave (FMCW) radar, exhibiting a higher recognition accuracy in comparison to state-of-the-art systems. First, the paper introduces a method to simplify radar preprocessing while preserving the main information of the performed gestures. Then, a deep neural classifier with the novel Depthwise Expansion Module based on depthwise separable convolutions is presented. The introduced classifier is optimized and deployed on the Coral Edge TPU board. The system defines and adopts eight different hand gestures performed by five users, offering a classification accuracy of 98.13% while operating in a low-power and resource-constrained environment. Funding: Electronic Components and Systems for European Leadership Joint Undertaking under grant agreement No. 826655 (Tempo); the European Union's Horizon 2020 research and innovation programme and Belgium, France, Germany, Switzerland, and the Netherlands; Lodz University of Technology.
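Depthwise separable convolutions (the building block of MobileNetV1 and of the module above) factor a standard convolution into a per-channel spatial convolution followed by a 1×1 pointwise mix, cutting parameters roughly by the kernel area. A plain numpy sketch of the operation and the parameter saving:

```python
import numpy as np

def depthwise_separable_conv(x, dw, pw):
    """x: (H, W, Cin); dw: (k, k, Cin) depthwise kernels; pw: (Cin, Cout) pointwise."""
    H, W, Cin = x.shape
    k = dw.shape[0]
    Ho, Wo = H - k + 1, W - k + 1                       # valid convolution
    mid = np.zeros((Ho, Wo, Cin))
    for i in range(Ho):
        for j in range(Wo):
            patch = x[i:i + k, j:j + k, :]              # (k, k, Cin) window
            mid[i, j] = (patch * dw).sum(axis=(0, 1))   # depthwise: per-channel conv
    return mid @ pw                                     # pointwise 1x1 conv mixes channels

# demo: all-ones input and kernels, identity pointwise mix
x = np.ones((4, 4, 2))
dw = np.ones((3, 3, 2))
pw = np.eye(2)
out = depthwise_separable_conv(x, dw, pw)               # every value is 3*3 = 9

# parameter count vs a standard k x k convolution
k, c_in, c_out = 3, 8, 16
std_params = k * k * c_in * c_out       # standard conv: 1152
sep_params = k * k * c_in + c_in * c_out  # separable: 72 + 128 = 200
```

The roughly 5.8x parameter reduction in this configuration is what makes such classifiers deployable on edge accelerators like the Coral Edge TPU.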

    Multiscale Convolutional Neural Networks for Hand Detection

    Unconstrained hand detection in still images plays an important role in many hand-related vision problems, for example, hand tracking, gesture analysis, human action recognition, human–machine interaction, and sign language recognition. Although hand detection has been extensively studied for decades, it is still a challenging task with many problems to be tackled. The contributing factors for this complexity include heavy occlusion, low resolution, varying illumination conditions, different hand gestures, and the complex interactions between hands and objects or other hands. In this paper, we propose a multiscale deep learning model for unconstrained hand detection in still images. Deep learning models, and deep convolutional neural networks (CNNs) in particular, have achieved state-of-the-art performances in many vision benchmarks. Developed from the region-based CNN (R-CNN) model, we propose a hand detection scheme based on candidate regions generated by a generic region proposal algorithm, followed by multiscale information fusion from the popular VGG16 model. Two benchmark datasets were applied to validate the proposed method, namely, the Oxford Hand Detection Dataset and the VIVA Hand Detection Challenge. We achieved state-of-the-art results on the Oxford Hand Detection Dataset and satisfactory performance in the VIVA Hand Detection Challenge.
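In R-CNN-style detectors, the scored candidate regions are typically pruned with non-maximum suppression (NMS) so that each hand yields one detection. A minimal NMS sketch (a standard component, not the authors' code):

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def nms(boxes, scores, thresh=0.5):
    """Keep highest-scoring boxes, dropping any that overlap an already kept box."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= thresh for j in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 10, 10), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))   # [0, 2]: two distinct hand detections survive
```

The greedy score-ordered loop is the usual trade-off: simple and fast, at the cost of occasionally suppressing a genuinely distinct but heavily occluding hand.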

    Few-Shot User-Definable Radar-Based Hand Gesture Recognition at the Edge

    This work was supported in part by ITEA3 Unleash Potentials in Simulation (UPSIM) by the German Federal Ministry of Education and Research (BMBF) under Project 19006, in part by the Austrian Research Promotion Agency (FFG), in part by the Rijksdienst voor Ondernemend Nederland (Rvo), and in part by the Innovation Fund Denmark (IFD). Technological advances and scalability are leading Human-Computer Interaction (HCI) to evolve towards intuitive forms, such as gesture recognition. Among the various interaction strategies, radar-based recognition is emerging as a touchless, privacy-preserving, and versatile solution under different environmental conditions. Classical radar-based gesture HCI solutions involve deep learning but require training on large and varied datasets to achieve robust prediction. Innovative self-learning algorithms can help tackle this problem by recognizing patterns and adapting from similar contexts. Yet, such approaches are often computationally expensive and hard to integrate into hardware-constrained solutions. In this paper, we present a gesture recognition algorithm which is easily adaptable to new users and contexts. We exploit an optimization-based meta-learning approach to enable gesture recognition in learning sequences. This method aims to learn the best possible initialization of the model parameters, simplifying training on new contexts when small amounts of data are available. The reduction in computational cost is achieved by processing the radar-sensed gesture data in the form of time maps, minimizing the input data size. This approach enables the adaptation of a simple convolutional neural network (CNN) to new hand poses, thus easing the integration of the model into a hardware-constrained platform. Moreover, the use of a Variational Autoencoder (VAE) to reduce the gestures' dimensionality decreases the model size by an order of magnitude and halves the required adaptation time.
    The proposed framework, deployed on the Intel(R) Neural Compute Stick 2 (NCS 2), achieves an average accuracy of around 84% for unseen gestures when only one example per class is used at training time. The accuracy increases to 92.6% and 94.2% when three and five samples per class are used, respectively.
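Optimization-based meta-learning seeks an initialization from which a few gradient steps adapt the model to a new task. A toy sketch in the spirit of first-order methods such as Reptile, on scalar linear tasks (not the paper's CNN or radar setup): the meta-learned initialization ends up between the training-task solutions, so adapting to a new task is fast.

```python
import numpy as np

xs = np.array([1.0, 2.0])                 # shared inputs for every toy task

def loss_grad(w, a):
    """Gradient of MSE for the task y = a * x under the model y = w * x."""
    return 2.0 * np.mean((w * xs - a * xs) * xs)

def inner_adapt(w, a, lr=0.1, steps=1):
    """Task-specific adaptation: a few gradient steps from the shared init."""
    for _ in range(steps):
        w = w - lr * loss_grad(w, a)
    return w

# Reptile-style outer loop over training tasks with slopes 1 and 3
w_meta = 0.0
for _ in range(50):
    for a in (1.0, 3.0):
        w_adapted = inner_adapt(w_meta, a)
        w_meta += 0.5 * (w_adapted - w_meta)   # nudge the init toward adapted weights
```

After meta-training, `w_meta` sits near 2, midway between the task slopes, so one inner step on any new task in that family already lands close to its optimum; this is the few-shot effect the abstract exploits.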