15 research outputs found

    Large-scale Continuous Gesture Recognition Using Convolutional Neural Networks

    This paper addresses the problem of continuous gesture recognition from sequences of depth maps using convolutional neural networks (ConvNets). The proposed method first segments individual gestures from a depth sequence based on quantity of movement (QOM). For each segmented gesture, an Improved Depth Motion Map (IDMM), which converts the depth sequence into one image, is constructed and fed to a ConvNet for recognition. The IDMM effectively encodes both spatial and temporal information and allows fine-tuning of existing ConvNet models for classification without introducing millions of parameters to learn. The proposed method is evaluated on the Large-scale Continuous Gesture Recognition of the ChaLearn Looking at People (LAP) challenge 2016. It achieved a Mean Jaccard Index of 0.2655 and ranked 3rd in this challenge.
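
The Mean Jaccard Index reported above can be illustrated with a minimal sketch. It assumes that each gesture is described by the set of frame indices it covers and that scores are simply averaged over (predicted, ground-truth) pairs; the function names `jaccard_index` and `mean_jaccard` are hypothetical, and the official challenge scoring may differ in detail.

```python
def jaccard_index(predicted_frames, truth_frames):
    """Jaccard index (intersection over union) of two sets of frame indices."""
    predicted, truth = set(predicted_frames), set(truth_frames)
    union = predicted | truth
    if not union:
        # Assumption: two empty frame sets score 0 rather than raising an error.
        return 0.0
    return len(predicted & truth) / len(union)


def mean_jaccard(pairs):
    """Average the Jaccard index over (predicted, truth) frame-set pairs."""
    scores = [jaccard_index(predicted, truth) for predicted, truth in pairs]
    return sum(scores) / len(scores) if scores else 0.0
```

For example, a gesture predicted on frames 0-9 whose ground truth covers frames 5-14 shares 5 frames out of a union of 15, giving a Jaccard index of 1/3.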

    Human gesture classification by brute-force machine learning for exergaming in physiotherapy

    In this paper, a novel approach for human gesture classification on skeletal data is proposed for the application of exergaming in physiotherapy. Unlike existing methods, we propose to use a general classifier like Random Forests to recognize dynamic gestures. The temporal dimension is handled afterwards by majority voting in a sliding window over the consecutive predictions of the classifier. The gestures can have partially similar postures, such that the classifier will decide on the dissimilar postures. This brute-force classification strategy is permitted because dynamic human gestures show sufficiently dissimilar postures. Online continuous human gesture recognition can classify dynamic gestures at an early stage, which is a crucial advantage when controlling a game by automatic gesture recognition. Also, ground truth can be easily obtained, since all postures in a gesture get the same label, without any discretization into consecutive postures. This way, new gestures can be easily added, which is advantageous in adaptive game development. We evaluate our strategy by a leave-one-subject-out cross-validation on a self-captured stealth game gesture dataset and the publicly available Microsoft Research Cambridge-12 Kinect (MSRC-12) dataset. On the first dataset we achieve an excellent accuracy rate of 96.72%. Furthermore, we show that Random Forests perform better than Support Vector Machines. On the second dataset we achieve an accuracy rate of 98.37%, which is on average 3.57% better than existing methods.
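
The sliding-window majority voting described above might look roughly like the following sketch. It assumes the per-frame classifier has already produced one label per frame; the function name `smooth_predictions` and the default window size are illustrative assumptions, not values taken from the paper.

```python
from collections import Counter, deque


def smooth_predictions(frame_labels, window=5):
    """Replace each per-frame label by the majority label of the last `window` frames."""
    recent = deque(maxlen=window)  # oldest prediction drops out automatically
    smoothed = []
    for label in frame_labels:
        recent.append(label)
        # most_common(1) returns the label with the highest count in the window
        smoothed.append(Counter(recent).most_common(1)[0][0])
    return smoothed
```

Because the window only looks backwards, a label can be emitted as soon as each frame arrives, which matches the online setting the abstract describes. A larger window suppresses more isolated misclassifications at the cost of reacting later to a genuine change of gesture.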

    EmbodiMentor: a science fiction prototype to embody different perspectives using augmented reality

    Conference held at UTAD, Vila Real, 1-3 December 2016. This paper describes the EmbodiMentor, an interaction concept and metaphor that aims to enable users to embody a different person or character’s perspective, specify or modify his/her/its emotional elements and conditioning elements, and experience the resulting changes. Its use case scenario is the education and training of foreign languages and intercultural communication skills, where contextualization and first-person experiences in common settings are key for practical skill acquisition. It was born as the micro-science-fiction prototype “Frances can’t sleep. She crawls out of bed and with her EmbodiMentor runs through a range of a client’s emotional states, pitching to each one. She then falls asleep.” The application of the science fiction prototyping concept has proven to be a strong approach to develop and investigate innovative applications of emerging technologies.

    GESTURE RECOGNITION FOR PENCAK SILAT TAPAK SUCI REAL-TIME ANIMATION

    The main goal of this research is to design a real-time virtual martial arts training system, serving as a tool for learning martial arts independently, using genetic algorithm methods and dynamic time warping. The work reported in this paper is still at an initial stage and focuses on capturing data sets of martial arts warriors using 3D animation and Kinect sensor cameras: 2 warriors x 8 moves x 596 cases/gesture = 9,536 cases. Gesture recognition studies usually distinguish body gestures, hand and arm gestures, and head and face gestures; all three can be studied simultaneously in the martial art pencak silat, using stance detection with scoring methods. Silat movement data is recorded as oni files using the OpenNI™ (OFW) framework and as BVH (Bio Vision Hierarchical) files, together with plug-in support software on Mocap devices. Responsiveness is a measure of the time taken to respond to interruptions, and is critical because the system must be able to meet the demand.
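
Dynamic time warping, one of the two methods named above, can be sketched as follows. This is the textbook formulation on one-dimensional sequences with absolute difference as the local cost. The function name `dtw_distance` is hypothetical, and the abstract does not say which features or local distance the authors use on the recorded silat movements.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping cost between two 1-D sequences."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three allowed predecessor alignments
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# The dataset size quoted in the abstract: 2 warriors x 8 moves x 596 cases per gesture.
total_cases = 2 * 8 * 596
```

Two performances of the same move at different speeds, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, align with zero cost, which is why the technique suits comparing a learner's movement against a reference recording.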

    Depth Pooling Based Large-scale 3D Action Recognition with Convolutional Neural Networks

    This paper proposes three simple, compact yet effective representations of depth sequences, referred to respectively as Dynamic Depth Images (DDI), Dynamic Depth Normal Images (DDNI) and Dynamic Depth Motion Normal Images (DDMNI), for both isolated and continuous action recognition. These dynamic images are constructed from a segmented sequence of depth maps using hierarchical bidirectional rank pooling to effectively capture the spatial-temporal information. Specifically, DDI exploits the dynamics of postures over time, while DDNI and DDMNI exploit the 3D structural information captured by depth maps. Based on the proposed representations, a ConvNet-based method is developed for action recognition. The image-based representations enable us to fine-tune existing Convolutional Neural Network (ConvNet) models trained on image data without training a large number of parameters from scratch. The proposed method achieved state-of-the-art results on three large datasets, namely the Large-scale Continuous Gesture Recognition Dataset (mean Jaccard index 0.4109), the Large-scale Isolated Gesture Recognition Dataset (59.21%), and the NTU RGB+D Dataset (87.08% cross-subject and 84.22% cross-view), even though only the depth modality was used. Comment: arXiv admin note: text overlap with arXiv:1701.01814, arXiv:1608.0633

    Une étude sur la prise en compte simultanée de deux modalités pour la reconnaissance de gestes de SoundPainting

    Nowadays, gestures are being adopted as a new modality in the field of Human-Computer Interaction (HCI), where the physical movements of the whole body can perform almost unlimited actions. SoundPainting is a language of artistic composition that has been in use for more than forty years. However, work on the recognition of SoundPainting gestures is limited and does not take into account the movements of the fingers and the hand, which constitute an essential part of SoundPainting. In this context, we conducted a study to explore the combination of 3D postures and muscle activity for the recognition of SoundPainting gestures. To carry out this study, we created a SoundPainting database of 17 gestures with data from two sensors (Kinect and Myo). We formulated four hypotheses concerning the accuracy of recognition. The results allowed us to characterize the best sensor according to the typology of the gesture; to show that a "simple" combination of the two sensors does not necessarily improve recognition; to show that a combination of features is not necessarily more effective than a single well-chosen feature; and, finally, to show that changing the data acquisition rate of these sensors does not have a significant impact on the recognition of gestures.

    Human Motion Recognition: A Geometric Approach Using Deep Learning

    The ongoing advances in computer hardware and the ever-growing volume of image and video data have made Computer Vision a prosperous field of computer science. In this context, human motion recognition from video sequences, a research area adjacent to both Computer Vision and pattern recognition, reaps many benefits. This area of research has been backed by the recent introduction of challenging, large-scale, multimodal human action datasets, offering many opportunities for solving practical problems. Such solutions can be deployed in several aspects of our daily life, such as healthcare or social interaction. In this thesis, our goal is to provide a method that is able to recognize human actions in a multitude of demanding scenarios. To achieve this, we utilize spatio-temporal 3-D human joint data, processed with a set of image and geometric transformations, as well as a deep learning neural network architecture for motion classification.

    Progettazione e sviluppo di un framework per il riconoscimento real-time di gesture su Kinect

    Gesture is one of the most ancestral and immediate means of communication available to a human being. It can convey even very complex messages in a fraction of a second and gives both parties an optimal understanding. For this reason it has always been at the center of much research in computer science; the aim of this research is essentially to give people a more natural interaction with an entity that is by nature not very sociable: the computer. Teaching the basics of non-verbal communication to an object that is almost incapable of thinking is a truly complicated task; consider that today there are very few cases in which a computer is able to understand and recognize an emotion through non-verbal signals. For this reason many people remain skeptical about the actual usefulness and scalability of these technologies in the long term. It is undeniable, however, that at a historical moment like the present, in which technology and progress advance by leaps and bounds (quietly introducing continuous revolutions such as Industry 4.0), society is moving toward an increasingly automated world. Moreover, given the latest results obtained using neural networks combined with the enormous quantity of data available (big data), the step to reaching perfection in this field is short.