
Sparsity-Driven Micro-Doppler Feature Extraction for Dynamic Hand Gesture Recognition

Author: Griffiths H
Li G
Ritchie M
Zhang R
Publication venue: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Publication date: 01/04/2018

In this paper, a sparsity-driven method of micro-Doppler analysis is proposed for dynamic hand gesture recognition with radar sensors. First, sparse representations of the echoes reflected from dynamic hand gestures are obtained using a Gaussian-windowed Fourier dictionary. Second, the micro-Doppler features of dynamic hand gestures are extracted using the orthogonal matching pursuit algorithm. Finally, a nearest neighbor classifier is combined with the modified Hausdorff distance to recognize dynamic hand gestures based on the sparse micro-Doppler features. Experiments with real radar data show that the recognition accuracy of the proposed method exceeds 96% under moderate noise, and that the proposed method outperforms approaches based on principal component analysis and a deep convolutional neural network when the training dataset is small.
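The final classification step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each gesture is summarised as a small set of feature points (for example, time and frequency coordinates of the selected dictionary atoms), and it uses the standard definition of the modified Hausdorff distance, the larger of the two mean nearest-point distances between the sets. The function names and the feature layout are hypothetical.

```python
import numpy as np


def modified_hausdorff(a, b):
    """Modified Hausdorff distance between point sets a (n, d) and b (m, d).

    Assumption: each row is one sparse feature point, e.g. (time, frequency).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Pairwise Euclidean distances, shape (n, m).
    dists = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    # Mean distance from each point of a to its nearest point in b, and vice versa.
    d_ab = dists.min(axis=1).mean()
    d_ba = dists.min(axis=0).mean()
    return max(d_ab, d_ba)


def nearest_neighbor_label(query, train_sets, train_labels):
    """Return the label of the training set closest to the query set."""
    distances = [modified_hausdorff(query, s) for s in train_sets]
    return train_labels[int(np.argmin(distances))]


# Toy usage with made-up feature points (hypothetical data).
train_sets = [
    np.array([[0.0, 0.0], [1.0, 0.0]]),
    np.array([[5.0, 5.0], [6.0, 5.0]]),
]
train_labels = ["swipe", "snap"]
query = np.array([[0.1, 0.0], [0.9, 0.1]])
predicted = nearest_neighbor_label(query, train_sets, train_labels)
```

Because the distance compares point sets rather than fixed-length vectors, gestures with different numbers of extracted features can be compared directly.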

UCL Discovery

A discussion on the validation tests employed to compare human action recognition methods using the MSR Action3D dataset

Author: Chaaraoui Alexandros André
Flórez-Revuelta Francisco
Padilla-López José Ramón
Publication date: 29/07/2014

This paper aims to determine which is the best human action recognition method based on features extracted from RGB-D devices, such as the Microsoft Kinect. A review of all the papers that make reference to MSR Action3D, the most used dataset that includes depth information acquired from an RGB-D device, has been performed. We found that the validation method used by each work differs from the others, so a direct comparison among works cannot be made. However, almost all the works present their results comparing them without taking this issue into account. Therefore, we present different rankings according to the methodology used for the validation in order to clarify the existing confusion. Comment: 16 pages and 7 tables

arXiv.org e-Print Archive

Repositorio Institucional de la Universidad de Alicante

Effect of sparsity-aware time–frequency analysis on dynamic hand gesture classification with radar micro-Doppler signatures

Author: Fioranelli Francesco
Griffiths Hugh
Li Gang
Zhang Shimeng
Publication venue: Institution of Engineering and Technology (IET)
Publication date: 01/08/2018

Dynamic hand gesture recognition is of great importance in human-computer interaction. In this study, the authors investigate the effect of sparsity-driven time-frequency analysis on hand gesture classification. The time-frequency spectrogram is first obtained by sparsity-driven time-frequency analysis. Then three empirical micro-Doppler features are extracted from the time-frequency spectrogram, and a support vector machine is used to classify six kinds of dynamic hand gestures. The experimental results on measured data demonstrate that, compared to traditional time-frequency analysis techniques, sparsity-driven time-frequency analysis provides improved accuracy and robustness in dynamic hand gesture classification.

Crossref

Enlighten

ModDrop: adaptive multi-modal gesture recognition

Author: Nebout Florian
Neverova Natalia
Taylor Graham W.
Wolf Christian
Publication date: 06/06/2015

We present a method for gesture detection and localisation based on multi-scale and multi-modal deep learning. Each visual modality captures spatial information at a particular spatial scale (such as motion of the upper body or a hand), and the whole system operates at three temporal scales. Key to our technique is a training strategy which exploits: i) careful initialization of individual modalities; and ii) gradual fusion involving random dropping of separate channels (dubbed ModDrop) for learning cross-modality correlations while preserving uniqueness of each modality-specific representation. We present experiments on the ChaLearn 2014 Looking at People Challenge gesture recognition track, in which we placed first out of 17 teams. Fusing multiple modalities at several spatial and temporal scales leads to a significant increase in recognition rates, allowing the model to compensate for errors of the individual classifiers as well as noise in the separate channels. Furthermore, the proposed ModDrop training technique ensures robustness of the classifier to missing signals in one or several channels, so that it produces meaningful predictions from any number of available modalities. In addition, we demonstrate the applicability of the proposed fusion scheme to modalities of arbitrary nature by experiments on the same dataset augmented with audio. Comment: 14 pages, 7 figures
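The "random dropping of separate channels" mentioned in this abstract can be illustrated with a small sketch. This is an assumption-laden illustration and not the authors' code: it supposes each modality has already been encoded into a per-sample feature vector, and it zeroes out each modality independently per sample with a fixed probability before fusion by concatenation. The function name `moddrop`, the drop probability, and the absence of any rescaling are all hypothetical choices; the abstract does not specify them.

```python
import numpy as np


def moddrop(modalities, drop_prob, rng):
    """Randomly zero whole modalities per sample, then concatenate.

    modalities: list of arrays, each of shape (batch, features_k).
    drop_prob:  probability of dropping a modality for a given sample
                (hypothetical hyperparameter).
    Returns the fused array and the boolean keep mask (batch, n_modalities).
    """
    batch = modalities[0].shape[0]
    # One Bernoulli draw per (sample, modality): True means the channel is kept.
    keep = rng.random((batch, len(modalities))) >= drop_prob
    # Broadcast each modality's keep column over its feature dimension.
    dropped = [x * keep[:, k:k + 1] for k, x in enumerate(modalities)]
    return np.concatenate(dropped, axis=1), keep


# Toy usage with made-up features for three modalities (hypothetical shapes).
rng = np.random.default_rng(0)
video = np.ones((4, 3))
depth = np.ones((4, 2))
audio = np.ones((4, 5))
fused, keep = moddrop([video, depth, audio], drop_prob=0.2, rng=rng)
```

Dropping is applied only during training in this sketch; at inference time a genuinely missing modality would simply be passed in as zeros.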

arXiv.org e-Print Archive

HAL

Hal-Diderot

Comparative Evaluation of Action Recognition Methods via Riemannian Manifolds, Fisher Vectors and GMMs: Ideal and Challenging Conditions

Author: CM Bishop
D Weinland
DA Bini
F Perronnin
G Csurka
I Traore
J Aggarwal
J Sánchez
K Guo
MT Harandi
N Aggarwal
P Turaga
R Poppe
S Ali
S Hirose
SR Ke
V Arsigny
Y Wu
Ó Pérez
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

We present a comparative evaluation of various techniques for action recognition while keeping as many variables as possible controlled. We employ two categories of Riemannian manifolds: symmetric positive definite matrices and linear subspaces. For both categories we use their corresponding nearest neighbour classifiers, kernels, and recent kernelised sparse representations. We compare against traditional action recognition techniques based on Gaussian mixture models and Fisher vectors (FVs). We evaluate these action recognition techniques under ideal conditions, as well as their sensitivity in more challenging conditions (variations in scale and translation). Despite recent advancements for handling manifolds, manifold-based techniques obtain the lowest performance, and their kernel representations are more unstable in the presence of challenging conditions. The FV approach obtains the highest accuracy under ideal conditions. Moreover, FV best deals with moderate scale and translation changes.

arXiv.org e-Print Archive

Crossref

University of Queensland eSpace