Real-time human action and gesture recognition using skeleton joints information towards medical applications
There have been significant efforts to improve the accuracy of human action detection using skeleton joints. Recognizing human activities in a noisy environment is still challenging, since the Cartesian coordinates of the skeleton joints provided by a depth camera depend on both the camera position and the skeleton position.
In some human-computer interaction applications, the skeleton position and camera position keep changing. The proposed method recommends using relative positional values instead of raw Cartesian coordinate values. Recent advancements in CNNs help us achieve higher prediction accuracy using input in image format. To represent skeleton joints as an image, we need to represent the skeleton information as a matrix with equal height and width. With some depth cameras, the number of skeleton joints provided is limited, and we need to depend on relative positional values to obtain a matrix representation of the skeleton joints. With this new representation of skeleton joints, we achieve state-of-the-art prediction accuracy on the MSR dataset. We used frame shifting instead of interpolation between frames, which also helps us achieve state-of-the-art performance.
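The two preprocessing ideas above (camera-invariant relative positions, and packing the joints into a square matrix for CNN input) can be sketched as follows. This is a minimal illustration, not the paper's exact layout: the 20-joint count matches the MSR Action3D skeletons, but the reference joint, padding scheme, and matrix ordering are assumptions.

```python
import numpy as np

def joints_to_relative_image(joints, ref_idx=0):
    """Convert one frame of skeleton joints (J x 3 Cartesian
    coordinates) into a square matrix of relative positions,
    which no longer depend on where the camera or body sits."""
    rel = joints - joints[ref_idx]           # position relative to a root joint
    flat = rel.flatten()                     # J * 3 values
    side = int(np.ceil(np.sqrt(flat.size)))  # smallest square that fits them
    padded = np.zeros(side * side)
    padded[:flat.size] = flat                # zero-pad the remainder
    return padded.reshape(side, side)        # equal height and width

# Example: 20 joints, as in the MSR Action3D skeletons
frame = np.random.rand(20, 3)
img = joints_to_relative_image(frame)
print(img.shape)  # (8, 8) -- 60 values padded into an 8x8 matrix
```

A stack of such matrices over time then forms the image-like input a CNN expects.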
Deep Learning-Based Action Recognition
The classification of human action or behavior patterns is very important for analyzing situations in the field and maintaining social safety. This book focuses on recent research findings on recognizing human action patterns. Technologies for recognizing human action patterns include processing human behavior data for learning, expressing feature values of images, extracting spatiotemporal information from images, recognizing human posture, and recognizing gestures. Research on these technologies has recently been conducted using general deep learning network models from artificial intelligence, and excellent research results are included in this edition.
Densely connected GCN model for motion prediction
© 2020 The Authors. Computer Animation and Virtual Worlds published by John Wiley & Sons, Ltd. Human motion prediction is a fundamental problem in understanding natural human movements. The task is very challenging due to complex human body constraints and the diversity of action types. Because the human body is naturally a graph, graph convolutional network (GCN)-based models outperform traditional recurrent neural network (RNN)-based models at modeling the spatial and temporal dependencies in motion data. In this paper, we develop GCN-based models further by adding densely connected links to increase feature utilization and address the oversmoothing problem. More specifically, a GCN block learns the spatial relationships between nodes, and the feature map of each GCN block propagates directly to every following block as input, rather than through residual links. In this way, the spatial dependencies of human motion data are exploited more fully, and features at different scales are fused more efficiently. Extensive experiments demonstrate that our model achieves state-of-the-art results on the CMU dataset.
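The dense-connection pattern described above (each block receiving the concatenation of all earlier feature maps, instead of a residual sum) can be sketched in a few lines. This is a hedged NumPy toy, assuming a simple ReLU graph convolution and a placeholder adjacency; the published model's normalization, depth, and training details differ.

```python
import numpy as np

def gcn_block(A_hat, X, W):
    """One graph-convolution block: propagate features over the
    normalized adjacency, then apply a linear map with ReLU."""
    return np.maximum(A_hat @ X @ W, 0.0)

def dense_gcn(A_hat, X, weights):
    """Densely connected stack: every block receives the
    concatenation of ALL earlier feature maps as input, so
    low-level features are reused rather than washed out."""
    feats = [X]
    for W in weights:
        inp = np.concatenate(feats, axis=1)  # dense links: reuse all features
        feats.append(gcn_block(A_hat, inp, W))
    return np.concatenate(feats, axis=1)

rng = np.random.default_rng(0)
N, F, H = 25, 3, 8                  # 25 joints, 3-D input, hidden width 8
A = np.eye(N)                       # stand-in for a normalized skeleton adjacency
Ws = [rng.standard_normal((F, H)),
      rng.standard_normal((F + H, H))]
out = dense_gcn(A, rng.standard_normal((N, F)), Ws)
print(out.shape)  # (25, 19) -- 3 + 8 + 8 concatenated channels
```

Note how each weight matrix's input width grows with the accumulated channels; that growth is exactly what distinguishes dense links from residual addition, where the width would stay fixed.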
Dynamic Dense Graph Convolutional Network for Skeleton-based Human Motion Prediction
Graph Convolutional Networks (GCNs), which typically follow a neural message passing framework to model dependencies among skeletal joints, have achieved great success in skeleton-based human motion prediction. Nevertheless, how to construct a graph from a skeleton sequence and how to perform message passing on that graph remain open problems, which severely affect the performance of GCNs. To solve both problems, this paper presents a Dynamic Dense Graph Convolutional Network (DD-GCN), which constructs a dense graph and implements integrated dynamic message passing. More specifically, we construct a dense graph with 4D adjacency modeling as a comprehensive representation of the motion sequence at different levels of abstraction. Based on the dense graph, we propose a dynamic message passing framework that learns from data to generate distinctive messages reflecting sample-specific relevance among the nodes in the graph. Extensive experiments on the benchmark Human3.6M and CMU Mocap datasets verify the effectiveness of our DD-GCN, which clearly outperforms state-of-the-art GCN-based methods, especially under long-term and our proposed extremely long-term protocols.
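The core idea of dynamic, sample-specific message passing can be illustrated with a small sketch: edge relevance is computed from the current node features rather than fixed in advance, so each input sequence induces its own messages. This toy only conveys the idea; DD-GCN's 4D adjacency modeling and exact update rule are more elaborate, and the names below are illustrative.

```python
import numpy as np

def dynamic_message_passing(X, W_msg):
    """Sample-specific message passing: pairwise relevance is
    derived from the node features themselves (row-softmax of
    scaled dot products), then used to aggregate messages."""
    scores = X @ X.T / np.sqrt(X.shape[1])          # pairwise relevance
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)   # data-dependent adjacency
    return weights @ X @ W_msg                      # aggregate weighted messages

rng = np.random.default_rng(1)
X = rng.standard_normal((25, 16))                   # 25 joints, 16-d features
msgs = dynamic_message_passing(X, rng.standard_normal((16, 16)))
print(msgs.shape)  # (25, 16)
```

Because `weights` is recomputed from `X` on every call, two different motion samples yield two different effective graphs, which is the contrast with a static, hand-designed skeleton adjacency.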
D-STGCNT: A Dense Spatio-Temporal Graph Conv-GRU Network based on transformer for assessment of patient physical rehabilitation
This paper tackles the challenge of automatically assessing physical rehabilitation exercises for patients who perform the exercises without clinician supervision. The objective is to provide a quality score to ensure correct performance and achieve the desired results. To achieve this goal, a new graph-based model, the Dense Spatio-Temporal Graph Conv-GRU Network with Transformer, is introduced. This model combines a modified version of STGCN and transformer architectures for efficient handling of spatio-temporal data. The key idea is to treat skeleton data, respecting its non-linear structure, as a graph and to detect the joints that play the main role in each rehabilitation exercise. Dense connections and GRU mechanisms are used to rapidly process large 3D skeleton inputs and effectively model temporal dynamics. The transformer encoder's attention mechanism focuses on relevant parts of the input sequence, making it useful for evaluating rehabilitation exercises. The evaluation of our proposed approach on the KIMORE and UI-PRMD datasets highlighted its potential, surpassing state-of-the-art methods in terms of accuracy and computational time. This results in faster and more accurate learning and assessment of rehabilitation exercises. Additionally, our model provides valuable feedback through qualitative illustrations, effectively highlighting the significance of joints in specific exercises. Comment: 15 pages, Computers in Biology and Medicine Journal
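The temporal half of such a pipeline (a GRU summarizing skeleton frames, with a linear head producing a quality score) can be sketched as below. This is a minimal assumption-laden toy, not the D-STGCNT model: the weight names, the flattened-frame input, and the scalar scoring head are all illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(h, x, Wz, Wr, Wh):
    """One GRU update over the concatenation [h, x]; the gates
    decide how much of the running exercise summary to keep."""
    hx = np.concatenate([h, x])
    z = sigmoid(Wz @ hx)                             # update gate
    r = sigmoid(Wr @ hx)                             # reset gate
    h_tilde = np.tanh(Wh @ np.concatenate([r * h, x]))
    return (1 - z) * h + z * h_tilde

def score_sequence(frames, Wz, Wr, Wh, w_out):
    """Run the GRU over flattened skeleton frames and map the
    final hidden state to a scalar quality score."""
    h = np.zeros(Wz.shape[0])
    for x in frames:                                 # iterate over time
        h = gru_step(h, x, Wz, Wr, Wh)
    return float(w_out @ h)

rng = np.random.default_rng(2)
T, D, H = 50, 60, 8                  # 50 frames of 20 joints x 3 coords
frames = rng.standard_normal((T, D))
Wz, Wr, Wh = (rng.standard_normal((H, H + D)) * 0.1 for _ in range(3))
score = score_sequence(frames, Wz, Wr, Wh, rng.standard_normal(H))
```

In the full model, transformer attention over the sequence would additionally reweight frames before this summary, so that the clinically relevant portion of the exercise dominates the score.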
Attention module-based spatial-temporal graph convolutional networks for skeleton-based action recognition
Skeleton-based action recognition is an important direction within human action recognition, because the skeleton contains key information for recognizing actions. Spatial-temporal graph convolutional networks (ST-GCN) automatically learn both temporal and spatial features from skeleton data and achieve remarkable performance for skeleton-based action recognition. However, ST-GCN only learns local information within a certain neighborhood and does not capture the correlation between all joints (i.e., global information). Therefore, we need to introduce global information into ST-GCN. We propose a dynamic skeleton model, the attention module-based ST-GCN, which solves these problems by adding an attention module. The attention module can capture global information, which brings stronger expressive power and generalization capability. Experimental results on two large-scale datasets, Kinetics and NTU-RGB+D, demonstrate that our model achieves significant improvements over previous representative methods. © 2019 SPIE and IS&T
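The difference between the local graph convolution and the added attention module can be sketched as a single global update: pairwise joint affinities give every joint access to every other joint, not just its graph neighborhood. This is a hedged illustration of the idea only; the paper's module has its own learned parameterization.

```python
import numpy as np

def joint_attention(features):
    """Global attention over joints: softmax-normalized
    joint-to-joint affinities reweight and mix the features,
    then a residual connection adds them back, so each joint
    sees correlations with all other joints."""
    d = features.shape[1]
    affinity = features @ features.T / np.sqrt(d)      # joint-to-joint correlation
    affinity = np.exp(affinity - affinity.max(axis=1, keepdims=True))
    affinity /= affinity.sum(axis=1, keepdims=True)    # normalized global weights
    return features + affinity @ features              # residual attention update

rng = np.random.default_rng(3)
feats = rng.standard_normal((18, 32))  # e.g., 18 joints (Kinetics skeletons), 32-d
out = joint_attention(feats)
print(out.shape)  # (18, 32)
```

Contrast this with a graph convolution, where a joint's update mixes only the features of its fixed skeletal neighbors; here the mixing weights span all joints and depend on the input itself.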