
    Semantic Graph Convolutional Networks for 3D Human Pose Regression

    In this paper, we study the problem of learning Graph Convolutional Networks (GCNs) for regression. Current GCN architectures are limited by the small receptive field of their convolution filters and by the shared transformation matrix applied to every node. To address these limitations, we propose Semantic Graph Convolutional Networks (SemGCN), a novel neural network architecture that operates on regression tasks with graph-structured data. SemGCN learns to capture semantic information, such as local and global node relationships, that is not explicitly represented in the graph. These semantic relationships can be learned through end-to-end training from the ground truth without additional supervision or hand-crafted rules. We further investigate applying SemGCN to 3D human pose regression. Our formulation is intuitive and sufficient since both 2D and 3D human poses can be represented as a structured graph encoding the relationships between joints in the skeleton of a human body. We carry out comprehensive studies to validate our method. The results prove that SemGCN outperforms the state of the art while using 90% fewer parameters. Comment: In CVPR 2019 (13 pages including supplementary material). The code can be found at https://github.com/garyzhao/SemGC
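    A minimal PyTorch sketch of the core idea in this abstract: a graph convolution whose fixed skeleton adjacency is modulated by learnable per-edge weights, so node relationships are learned end-to-end. The class name, softmax normalisation and layer details are illustrative assumptions, not the authors' released code (see the linked repository for that).

```python
# Minimal sketch of a semantic graph convolution layer in the spirit of SemGCN:
# the binary skeleton adjacency is modulated by a learnable weight matrix so
# each edge gets its own importance, learned end-to-end from the ground truth.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemGraphConv(nn.Module):
    def __init__(self, in_features, out_features, adj):
        super().__init__()
        self.W = nn.Linear(in_features, out_features, bias=False)
        # Fixed skeleton connectivity (num_joints x num_joints, 0/1, with self-loops).
        self.register_buffer("adj", adj)
        # Learnable per-edge weights ("semantic" relationships).
        self.edge_weight = nn.Parameter(torch.zeros_like(adj))

    def forward(self, x):
        # x: (batch, num_joints, in_features)
        # Mask out non-edges, then normalise weights over each node's neighbours.
        logits = self.edge_weight.masked_fill(self.adj == 0, float("-inf"))
        attn = F.softmax(logits, dim=-1)   # (num_joints, num_joints)
        return attn @ self.W(x)            # aggregate transformed neighbour features
```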

    Skeleton-based human action and gesture recognition for human-robot collaboration

    The continuous development of robotic and sensing technologies has led in recent years to an increased interest in human-robot collaborative systems, in which humans and robots perform tasks in shared spaces and interact in close and direct contact. In these scenarios, it is fundamental for the robot to be aware of the behaviour of the people in its proximity, both to ensure their safety and to anticipate their actions when performing a shared, collaborative task. To this end, human activity recognition (HAR) techniques have often been applied in human-robot collaboration (HRC) settings. Works in this field usually focus on case-specific applications. Instead, in this thesis we propose a general framework for human action and gesture recognition in an HRC scenario. In particular, a transfer-learning-enabled skeleton-based approach that employs the Shift-GCN architecture as its backbone is used to classify general actions related to HRC scenarios. Pose-based body and hand features are exploited to recognise actions in a way that is independent of the environment in which they are performed and of the tools and objects involved in their execution. The fusion of small network modules, each dedicated to recognising either body or hand movements, is then explored. This allows us to better understand the importance of different body parts in recognising the actions, as well as to improve the classification outcomes. For our experiments, we used the large-scale NTU RGB+D dataset to pre-train the networks. Moreover, a new HAR dataset, named the IAS-Lab Collaborative HAR dataset, was collected, containing general actions and gestures related to HRC contexts. On this dataset, our approach reaches 76.54% accuracy.
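    A rough sketch, under stated assumptions, of the body/hands fusion idea described above: two independent skeleton backbones (stand-ins for the pre-trained Shift-GCN modules) score the body and hand joint streams separately, and their class scores are combined with a learnable weight. The module names and the weighted-sum fusion are hypothetical choices for illustration.

```python
# Minimal sketch of per-part late fusion: independent skeleton classifiers score
# the body and hand joint streams, and their logits are blended by a learnable weight.
import torch
import torch.nn as nn

class PartFusionClassifier(nn.Module):
    def __init__(self, body_net: nn.Module, hands_net: nn.Module):
        super().__init__()
        self.body_net = body_net     # backbone for body-joint sequences (e.g. pre-trained)
        self.hands_net = hands_net   # backbone for hand-joint sequences
        # Learnable balance between the two streams.
        self.alpha = nn.Parameter(torch.tensor(0.5))

    def forward(self, body_seq, hands_seq):
        # body_seq / hands_seq: skeleton sequences restricted to the two joint subsets.
        body_logits = self.body_net(body_seq)     # (batch, num_classes)
        hands_logits = self.hands_net(hands_seq)  # (batch, num_classes)
        a = torch.sigmoid(self.alpha)
        return a * body_logits + (1 - a) * hands_logits
```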

    Skeletal Human Action Recognition using Hybrid Attention based Graph Convolutional Network

    In skeleton-based action recognition, Graph Convolutional Networks model human skeletal joints as vertices and connect them through an adjacency matrix, which can be seen as a local attention mask. However, in most existing Graph Convolutional Networks this local attention mask is defined by the natural connections of the human skeleton and ignores dynamic relations, for example between the head, hand and foot joints. In addition, the attention mechanism, which has proven effective in Natural Language Processing and image captioning, is rarely investigated in existing methods. In this work, we propose a new adaptive spatial attention layer that extends the local attention map to a global one based on relative distance and relative angle information. Moreover, we design a new initial graph adjacency matrix that connects the head, hands and feet, which yields a visible improvement in action recognition accuracy. The proposed model is evaluated on two large-scale and challenging datasets in the field of daily-life human activities: NTU-RGB+D and Kinetics Skeleton. The results demonstrate that our model achieves strong performance on both datasets. Comment: 26th International Conference on Pattern Recognition, 202
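    The global spatial attention idea above can be illustrated with a small sketch: pairwise relative distances and angles between joints produce a global attention map that is combined with the local skeleton adjacency mask. The exact scoring and combination rule here are assumptions for illustration only, not the paper's formulation.

```python
# Minimal sketch: extend a local (skeleton) attention mask to a global one using
# pairwise relative distance and a crude angle proxy between joint coordinates.
import torch
import torch.nn.functional as F

def spatial_attention(joints, adjacency):
    # joints: (num_joints, 3) 3D joint coordinates; adjacency: (J, J) 0/1 skeleton mask.
    diff = joints[:, None, :] - joints[None, :, :]        # (J, J, 3) relative vectors
    dist = diff.norm(dim=-1)                              # pairwise relative distances
    # Relative-angle proxy: cosine between joint position vectors (one possible choice).
    cos = F.cosine_similarity(joints[:, None, :], joints[None, :, :], dim=-1)  # (J, J)
    # Closer joints and more aligned directions receive higher attention.
    global_attn = torch.softmax(-dist + cos, dim=-1)
    # Combine the local skeleton mask with the global attention term.
    return adjacency.float() + global_attn
```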

    Deep Learning on Lie Groups for Skeleton-based Action Recognition

    In recent years, skeleton-based action recognition has become a popular 3D classification problem. State-of-the-art methods typically first represent each motion sequence as a high-dimensional trajectory on a Lie group with additional dynamic time warping, and then shallowly learn favorable Lie group features. In this paper we incorporate the Lie group structure into a deep network architecture to learn more appropriate Lie group features for 3D action recognition. Within the network structure, we design rotation mapping layers to transform the input Lie group features into desirable ones that are better aligned in the temporal domain. To reduce the high feature dimensionality, the architecture is equipped with rotation pooling layers for the elements on the Lie group. Furthermore, we propose a logarithm mapping layer to map the resulting manifold data into a tangent space that facilitates the application of regular output layers for the final classification. Evaluations of the proposed network on standard 3D human action recognition datasets clearly demonstrate its superiority over existing shallow Lie group feature learning methods as well as most conventional deep learning methods. Comment: Accepted to CVPR 201
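    The logarithm mapping layer mentioned above can be illustrated with the standard SO(3) log map, which sends rotation matrices to axis-angle vectors in the tangent space so that ordinary output layers can be applied. This is a generic sketch of that map under the usual numerical safeguards, not the authors' implementation.

```python
# Minimal sketch of a logarithm mapping step: rotation matrices (Lie group SO(3)
# elements) are mapped to tangent-space axis-angle vectors in so(3).
import torch

def log_map_so3(R, eps=1e-7):
    # R: (..., 3, 3) batch of rotation matrices.
    trace = R[..., 0, 0] + R[..., 1, 1] + R[..., 2, 2]
    # Rotation angle, clamped away from 0 and pi for numerical stability.
    theta = torch.acos(((trace - 1.0) / 2.0).clamp(-1.0 + eps, 1.0 - eps))
    # Unit rotation axis from the skew-symmetric part of R.
    skew = (R - R.transpose(-1, -2)) / (2.0 * torch.sin(theta)[..., None, None])
    axis = torch.stack([skew[..., 2, 1], skew[..., 0, 2], skew[..., 1, 0]], dim=-1)
    # Axis-angle vector in the tangent space, usable by regular (e.g. linear) layers.
    return axis * theta[..., None]
```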