Deep Learning-Based Action Recognition
The classification of human action or behavior patterns is very important for analyzing situations in the field and maintaining social safety. This book focuses on recent research findings on recognizing human action patterns. Technologies for recognizing human action patterns include the processing of human behavior data for learning, the expression of image feature values, the extraction of spatiotemporal information from images, human posture recognition, and gesture recognition. Research on these technologies has recently been conducted using deep learning network models, and excellent research results have been included in this edition.
Point Contrastive Prediction with Semantic Clustering for Self-Supervised Learning on Point Cloud Videos
We propose a unified point cloud video self-supervised learning framework for
object-centric and scene-centric data. Previous methods commonly conduct
representation learning at the clip or frame level and cannot well capture
fine-grained semantics. Instead of contrasting the representations of clips or
frames, in this paper, we propose a unified self-supervised framework by
conducting contrastive learning at the point level. Moreover, we introduce a
new pretext task by achieving semantic alignment of superpoints, which further
facilitates the representations to capture semantic cues at multiple scales. In
addition, due to the high redundancy in the temporal dimension of dynamic point
clouds, directly conducting contrastive learning at the point level usually
leads to massive undesired negatives and insufficient modeling of positive
representations. To remedy this, we propose a selection strategy to retain
proper negatives and make use of high-similarity samples from other instances
as positive supplements. Extensive experiments show that our method outperforms
supervised counterparts on a wide range of downstream tasks and demonstrates
the superior transferability of the learned representations. Comment: Accepted by ICCV 202
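The point-level contrastive idea described in this abstract can be illustrated with a generic InfoNCE-style loss computed for a single point feature. This is only a minimal sketch under stated assumptions, not the authors' implementation: the function names, the temperature value, and the similarity threshold used to discard overly similar negatives are all hypothetical, and the paper's actual selection strategy and superpoint alignment task are not reproduced here. Feature vectors are assumed to be nonzero.

```python
import math


def _dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def _unit(v):
    """Scale a nonzero vector to unit length."""
    norm = math.sqrt(_dot(v, v))
    return [x / norm for x in v]


def select_negatives(anchor, candidates, max_similarity=0.9):
    """Hypothetical filter: drop candidates whose cosine similarity to the
    anchor is at or above max_similarity, since temporally redundant points
    are likely to be unwanted negatives."""
    a = _unit(anchor)
    return [c for c in candidates if _dot(a, _unit(c)) < max_similarity]


def point_info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss for one anchor point feature (assumed form)."""
    a = _unit(anchor)
    pos = math.exp(_dot(a, _unit(positive)) / temperature)
    neg = sum(math.exp(_dot(a, _unit(n)) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


anchor = [1.0, 0.0, 0.0]
positive = [0.9, 0.1, 0.0]
candidates = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.01, 0.0]]

kept = select_negatives(anchor, candidates)  # the near-duplicate is dropped
loss = point_info_nce(anchor, positive, kept)
```

The loss is small when the positive lies close to the anchor and grows when the positive is no more similar to the anchor than the retained negatives are.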
Duodepth: Static Gesture Recognition Via Dual Depth Sensors
Static gesture recognition is an effective non-verbal communication channel
between a user and their devices; however, many modern methods are sensitive to
the relative pose of the user's hands with respect to the capture device, as
parts of the gesture can become occluded. We present two methodologies for
gesture recognition via synchronized recording from two depth cameras to
alleviate this occlusion problem. One is a more classic approach using
iterative closest point registration to accurately fuse point clouds and a
single PointNet architecture for classification, and the other is a dual
PointNet architecture for classification without registration. On a manually
collected dataset of 20,100 point clouds we show a 39.2% reduction in
misclassification for the fused point cloud method, and 53.4% for the dual
PointNet, when compared to a standard single-camera pipeline. Comment: 26th International Conference on Image Processin
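The fused-cloud pipeline mentioned in this abstract relies on bringing the second camera's points into the first camera's frame before classification. The sketch below shows only that final fusion step, assuming a rigid transform (rotation matrix and translation vector) has already been estimated, for example by iterative closest point registration. The function names and the example transform are hypothetical and are not taken from the paper.

```python
def apply_rigid(points, rotation, translation):
    """Apply p' = R p + t to every 3D point.

    rotation is a 3x3 matrix given as nested sequences and translation is a
    length-3 sequence; both are assumed to come from a prior registration.
    """
    transformed = []
    for p in points:
        transformed.append(tuple(
            sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
            for i in range(3)
        ))
    return transformed


def fuse_clouds(cloud_a, cloud_b, rotation, translation):
    """Move cloud_b into cloud_a's frame and concatenate the two clouds."""
    return list(cloud_a) + apply_rigid(cloud_b, rotation, translation)


# Illustrative transform: 90 degree rotation about the z axis, shift of 1 in z.
rotation = [[0, -1, 0],
            [1, 0, 0],
            [0, 0, 1]]
translation = (0, 0, 1)

cloud_a = [(0, 0, 0)]
cloud_b = [(1, 0, 0)]
fused = fuse_clouds(cloud_a, cloud_b, rotation, translation)
```

The registration-free alternative described in the abstract would skip this step and pass each cloud to its own network branch instead.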
Temporal pyramid Matching of local binary sub-patterns for hand-gesture recognition
Human–computer interaction systems based on hand-gesture recognition are nowadays of great interest for establishing natural communication between humans and machines. However, the visual recognition of gestures and other human poses remains a challenging problem. In this paper, the original volumetric spatiograms of local binary patterns descriptor is extended to efficiently and robustly encode the spatial and temporal information of hand gestures. This enhancement mitigates the dimensionality problems of the previous approach and considers more temporal information to achieve a higher recognition rate. Excellent results have been obtained, outperforming other state-of-the-art approaches.
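For readers unfamiliar with local binary patterns, the basic building block can be sketched as follows. This shows only the classic 8-neighbor code for a single 3x3 patch, not the volumetric spatiogram descriptor or the temporal pyramid matching described in the abstract. The function name, the neighbor ordering, and the use of a greater-or-equal comparison are assumptions chosen for illustration.

```python
def lbp_code(patch):
    """Return the 8-bit local binary pattern code of a 3x3 patch.

    Each neighbor is compared with the center pixel; a neighbor that is
    greater than or equal to the center sets one bit. Neighbors are visited
    clockwise starting at the top-left corner (an assumed convention).
    """
    center = patch[1][1]
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2),
               (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (row, col) in enumerate(offsets):
        if patch[row][col] >= center:
            code |= 1 << bit
    return code


flat_patch = [[5, 5, 5],
              [5, 5, 5],
              [5, 5, 5]]
corner_patch = [[9, 1, 1],
                [1, 5, 1],
                [1, 1, 1]]

flat_code = lbp_code(flat_patch)      # every neighbor equals the center
corner_code = lbp_code(corner_patch)  # only the top-left neighbor is brighter
```

Histograms of such codes over image regions, and over time for video, are what descriptors of this family typically aggregate.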