
    Hierarchical long short-term memory for action recognition based on 3D skeleton joints from Kinect sensor

    Action recognition has been used in a wide range of applications such as human-computer interaction, intelligent video surveillance systems, video summarization, and robotics. Recognizing actions is important for intelligent agents to understand, learn from, and interact with their environment. Recent technology that allows the acquisition of RGB+D and 3D skeleton data, together with advances in deep learning models, has significantly increased the performance of action recognition models. In this research, a hierarchical Long Short-Term Memory (LSTM) network is proposed to recognize actions based on 3D skeleton joints from a Kinect sensor. The model takes the 3D axes of the skeleton joints and groups the joints in each axis into parts, namely the spine, left and right arm, left and right hand, and left and right leg. To fit the hierarchically structured layers of the LSTM, the parts are concatenated into spine, arms, hands, and legs, and then concatenated into a body. The model fuses the per-axis body representations into a single final body representation, which is fed to the final layer to classify the action. Performance is measured using cross-view and cross-subject evaluation, achieving accuracies of 0.854 and 0.837, respectively, on 10 action classes of the NTU RGB+D dataset.
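    As a rough illustration of the hierarchy this abstract describes, the PyTorch sketch below runs one LSTM per body part and a body-level LSTM over the concatenated part features. The part names, joint counts, layer sizes, and the single-level fusion (the paper additionally fuses per-axis body representations) are all assumptions for illustration, not the authors' exact architecture.

        import torch
        import torch.nn as nn

        class HierarchicalLSTM(nn.Module):
            # Illustrative joint grouping; the paper's exact split may differ.
            PARTS = {"spine": 4, "left_arm": 3, "right_arm": 3,
                     "left_hand": 2, "right_hand": 2, "left_leg": 3, "right_leg": 3}

            def __init__(self, hidden=64, num_classes=10):
                super().__init__()
                # One LSTM per part; per-frame input is joints_in_part * 3 coordinates.
                self.part_lstms = nn.ModuleDict({
                    name: nn.LSTM(n * 3, hidden, batch_first=True)
                    for name, n in self.PARTS.items()})
                # Body-level LSTM fuses the concatenated part features over time.
                self.body_lstm = nn.LSTM(hidden * len(self.PARTS), hidden, batch_first=True)
                self.classifier = nn.Linear(hidden, num_classes)

            def forward(self, parts):
                # parts: dict part name -> tensor (batch, time, joints_in_part * 3)
                feats = [self.part_lstms[name](parts[name])[0] for name in self.PARTS]
                out, _ = self.body_lstm(torch.cat(feats, dim=-1))
                return self.classifier(out[:, -1])   # classify from the last time step

        model = HierarchicalLSTM()
        batch = {name: torch.randn(2, 20, n * 3) for name, n in HierarchicalLSTM.PARTS.items()}
        print(model(batch).shape)   # torch.Size([2, 10])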

    Two-Stage Human Activity Recognition Using 2D-ConvNet

    There is a huge requirement for continuous intelligent monitoring systems for human activity recognition in various domains such as public places, automated teller machines, and the healthcare sector. The increasing demand for automatic recognition of human activity in these sectors, and the need to reduce the cost of manual surveillance, have motivated the research community towards deep learning techniques for designing and developing smart monitoring systems for the recognition of human activities. Because of the low cost, high resolution, and easy availability of surveillance cameras, the authors developed a new two-stage intelligent framework for the detection and recognition of human activity types inside the premises. This paper introduces a novel framework to recognize single-limb and multi-limb human activities using a Convolutional Neural Network. In the first stage, single-limb and multi-limb activities are separated. Next, these separated single-limb and multi-limb activities are recognized using sequence classification. For training and validation of the framework, the UTKinect-Action dataset was used, containing 199 action sequences performed by 10 users. The framework achieves an overall accuracy of 97.88% in real-time recognition of the activity sequences.
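    To make the two-stage idea concrete, here is a minimal PyTorch sketch in which a small per-frame 2D-ConvNet feeds a stage-1 gate (single-limb vs. multi-limb) and a GRU-based sequence classifier with one head per activity group. The layer sizes, gating strategy, and class counts are assumptions; the paper's actual networks are not specified here.

        import torch
        import torch.nn as nn

        class TwoStageRecognizer(nn.Module):
            def __init__(self, feat=128, n_single=5, n_multi=5):
                super().__init__()
                # Per-frame 2D-ConvNet feature extractor (sizes are illustrative).
                self.backbone = nn.Sequential(
                    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, feat), nn.ReLU())
                self.limb_gate = nn.Linear(feat, 2)           # stage 1: single- vs multi-limb
                self.gru = nn.GRU(feat, feat, batch_first=True)
                self.single_head = nn.Linear(feat, n_single)  # stage 2: sequence classification
                self.multi_head = nn.Linear(feat, n_multi)

            def forward(self, clip):                          # clip: (batch, time, 3, H, W)
                b, t = clip.shape[:2]
                feats = self.backbone(clip.flatten(0, 1)).view(b, t, -1)
                gate = self.limb_gate(feats.mean(dim=1))      # clip-level stage-1 decision
                seq, _ = self.gru(feats)
                last = seq[:, -1]
                return gate, self.single_head(last), self.multi_head(last)

        gate, single, multi = TwoStageRecognizer()(torch.randn(2, 8, 3, 64, 64))
        # Route each clip to the head selected by the stage-1 gate.
        labels = [s.argmax().item() if g.argmax() == 0 else m.argmax().item()
                  for g, s, m in zip(gate, single, multi)]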

    3DFCNN: real-time action recognition using 3D deep neural networks with raw depth information

    This work describes an end-to-end approach for real-time human action recognition from raw depth image sequences. The proposal is based on a 3D fully convolutional neural network, named 3DFCNN, which automatically encodes spatio-temporal patterns from raw depth sequences. The described 3D-CNN allows action classification from the spatially and temporally encoded information of depth sequences. The use of depth data ensures that action recognition is carried out while protecting people's privacy, since their identities cannot be recognized from these data. The proposed 3DFCNN has been optimized to reach good accuracy while working in real time. It has been evaluated and compared with other state-of-the-art systems on three widely used public datasets with different characteristics, demonstrating that 3DFCNN outperforms all the non-DNN-based state-of-the-art methods with a maximum accuracy of 83.6%, and obtains results comparable to the DNN-based approaches while maintaining a much lower computational cost of 1.09 seconds, which significantly increases its applicability in real-world environments.
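    The following PyTorch sketch shows what a 3D fully convolutional classifier over raw depth clips can look like: only Conv3d layers, with a 1x1x1 convolution plus global average pooling in place of dense layers. The channel widths, network depth, clip size, and class count are assumptions and do not reproduce the published 3DFCNN.

        import torch
        import torch.nn as nn

        class Simple3DFCNN(nn.Module):
            """Fully convolutional: no dense layers; a 1x1x1 conv acts as the classifier."""
            def __init__(self, num_classes=12):
                super().__init__()
                self.features = nn.Sequential(
                    nn.Conv3d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
                    nn.Conv3d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
                    nn.Conv3d(32, 64, 3, padding=1), nn.ReLU())
                self.head = nn.Conv3d(64, num_classes, 1)   # 1x1x1 classification conv

            def forward(self, x):
                # x: (batch, 1, frames, height, width) raw depth clip
                y = self.head(self.features(x))
                return y.mean(dim=(2, 3, 4))                # global average pool -> logits

        clip = torch.randn(2, 1, 16, 64, 64)
        print(Simple3DFCNN()(clip).shape)   # torch.Size([2, 12])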

    Deep Learning for Dense Interpretation of Video: Survey of Various Approach, Challenges, Datasets and Metrics

    Video interpretation has garnered considerable attention in the computer vision and natural language processing fields due to the rapid expansion of video data and the increasing demand for applications such as intelligent video search, automated video subtitling, and assistance for visually impaired individuals. However, video interpretation presents greater challenges because it involves both temporal and spatial information. While deep learning models for images, text, and audio have made significant progress, efforts have recently focused on developing deep networks for video interpretation. A thorough evaluation of current research is necessary to provide insights for future endeavors, considering the myriad techniques, datasets, features, and evaluation criteria available in the video domain. This study offers a survey of recent advancements in deep learning for dense video interpretation, addressing various datasets and the challenges they present, as well as key features in video interpretation. Additionally, it provides a comprehensive overview of the latest deep learning models in video interpretation, which have been instrumental in activity identification and video description or captioning. The paper compares the performance of several deep learning models in this field on specific metrics. Finally, the study summarizes future trends and directions in video interpretation.

    Human activity detection and action recognition in videos using convolutional neural networks

    Human activity recognition from video scenes has become a significant area of research in the field of computer vision applications. Action recognition is one of the most challenging problems in video analysis, and it finds applications in human-computer interaction, anomalous activity detection, crowd monitoring, and patient monitoring. Several approaches have been presented for human activity recognition using machine learning techniques. The main aim of this work is to detect and track human activity, and to classify actions, on two publicly available video databases. In this work, a novel approach to feature extraction from video sequences combining the Scale-Invariant Feature Transform (SIFT) and optical flow computation is used, where shape, gradient, and orientation features are also incorporated for robust feature formulation. Tracking of human activity in the video is implemented using a Gaussian Mixture Model. A Convolutional Neural Network based classification approach is used for training and testing on the databases. Activity recognition performance is evaluated on two public datasets, namely the Weizmann dataset and the Kungliga Tekniska Högskolan (KTH) dataset, with action recognition accuracies of 98.43% and 94.96%, respectively. Experimental and comparative studies show that the proposed approach outperforms state-of-the-art techniques.
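    A hedged OpenCV sketch of this kind of classical front end is shown below: a Gaussian Mixture background model isolates the moving person, SIFT supplies shape/gradient/orientation keypoints, and Farnebäck dense optical flow supplies motion cues. The video filename is hypothetical, and the paper's exact parameter choices and its CNN classifier stage are omitted.

        import cv2

        # Hypothetical input clip; the file name is an assumption.
        cap = cv2.VideoCapture("walking.avi")
        bg_model = cv2.createBackgroundSubtractorMOG2()   # Gaussian Mixture background model
        sift = cv2.SIFT_create()
        prev_gray = None

        while True:
            ok, frame = cap.read()
            if not ok:
                break
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

            # GMM foreground mask localizes/tracks the moving person.
            mask = bg_model.apply(frame)

            # SIFT keypoints and descriptors, restricted to the foreground mask,
            # give the shape/gradient/orientation part of the feature vector.
            keypoints, descriptors = sift.detectAndCompute(gray, mask)

            # Dense Farnebäck optical flow between consecutive frames gives motion cues.
            if prev_gray is not None:
                flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                                    0.5, 3, 15, 3, 5, 1.2, 0)
                mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])  # magnitude/direction
            prev_gray = gray

        cap.release()
        # The combined SIFT + flow features would then be fed to the CNN classifier (omitted).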

    Human Action Recognition by Learning Spatio-Temporal Features with Deep Neural Networks

    Human action recognition plays a crucial role in various applications, including video surveillance, human-computer interaction, and activity analysis. This paper presents a study on human action recognition leveraging a CNN-LSTM architecture with an attention model. The proposed approach aims to capture both spatial and temporal information from videos in order to recognize human actions. We utilize the UCF-101 and UCF-50 datasets, which are widely used benchmark datasets for action recognition. The UCF-101 dataset consists of 101 action classes, while the UCF-50 dataset comprises 50 action classes, both encompassing diverse human activities. Our CNN-LSTM model integrates a CNN as the feature extractor to capture spatial information from video frames. Subsequently, the extracted features are fed into an LSTM network to capture temporal dependencies and sequence information. To enhance the discriminative power of the model, an attention model is incorporated to improve the activation patterns and highlight relevant features. Furthermore, the study provides insights into the importance of leveraging both spatial and temporal information for accurate action recognition. The findings highlight the efficacy of the CNN-LSTM architecture with an attention model in capturing meaningful patterns in video sequences and improving action recognition accuracy.
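    As a compact illustration of the CNN-LSTM-with-attention pattern this abstract describes, the PyTorch sketch below scores each LSTM time step with a learned attention weight and classifies the attention-weighted context vector. The tiny CNN, hidden size, and additive attention form are assumptions, not the authors' exact model.

        import torch
        import torch.nn as nn

        class CNNLSTMAttention(nn.Module):
            """CNN per frame -> LSTM over time -> attention-weighted pooling."""
            def __init__(self, num_classes=101, hidden=128):
                super().__init__()
                self.cnn = nn.Sequential(
                    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1), nn.Flatten())
                self.lstm = nn.LSTM(32, hidden, batch_first=True)
                self.attn = nn.Linear(hidden, 1)       # scores each time step
                self.classifier = nn.Linear(hidden, num_classes)

            def forward(self, clip):                   # clip: (batch, time, 3, H, W)
                b, t = clip.shape[:2]
                feats = self.cnn(clip.flatten(0, 1)).view(b, t, -1)
                seq, _ = self.lstm(feats)              # (batch, time, hidden)
                weights = torch.softmax(self.attn(seq), dim=1)   # attention over time
                context = (weights * seq).sum(dim=1)   # weighted sum of hidden states
                return self.classifier(context)

        logits = CNNLSTMAttention()(torch.randn(2, 8, 3, 64, 64))   # shape (2, 101)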