
    Human action recognition using fusion of depth and inertial sensors

    In this paper, we present a human action recognition system that fuses depth and inertial sensor measurements. Robust, subject-invariant depth and inertial signal features are used to train independent neural networks, and decision-level fusion is then applied within a probabilistic framework in the form of a Logarithmic Opinion Pool. The system is evaluated on the UTD Multimodal Human Action Dataset (UTD-MHAD), where it achieves 95% accuracy under 8-fold cross-validation; this is not only higher than the accuracy of each sensor used separately, but also exceeds the best previously reported accuracy on this dataset by 3.5%.
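    The fusion step described in the abstract can be illustrated with a short sketch. This is a minimal example of a Logarithmic Opinion Pool, assuming each network outputs a per-class probability vector; the probability values and the equal weights below are illustrative placeholders, not figures from the paper.

        # Minimal sketch of decision-level fusion with a Logarithmic Opinion Pool (LOP).
        # The probability vectors and weights are illustrative placeholders.
        import numpy as np

        def logarithmic_opinion_pool(probs, weights):
            """Fuse per-class probability vectors via a weighted geometric mean."""
            probs = np.asarray(probs, dtype=float)        # shape: (n_experts, n_classes)
            weights = np.asarray(weights, dtype=float)
            log_fused = weights @ np.log(probs + 1e-12)   # weighted sum of log-probabilities
            fused = np.exp(log_fused)
            return fused / fused.sum()                    # renormalize to a distribution

        # Example: softmax outputs of the depth and inertial networks for 4 action classes.
        p_depth    = [0.70, 0.15, 0.10, 0.05]
        p_inertial = [0.40, 0.35, 0.15, 0.10]
        fused = logarithmic_opinion_pool([p_depth, p_inertial], weights=[0.5, 0.5])
        predicted_class = int(np.argmax(fused))

    Unlike a simple average, the logarithmic pool multiplies the experts' opinions, so a class that either network considers very unlikely is strongly suppressed in the fused score.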

    A Novel Two Stream Decision Level Fusion of Vision and Inertial Sensors Data for Automatic Multimodal Human Activity Recognition System

    This paper presents a novel multimodal human activity recognition system based on a two-stream, decision-level fusion of vision and inertial sensor data. In the first stream, raw RGB frames are passed to a part-affinity-field-based pose estimation network to detect the user's keypoints. These keypoints are pre-processed and fed, in a sliding-window fashion, to a specially designed convolutional neural network for spatial feature extraction, followed by regularized LSTMs that compute the temporal features. The outputs of the LSTM networks are then passed to fully connected layers for classification. In the second stream, data from the inertial sensors are pre-processed and fed to regularized LSTMs for feature extraction, followed by fully connected layers for classification. Finally, the softmax scores of the two streams are combined by decision-level fusion, which yields the final prediction. Extensive experiments are conducted on four standard multimodal benchmark datasets (UP-Fall Detection, UTD-MHAD, Berkeley-MHAD, and C-MHAD). The proposed system achieves accuracies of 96.9%, 97.6%, 98.7%, and 95.9% on the UP-Fall Detection, UTD-MHAD, Berkeley-MHAD, and C-MHAD datasets, respectively. These results are far superior to the current state-of-the-art methods.
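    The two-stream pipeline described above can be outlined as follows. This is a minimal, hypothetical PyTorch sketch assuming pre-processed keypoint and inertial windows; the layer sizes, window length, keypoint count, and equal-weight score fusion are assumptions for illustration, not the paper's exact configuration.

        # Hypothetical two-stream sketch: a vision stream over pose keypoints and an
        # inertial stream over IMU windows, fused at the softmax-score level.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        NUM_CLASSES = 27      # e.g. UTD-MHAD defines 27 actions
        WINDOW = 30           # sliding-window length (assumed)
        NUM_KEYPOINTS = 18    # 2D keypoints from the pose-estimation network (assumed)
        IMU_CHANNELS = 6      # 3-axis accelerometer + 3-axis gyroscope

        class VisionStream(nn.Module):
            """CNN over keypoint windows, then an LSTM and a classifier head."""
            def __init__(self):
                super().__init__()
                self.conv = nn.Sequential(
                    nn.Conv1d(NUM_KEYPOINTS * 2, 64, kernel_size=3, padding=1),
                    nn.ReLU(),
                    nn.Conv1d(64, 64, kernel_size=3, padding=1),
                    nn.ReLU(),
                )
                self.lstm = nn.LSTM(64, 128, batch_first=True)
                self.fc = nn.Linear(128, NUM_CLASSES)

            def forward(self, x):                 # x: (batch, WINDOW, NUM_KEYPOINTS * 2)
                h = self.conv(x.transpose(1, 2))  # -> (batch, 64, WINDOW)
                h, _ = self.lstm(h.transpose(1, 2))
                return self.fc(h[:, -1])          # logits from the last time step

        class InertialStream(nn.Module):
            """LSTM over raw inertial windows, then a classifier head."""
            def __init__(self):
                super().__init__()
                self.lstm = nn.LSTM(IMU_CHANNELS, 128, batch_first=True)
                self.fc = nn.Linear(128, NUM_CLASSES)

            def forward(self, x):                 # x: (batch, WINDOW, IMU_CHANNELS)
                h, _ = self.lstm(x)
                return self.fc(h[:, -1])

        def fuse_scores(logits_vision, logits_inertial, w_vision=0.5):
            """Decision-level fusion: weighted average of the two softmax score vectors."""
            scores = (w_vision * F.softmax(logits_vision, dim=-1)
                      + (1.0 - w_vision) * F.softmax(logits_inertial, dim=-1))
            return scores.argmax(dim=-1)

        # Example with random tensors standing in for one pre-processed sample.
        vision, inertial = VisionStream(), InertialStream()
        kp_window = torch.randn(1, WINDOW, NUM_KEYPOINTS * 2)
        imu_window = torch.randn(1, WINDOW, IMU_CHANNELS)
        prediction = fuse_scores(vision(kp_window), inertial(imu_window))

    Fusing at the score level keeps the two streams independent, so either stream can be retrained or replaced without modifying the other.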