Human Action Recognition using Multi-Kernel Learning for Temporal Residual Network

Abstract

This paper has been presented at the 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications.Deep learning has led to a series of breakthrough in the human action recognition field. Given the powerful representational ability of residual networks (ResNet), performance in many computer vision tasks including human action recognition has improved. Motivated by the success of ResNet, we use the residual network and its variations to obtain feature representation. Bearing in mind the importance of appearance and motion information for action representation, our network utilizes both for feature extraction. Appearance and motion features are further fused for action classification using a multi-kernel support vector machine (SVM).We also investigate the fusion of dense trajectories with the proposed network to boost up the network performance. We evaluate our proposed methods on a benchmark dataset (HMDB-51) and results shows the multi-kernel learning shows the better performance than the fusion of classification score from deep network SoftMax layer. Our proposed method also shows good performance as compared to the recent state-of-the-art methods.Sergio A. Velastin has received funding from the Universidad Carlos III de Madrid, the European Unions Seventh Framework Programme for research, technological development and demonstration under grant agreement n◦ 600371, el Ministerio de Economía, Industria y Competitividad (COFUND2013-51509) el Ministerio de Educación, cultura y Deporte (CEI-15-17) and Banco Santander. Authors also acknowledge support from the Higher Education Commission, Pakistan

    Similar works