47,100 research outputs found

    Multi-View Region Adaptive Multi-temporal DMM and RGB Action Recognition

    Get PDF
    Human action recognition remains an important yet challenging task. This work proposes a novel action recognition system. It uses a novel Multiple View Region Adaptive Multi-resolution in time Depth Motion Map (MV-RAMDMM) formulation combined with appearance information. Multiple stream 3D Convolutional Neural Networks (CNNs) are trained on the different views and time resolutions of the region adaptive Depth Motion Maps. Multiple views are synthesised to enhance the view invariance. The region adaptive weights, based on localised motion, accentuate and differentiate parts of actions possessing faster motion. Dedicated 3D CNN streams for multi-time resolution appearance information (RGB) are also included. These help to identify and differentiate between small object interactions. A pre-trained 3D-CNN is used here with fine-tuning for each stream along with multiple class Support Vector Machines (SVM)s. Average score fusion is used on the output. The developed approach is capable of recognising both human action and human-object interaction. Three public domain datasets including: MSR 3D Action,Northwestern UCLA multi-view actions and MSR 3D daily activity are used to evaluate the proposed solution. The experimental results demonstrate the robustness of this approach compared with state-of-the-art algorithms.Comment: 14 pages, 6 figures, 13 tables. Submitte

    Structure fusion based on graph convolutional networks for semi-supervised classification

    Full text link
    Suffering from the multi-view data diversity and complexity for semi-supervised classification, most of existing graph convolutional networks focus on the networks architecture construction or the salient graph structure preservation, and ignore the the complete graph structure for semi-supervised classification contribution. To mine the more complete distribution structure from multi-view data with the consideration of the specificity and the commonality, we propose structure fusion based on graph convolutional networks (SF-GCN) for improving the performance of semi-supervised classification. SF-GCN can not only retain the special characteristic of each view data by spectral embedding, but also capture the common style of multi-view data by distance metric between multi-graph structures. Suppose the linear relationship between multi-graph structures, we can construct the optimization function of structure fusion model by balancing the specificity loss and the commonality loss. By solving this function, we can simultaneously obtain the fusion spectral embedding from the multi-view data and the fusion structure as adjacent matrix to input graph convolutional networks for semi-supervised classification. Experiments demonstrate that the performance of SF-GCN outperforms that of the state of the arts on three challenging datasets, which are Cora,Citeseer and Pubmed in citation networks
    • …
    corecore