2 research outputs found

    Three-stream 3D/1D CNN for fine-grained action classification and segmentation in table tennis

    This paper proposes a method for fusing modalities extracted from video through a three-stream network with spatio-temporal and temporal convolutions for fine-grained action classification in sport. It is applied to the TTStroke-21 dataset, which consists of untrimmed videos of table tennis games. The goal is to detect and classify table tennis strokes in the videos, the first step of a bigger scheme aiming at giving feedback to players to improve their performance. The three modalities are raw RGB data, the computed optical flow, and the estimated pose of the player. The network consists of three branches with attention blocks. Features are fused at the last stage of the network using bilinear layers. Compared to previous approaches, the use of three modalities allows faster convergence and better performance on both tasks: classification of strokes with known temporal boundaries, and joint segmentation and classification. The pose is also further investigated in order to offer richer feedback to the athletes.
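    As a rough illustration of late fusion with bilinear layers, the sketch below combines three per-stream feature vectors in plain Python. The fusion order (RGB with flow first, then the result with pose), the function names, and the parameter layout are assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
def bilinear(x1, x2, weight, bias):
    """Bilinear layer: y[k] = sum_i sum_j x1[i] * weight[k][i][j] * x2[j] + bias[k]."""
    return [
        sum(
            x1[i] * weight[k][i][j] * x2[j]
            for i in range(len(x1))
            for j in range(len(x2))
        )
        + bias[k]
        for k in range(len(bias))
    ]


def fuse_three_streams(rgb, flow, pose, params):
    """Fuse three feature vectors with two stacked bilinear layers.

    Assumption: RGB and flow are fused first, then the result is fused
    with pose. `params` is a hypothetical dict of weights and biases.
    """
    rgb_flow = bilinear(rgb, flow, params["w_rgb_flow"], params["b_rgb_flow"])
    return bilinear(rgb_flow, pose, params["w_all"], params["b_all"])


# Toy parameters chosen by hand so the arithmetic is easy to follow.
toy_params = {
    "w_rgb_flow": [[[1, 0], [0, 1]], [[0, 1], [1, 0]]],
    "b_rgb_flow": [0, 0],
    "w_all": [[[1, 0], [0, 1]]],
    "b_all": [1],
}
fused = fuse_three_streams([1, 2], [3, 4], [1, 1], toy_params)
```

    In a real network the inputs would be the learned features of each branch and the weights would be trained; the toy values here only show how a bilinear layer mixes two inputs multiplicatively rather than by concatenation.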

    3D Convolutional Networks for Action Recognition: Application to Sport Gesture Recognition

    3D convolutional networks are a good means of performing tasks such as segmenting video into coherent spatio-temporal chunks and classifying them with regard to a target taxonomy. In this chapter we are interested in the classification of continuous video takes with repeatable actions, such as table tennis strokes. Filmed in a free, markerless, ecological environment, these videos represent a challenge from both the segmentation and classification points of view. 3D convnets are an efficient tool for solving these problems with window-based approaches.
    Comment: Multi-faceted Deep Learning, 202
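    A window-based approach of the kind mentioned above can be sketched as follows: slide a fixed-length window over the frame indices, classify each window, and merge consecutive windows that share a label into segments. The `classify` callback stands in for a 3D convnet and is hypothetical, as are the function names and the merging rule; the chapter's actual procedure may differ.

```python
def sliding_windows(num_frames, window, stride):
    """Yield (start, end) frame ranges with end exclusive."""
    for start in range(0, num_frames - window + 1, stride):
        yield start, start + window


def segment_video(num_frames, window, stride, classify):
    """Label each window, then merge consecutive windows with the same label.

    `classify(start, end)` is a hypothetical stand-in for a trained model
    that returns a class label for the frames in [start, end).
    """
    segments = []
    for start, end in sliding_windows(num_frames, window, stride):
        label = classify(start, end)
        if segments and segments[-1][2] == label:
            # Same label as the previous window: extend that segment.
            segments[-1] = (segments[-1][0], end, label)
        else:
            segments.append((start, end, label))
    return segments


# Toy classifier: windows starting in frames 4..7 count as a stroke.
def toy_classifier(start, end):
    return "stroke" if 4 <= start < 8 else "none"


segments = segment_video(12, window=4, stride=4, classify=toy_classifier)
```

    With a stride smaller than the window length, neighbouring windows overlap, so merged segments with different labels can overlap at their boundaries; resolving those overlaps (for example by voting per frame) is left out of this sketch.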