Gait emotion recognition plays a crucial role in intelligent systems. Most
of the existing methods recognize emotions by focusing on local actions over
time. However, they overlook two facts: the effective distance in the time
domain differs from one emotion to another, and the local actions during
walking are quite similar. Thus, emotions should be represented by global
states instead of indirect local actions. To address these issues, we present a
novel Multi Scale Adaptive Graph Convolution Network (MSA-GCN), which
recognizes emotions by constructing dynamic temporal receptive fields and
aggregating multiscale information. In our model, an adaptive
selective spatial-temporal graph convolution is designed to select the
convolution kernel dynamically to obtain the soft spatio-temporal features of
different emotions. Moreover, a Cross-Scale mapping Fusion Mechanism (CSFM) is
designed to construct an adaptive adjacency matrix to enhance information
interaction and reduce redundancy. Compared with previous state-of-the-art
methods, the proposed method achieves the best performance on two public
datasets, improving the mAP by 2\%. We also conduct extensive ablation studies
to show the effectiveness of the different components of our method.
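
The abstract does not specify how the convolution kernel is selected, so the following is only a minimal NumPy sketch of one common way to realize a dynamic temporal receptive field: several temporal branches with different kernel sizes are computed, and a per-channel softmax decides how much each branch contributes. The function names, the averaging kernels, and the selection weights `w_select` are assumptions made for illustration and are not taken from MSA-GCN.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def temporal_conv(x, kernel):
    # Zero-padded "same" 1D convolution along the time axis.
    # x: (C, T, V) = channels, frames, joints; kernel: (k,) with odd k.
    k = kernel.shape[0]
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (0, 0)))
    frames = x.shape[1]
    out = np.zeros_like(x)
    for i in range(k):
        out += kernel[i] * xp[:, i:i + frames, :]
    return out

def selective_temporal_fusion(x, kernel_sizes, w_select):
    # Hypothetical selection step (an assumption, not the paper's design).
    # w_select: (B, C, C), one linear map per branch, B = len(kernel_sizes).
    # Each branch uses a simple averaging kernel as a stand-in for a
    # learned temporal filter.
    branches = np.stack(
        [temporal_conv(x, np.full(k, 1.0 / k)) for k in kernel_sizes]
    )  # (B, C, T, V)
    # Global descriptor: sum the branches, then average over time and joints.
    summary = branches.sum(axis=0).mean(axis=(1, 2))  # (C,)
    # Per-branch, per-channel scores, normalized across branches.
    attn = softmax(w_select @ summary, axis=0)  # (B, C)
    # Soft selection: attention-weighted sum of the branches.
    fused = (attn[:, :, None, None] * branches).sum(axis=0)  # (C, T, V)
    return fused, attn

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 16, 5))        # 4 channels, 16 frames, 5 joints
w_select = rng.standard_normal((3, 4, 4))  # one map per kernel size
fused, attn = selective_temporal_fusion(x, (3, 5, 7), w_select)
```

Because the branch weights sum to one for every channel, the fused feature is a convex combination of short-range and long-range temporal responses, which is one way to read the "soft" spatio-temporal features mentioned above.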