Pose-Based Two-Stream Relational Networks for Action Recognition in Videos
Pose-based action recognition has recently attracted growing attention because
it performs better than traditional appearance-based methods. However, two
problems remain to be solved. First,
existing pose-based methods generally recognize human actions from captured 3D
human poses, which are very difficult to obtain in real scenarios. Second, few
pose-based methods model the action-related objects when recognizing
human-object interaction actions, in which objects play an important role. To
address these problems, we propose a pose-based two-stream relational network (PSRN)
for action recognition. In PSRN, one stream models the temporal dynamics of the
targeted 2D human pose sequences, which are extracted directly from raw videos,
and the other stream models the action-related objects from a randomly sampled
video frame. Most importantly, instead of fusing the two streams at the class-score
layer as in previous work, we propose a pose-object relational network to model the
relationship between human poses and action-related objects. We evaluate the
proposed PSRN on two challenging benchmarks, i.e., Sub-JHMDB and PennAction.
Experimental results show that PSRN achieves state-of-the-art
performance on Sub-JHMDB (80.2%) and PennAction (98.1%). Our work opens a new
door to action recognition by combining 2D human poses extracted from raw video
with image appearance.
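
The idea of relating pose features to object features, rather than averaging two class-score vectors, can be illustrated with a minimal sketch. The abstract does not specify the PSRN architecture, so the code below assumes the generic relation-network pattern: a shared function g scores each (pose, object) pair, the pair outputs are summed, and a function f maps the sum to class scores. All names, dimensions, and the randomly initialized (untrained) weights are hypothetical and serve only to show the data flow.

```python
import math
import random


def linear(x, w, b):
    """Apply y = W x + b to a single vector x (plain lists of floats)."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(x):
    m = max(x)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in x]
    s = sum(e)
    return [v / s for v in e]


def init_layer(rng, n_in, n_out):
    """Hypothetical random initialization; a real model would learn these."""
    w = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    b = [0.0] * n_out
    return w, b


def relational_fusion(pose_feat, object_feats, g_params, f_params):
    """Relate one pose feature to every object feature, then classify.

    Assumed pattern (not confirmed by the abstract):
        scores = softmax(f(sum_j relu(g([pose, object_j]))))
    """
    g_w, g_b = g_params
    f_w, f_b = f_params
    pooled = [0.0] * len(g_b)
    for obj in object_feats:
        pair = pose_feat + obj  # concatenate pose and object features
        rel = relu(linear(pair, g_w, g_b))
        pooled = [p + r for p, r in zip(pooled, rel)]  # order-independent sum
    return softmax(linear(pooled, f_w, f_b))


# Toy inputs standing in for the outputs of the two streams (assumed sizes).
rng = random.Random(0)
POSE_DIM, OBJ_DIM, HIDDEN, NUM_CLASSES = 8, 6, 16, 5
pose_feat = [rng.random() for _ in range(POSE_DIM)]
object_feats = [[rng.random() for _ in range(OBJ_DIM)] for _ in range(3)]
g_params = init_layer(rng, POSE_DIM + OBJ_DIM, HIDDEN)
f_params = init_layer(rng, HIDDEN, NUM_CLASSES)

scores = relational_fusion(pose_feat, object_feats, g_params, f_params)
```

Because the pairwise outputs are summed before classification, the result does not depend on the order in which objects are listed, which is one reason this pattern suits a variable set of detected objects. Late fusion at the class-score layer, by contrast, would combine the two streams only after each had already produced its own prediction.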