Temporal Action Detection by Joint Identification-Verification
Temporal action detection aims not only to recognize the action category but
also to detect the start and end times of each action instance in an untrimmed
video. The key challenge of this task is to accurately classify the action and
determine the temporal boundaries of each action instance. In the temporal action
detection benchmark THUMOS 2014, large variations exist within the same action
category while many similarities exist across different action categories, which
limits the performance of temporal action detection. To address this
problem, we propose a joint Identification-Verification network that reduces
intra-action variations and enlarges inter-action differences. The joint
Identification-Verification network is a Siamese network based on 3D ConvNets,
which can simultaneously predict the action categories and the similarity
scores for the input pairs of video proposal segments. Extensive experimental
results on the challenging THUMOS 2014 dataset demonstrate the effectiveness of
our proposed method compared to existing state-of-the-art methods for temporal
action detection in untrimmed videos.
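
To make the joint objective concrete, the following is a minimal sketch of how an identification term and a verification term could be combined for one pair of proposal segments. The abstract does not give the loss formulation, so everything here is an assumption for illustration: the contrastive form of the verification term, the `margin` and `weight` parameters, and all function names are hypothetical. The plain Python lists stand in for the class logits and embedding features that a shared 3D ConvNet branch would produce.

```python
import math

def softmax_cross_entropy(logits, label):
    """Identification term: cross-entropy of one segment's class logits."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def contrastive_loss(feat_a, feat_b, same_action, margin=1.0):
    """Verification term. Assumed contrastive form, not taken from the paper."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(feat_a, feat_b)))
    if same_action:
        # Same action: penalize distance, reducing intra-action variation.
        return 0.5 * dist ** 2
    # Different actions: penalize pairs closer than the (hypothetical) margin.
    return 0.5 * max(0.0, margin - dist) ** 2

def joint_loss(logits_a, logits_b, label_a, label_b,
               feat_a, feat_b, weight=1.0, margin=1.0):
    """Sum of both segments' identification losses plus a weighted verification loss."""
    ident = (softmax_cross_entropy(logits_a, label_a)
             + softmax_cross_entropy(logits_b, label_b))
    verif = contrastive_loss(feat_a, feat_b, label_a == label_b, margin)
    return ident + weight * verif
```

In this sketch, a pair of segments with the same label is pulled together in feature space, while a pair with different labels is pushed apart until it clears the margin; both segments are still classified independently by the identification term.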