Non-local Recurrent Neural Memory for Supervised Sequence Modeling
Typical methods for supervised sequence modeling are built upon recurrent
neural networks to capture temporal dependencies. One potential limitation of
these methods is that they only explicitly model information interactions
between adjacent time steps in a sequence, so the high-order interactions
between non-adjacent time steps are not fully exploited. This greatly limits
their capability to model long-range temporal dependencies, since first-order
interactions cannot be maintained over long time spans due to information
dilution and gradient vanishing. To tackle this limitation, we propose the Non-local
Recurrent Neural Memory (NRNM) for supervised sequence modeling, which performs
non-local operations to learn full-order interactions within a sliding temporal
block and models global interactions between blocks in a gated recurrent
manner. Consequently, our model is able to capture long-range dependencies.
Moreover, our model can distill the latent high-level features contained in
high-order interactions. We demonstrate the merits of our NRNM on two
different tasks: action recognition and sentiment analysis.
Comment: Accepted by ICCV 2019, Oral
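The core idea — full-order interactions within a sliding temporal block, plus a gated recurrent update between blocks — can be sketched in a few lines. This is a minimal NumPy sketch, not the paper's exact formulation: `nonlocal_block`, `gated_memory_update`, the block length, and the sigmoid gate are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def nonlocal_block(feats):
    # feats: (T, d) features within one sliding temporal block.
    # Full-order interactions via self-attention: every time step in the
    # block attends to every other step, not just its neighbors.
    attn = softmax(feats @ feats.T / np.sqrt(feats.shape[1]))
    return attn @ feats  # (T, d) refined features

def gated_memory_update(memory, block_summary):
    # Gated recurrent update between blocks (a GRU-style gate, used here
    # as a simplification of the paper's memory mechanism).
    gate = 1.0 / (1.0 + np.exp(-(memory + block_summary)))  # sigmoid
    return gate * block_summary + (1.0 - gate) * memory

rng = np.random.default_rng(0)
seq = rng.standard_normal((12, 8))   # 12 time steps, 8-dim features
memory = np.zeros(8)
for start in range(0, 12, 4):        # sliding blocks of length 4
    block = nonlocal_block(seq[start:start + 4])
    memory = gated_memory_update(memory, block.mean(axis=0))
print(memory.shape)  # (8,)
```

Because the memory is updated once per block rather than once per time step, information from early blocks reaches later ones through far fewer recurrent transitions, which is what mitigates dilution over long ranges.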
Temporal Graph Modeling for Skeleton-based Action Recognition
Graph Convolutional Networks (GCNs), which model skeleton data as graphs,
have obtained remarkable performance for skeleton-based action recognition.
Particularly, the temporal dynamics of a skeleton sequence convey significant
information for the recognition task. For temporal dynamic modeling, GCN-based
methods merely stack multiple layers of 1D local convolutions to extract temporal
relations between adjacent time steps. As many local convolutions are
repeated, key temporal information between non-adjacent time steps
may be lost due to information dilution. It therefore remains unclear how these
methods can fully explore the temporal dynamics of a skeleton sequence. In
this paper, we propose a Temporal Enhanced Graph Convolutional Network (TE-GCN)
to tackle this limitation. The proposed TE-GCN constructs a temporal relation
graph to capture complex temporal dynamics. Specifically, the constructed
temporal relation graph explicitly builds connections between semantically
related temporal features to model temporal relations between both adjacent and
non-adjacent time steps. Meanwhile, to further explore the temporal
dynamics, a multi-head mechanism is designed to investigate multiple kinds of
temporal relations. Extensive experiments are performed on two widely used
large-scale datasets, NTU-60 RGB+D and NTU-120 RGB+D. Experimental results
show that the proposed model achieves state-of-the-art performance,
demonstrating the contribution of temporal modeling to action recognition.
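A temporal relation graph built from feature similarity, with multiple heads, can be sketched as follows. This is a hedged NumPy illustration, assuming the graph is formed from pairwise similarity of projected per-frame features; `temporal_relation_graph`, the random projections, and the head count are hypothetical, not TE-GCN's actual parameterization.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def temporal_relation_graph(feats, n_heads=2, rng=None):
    # feats: (T, d) per-frame skeleton features.
    # Each head projects the features and links semantically similar
    # time steps, adjacent or not, via a dense (T, T) adjacency matrix.
    rng = rng or np.random.default_rng(0)
    T, d = feats.shape
    outputs = []
    for _ in range(n_heads):
        W = rng.standard_normal((d, d)) / np.sqrt(d)  # hypothetical projection
        proj = feats @ W
        adj = softmax(proj @ proj.T)  # temporal relation graph of this head
        outputs.append(adj @ proj)    # aggregate along learned relations
    # Heads are concatenated so each captures a different kind of relation.
    return np.concatenate(outputs, axis=-1)  # (T, n_heads * d)

feats = np.random.default_rng(1).standard_normal((6, 4))
out = temporal_relation_graph(feats)
print(out.shape)  # (6, 8)
```

Unlike a stack of 1D local convolutions, whose receptive field grows only linearly with depth, a single pass over this dense adjacency lets any two time steps exchange information directly.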