
    Optical Flow Guided Feature: A Fast and Robust Motion Representation for Video Action Recognition

    Motion representation plays a vital role in human action recognition in videos. In this study, we introduce a novel compact motion representation for video action recognition, named Optical Flow guided Feature (OFF), which enables the network to distill temporal information through a fast and robust approach. The OFF is derived from the definition of optical flow and is orthogonal to the optical flow. The derivation also provides theoretical support for using the difference between two frames. By directly calculating pixel-wise spatiotemporal gradients of the deep feature maps, the OFF can be embedded in any existing CNN-based video action recognition framework at only a slight additional cost. It enables the CNN to extract spatial and temporal information simultaneously, in particular the temporal information between frames. This simple but powerful idea is validated by experimental results. Fed only RGB inputs, the network with OFF achieves a competitive accuracy of 93.3% on UCF-101, comparable to the result obtained with two streams (RGB and optical flow) but 15 times faster. Experimental results also show that OFF is complementary to other motion modalities such as optical flow. When plugged into a state-of-the-art video action recognition framework, the proposed method achieves 96.0% and 74.2% accuracy on UCF-101 and HMDB-51, respectively. The code for this project is available at https://github.com/kevin-ssy/Optical-Flow-Guided-Feature.
    Comment: CVPR 2018.
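    The core idea, as described, is to stack the pixel-wise spatial and temporal gradients of a deep feature map. A minimal NumPy sketch of that idea follows; the function name, plain finite differences, and single-channel feature maps are illustrative simplifications, not the authors' implementation.

    ```python
    import numpy as np

    def optical_flow_guided_feature(f_t, f_t1):
        """Sketch of an OFF-style descriptor: pixel-wise spatial and
        temporal gradients of a feature map (hypothetical simplification).

        f_t, f_t1: single-channel feature maps of shape (H, W) taken
        from frames t and t+1.  Returns (H, W, 3): [df/dx, df/dy, df/dt].
        """
        # Spatial gradients of the frame-t feature map via central
        # differences; np.gradient returns axis-0 (y) then axis-1 (x).
        dfdy, dfdx = np.gradient(f_t)
        # Temporal gradient: the difference between the two frames'
        # features -- the term that justifies simple frame differencing.
        dfdt = f_t1 - f_t
        return np.stack([dfdx, dfdy, dfdt], axis=-1)

    # Toy usage with random stand-ins for deep feature maps.
    rng = np.random.default_rng(0)
    f_t = rng.standard_normal((8, 8))
    f_t1 = rng.standard_normal((8, 8))
    off = optical_flow_guided_feature(f_t, f_t1)
    print(off.shape)  # (8, 8, 3)
    ```

    In the paper's setting these gradients are computed on CNN feature maps inside the network (so they are cheap and differentiable), not on raw pixels as in classical optical flow.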

    V2V Routing in VANET Based on Heuristic Q-Learning

    Designing efficient routing algorithms in vehicular ad hoc networks (VANETs) plays an important role in emerging intelligent transportation systems. In this paper, a routing algorithm based on improved Q-learning is proposed for vehicle-to-vehicle (V2V) communications in VANETs. First, a link maintenance time model is established, and the maintenance time is taken as an important parameter in the design of the routing algorithm to ensure the reliability of each hop. To address the low efficiency and slow convergence of Q-learning, a heuristic function and an evaluation function are introduced to accelerate the update of the Q-value of the current optimal action, reduce unnecessary exploration, speed up the convergence of the Q-learning process, and improve learning efficiency. In the new routing algorithm, the learning task is distributed across the vehicle nodes: each node maintains reliable routing paths by periodically exchanging beacon information with surrounding nodes and guides its forwarding actions using the delay information between nodes to improve the efficiency of data forwarding. The performance of the algorithm is evaluated with the NS2 simulator. The results show that the algorithm performs well in terms of packet delivery rate and end-to-end delay.
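    The acceleration idea (a heuristic function biasing action selection so the current best action's Q-value is updated more often) can be sketched on a toy chain of nodes, where reaching the last node stands in for delivering a packet to the destination. Everything here is illustrative: the chain topology, the reward, and the "prefer the hop toward the destination" heuristic stand in for the paper's link-maintenance-time and delay information.

    ```python
    import random

    def heuristic_q_route(n_states=5, episodes=300, alpha=0.5, gamma=0.9,
                          epsilon=0.1, xi=1.0, seed=0):
        """Toy heuristically accelerated Q-learning on a chain of nodes
        0..n_states-1; reaching the last node yields reward 1."""
        random.seed(seed)
        A = (-1, +1)                           # hop to previous / next node
        Q = [[0.0, 0.0] for _ in range(n_states)]

        def heuristic(s, a):
            # Hypothetical domain knowledge: bias the greedy choice toward
            # the destination, mirroring how link-lifetime and delay
            # information guide forwarding in the paper's setting.
            return xi if A[a] == +1 else 0.0

        for _ in range(episodes):
            s = 0
            while s != n_states - 1:
                if random.random() < epsilon:
                    a = random.randrange(2)    # occasional exploration
                else:
                    # Heuristic-guided greedy selection: argmax of Q + H,
                    # which concentrates updates on the promising action.
                    a = max(range(2), key=lambda i: Q[s][i] + heuristic(s, i))
                s2 = min(max(s + A[a], 0), n_states - 1)
                r = 1.0 if s2 == n_states - 1 else 0.0
                # Standard Q-learning update.
                Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
                s = s2
        return Q

    Q = heuristic_q_route()
    # "Forward" (index 1) should dominate at every intermediate node.
    print(all(q[1] > q[0] for q in Q[:-1]))
    ```

    In the actual algorithm the state is distributed (each vehicle keeps Q-values for its neighbors, refreshed via beacons) rather than held in one table, but the update rule and the heuristic bias have the same shape.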

    I²MD: 3D Action Representation Learning with Inter- and Intra-modal Mutual Distillation

    Recent progress on self-supervised 3D human action representation learning is largely attributed to contrastive learning. However, in conventional contrastive frameworks, the rich complementarity between different skeleton modalities remains under-explored. Moreover, when optimized to distinguish self-augmented samples, models struggle with the numerous similar positive instances that arise when action categories are limited. In this work, we tackle these problems by introducing a general Inter- and Intra-modal Mutual Distillation (I²MD) framework. In I²MD, we first re-formulate the cross-modal interaction as a Cross-modal Mutual Distillation (CMD) process. Unlike existing distillation solutions that transfer the knowledge of a pre-trained, fixed teacher to a student, in CMD the knowledge is continuously updated and bidirectionally distilled between modalities during pre-training. To alleviate the interference of similar samples and exploit their underlying contexts, we further design the Intra-modal Mutual Distillation (IMD) strategy. In IMD, the Dynamic Neighbors Aggregation (DNA) mechanism is first introduced, where an additional cluster-level discrimination branch is instantiated in each modality. It adaptively aggregates highly correlated neighboring features, forming local cluster-level contrasting. Mutual distillation is then performed between the two branches for cross-level knowledge exchange. Extensive experiments on three datasets show that our approach sets a series of new records.
    Comment: submitted to IJCV. arXiv admin note: substantial text overlap with arXiv:2208.1244
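    The bidirectional ("mutual") distillation can be sketched as a symmetric KL term between the two modalities' similarity distributions over a shared set of neighbor anchors. This NumPy sketch only illustrates the loss shape; the function names, the anchor construction, and the temperature are assumptions, and in real training each KL direction would stop gradients on its teacher side.

    ```python
    import numpy as np

    def softmax(x, tau=0.1):
        # Temperature-scaled softmax over the last axis, numerically stable.
        z = x / tau
        z -= z.max(axis=-1, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)

    def mutual_distillation_loss(feat_a, feat_b, anchors_a, anchors_b, tau=0.1):
        """Sketch of bidirectional distillation between two skeleton
        modalities: each modality's similarity distribution over shared
        neighbor anchors supervises the other (names are illustrative)."""
        p_a = softmax(feat_a @ anchors_a.T, tau)   # modality A's distribution
        p_b = softmax(feat_b @ anchors_b.T, tau)   # modality B's distribution
        # KL(p_b || p_a): B teaches A; the reverse term below has A teach B.
        kl_ab = np.sum(p_b * (np.log(p_b + 1e-12) - np.log(p_a + 1e-12)), axis=-1)
        kl_ba = np.sum(p_a * (np.log(p_a + 1e-12) - np.log(p_b + 1e-12)), axis=-1)
        return float(np.mean(kl_ab + kl_ba))

    # Toy usage: 4 samples, 16-dim features, 8 shared anchors per modality.
    rng = np.random.default_rng(0)
    fa, fb = rng.standard_normal((4, 16)), rng.standard_normal((4, 16))
    anc = rng.standard_normal((8, 16))
    loss = mutual_distillation_loss(fa, fb, anc, anc)
    print(loss > 0.0)  # disagreeing modalities give a positive loss
    ```

    The IMD variant applies the same exchange between an instance-level branch and a cluster-level branch within one modality, with DNA supplying the cluster-level targets.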