Search CORE

462 research outputs found

SeqTrack: Sequence to Sequence Learning for Visual Object Tracking

Authors: Chen Xin, Hu Han, Lu Huchuan, Peng Houwen, Wang Dong
Publication date: 17/08/2023

In this paper, we present a new sequence-to-sequence learning framework for visual tracking, dubbed SeqTrack. It casts visual tracking as a sequence generation problem, predicting object bounding boxes in an autoregressive fashion. This differs from prior Siamese trackers and transformer trackers, which rely on complicated head networks such as classification and regression heads. SeqTrack adopts only a simple encoder-decoder transformer architecture. The encoder extracts visual features with a bidirectional transformer, while the decoder generates a sequence of bounding box values autoregressively with a causal transformer. The loss function is a plain cross-entropy. Such a sequence learning paradigm not only simplifies the tracking framework but also achieves competitive performance on benchmarks. For instance, SeqTrack reaches 72.5% AUC on LaSOT, establishing a new state of the art. Code and models are available here.
Comment: CVPR 2023 paper

arXiv.org e-Print Archive
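
The SeqTrack abstract above describes predicting a bounding box as a short token sequence trained with plain cross-entropy. The sketch below illustrates that general idea only and is not the authors' implementation: the bin count `NUM_BINS`, the `(x, y, w, h)` box layout, the square `image_size`, and all function names are assumptions introduced here for illustration.

```python
import math

NUM_BINS = 1000  # assumed vocabulary size; the abstract does not state one


def box_to_tokens(box, image_size, num_bins=NUM_BINS):
    """Quantize an assumed (x, y, w, h) pixel box into integer tokens."""
    tokens = []
    for value in box:
        normalized = min(max(value / image_size, 0.0), 1.0)
        tokens.append(min(int(normalized * num_bins), num_bins - 1))
    return tokens


def tokens_to_box(tokens, image_size, num_bins=NUM_BINS):
    """Map tokens back to pixel values using the center of each bin."""
    return [(t + 0.5) / num_bins * image_size for t in tokens]


def cross_entropy(logits, target):
    """Cross-entropy of softmax(logits) against one target token index."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum - logits[target]


def greedy_decode(next_token_logits, length=4):
    """Autoregressive greedy decoding.

    `next_token_logits` is a hypothetical callable standing in for a causal
    decoder: it receives the tokens emitted so far and returns logits for
    the next token.
    """
    tokens = []
    for _ in range(length):
        logits = next_token_logits(tokens)
        tokens.append(max(range(len(logits)), key=logits.__getitem__))
    return tokens
```

In such a setup, training would sum `cross_entropy` over the four target tokens, and inference would call `greedy_decode` followed by `tokens_to_box`; no separate classification or regression head is involved.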

CiteTracker: Correlating Image and Text for Visual Tracking

Authors: He Zhenyu, Huang Yuqing, Li Xin, Lu Huchuan, Wang Yaowei, Yang Ming-Hsuan
Publication date: 22/08/2023

Existing visual tracking methods typically take an image patch as the reference of the target to perform tracking. However, a single image patch cannot provide a complete and precise concept of the target object, because images are limited in their ability to abstract and can be ambiguous. This makes it difficult to track targets with drastic variations. In this paper, we propose CiteTracker to enhance target modeling and inference in visual tracking by connecting images and text. Specifically, we develop a text generation module to convert the target image patch into a descriptive text containing its class and attribute information, providing a comprehensive reference point for the target. In addition, a dynamic description module is designed to adapt to target variations for more effective target representation. We then associate the target description and the search image using an attention-based correlation module to generate the correlated features for target state reference. Extensive experiments on five diverse datasets are conducted to evaluate the proposed algorithm, and the favorable performance against state-of-the-art methods demonstrates the effectiveness of the proposed tracking method.
Comment: accepted by ICCV 202

arXiv.org e-Print Archive

Learning to Segment Dynamic Objects using SLAM Outliers

Authors: Bojko Adrian, Borgne Hervé Le, Dupont Romain, Tamaazousti Mohamed
Publication date: 12/11/2020

We present a method to automatically learn to segment dynamic objects using SLAM outliers. It requires only one monocular sequence per dynamic object for training and consists of localizing dynamic objects using SLAM outliers, creating their masks, and using these masks to train a semantic segmentation network. We integrate the trained network into ORB-SLAM 2 and LDSO. At runtime we remove features on dynamic objects, making the SLAM unaffected by them. We also propose a new stereo dataset and new metrics to evaluate SLAM robustness. Our dataset includes consensus inversions, i.e., situations where the SLAM uses more features on dynamic objects than on the static background. Consensus inversions are challenging for SLAM as they may cause major SLAM failures. Our approach performs better than the state of the art on the TUM RGB-D dataset in monocular mode and on our dataset in both monocular and stereo modes.
Comment: Accepted to ICPR 202

arXiv.org e-Print Archive

HAL-CEA
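
The abstract above mentions removing features that fall on dynamic objects at runtime, and defines a consensus inversion as a frame where more features lie on dynamic objects than on the static background. A minimal sketch of both ideas follows. It assumes integer `(x, y)` keypoints and a binary mask stored as rows of 0/1 values, and the function names are hypothetical; it is not taken from the paper's code or from ORB-SLAM 2 or LDSO.

```python
def split_by_mask(keypoints, dynamic_mask):
    """Partition (x, y) keypoints into static and dynamic lists.

    `dynamic_mask[y][x]` is assumed to be truthy where the segmentation
    network marked a pixel as belonging to a dynamic object.
    """
    static, dynamic = [], []
    for x, y in keypoints:
        if dynamic_mask[y][x]:
            dynamic.append((x, y))
        else:
            static.append((x, y))
    return static, dynamic


def is_consensus_inversion(static, dynamic):
    """True when more features sit on dynamic objects than on the background."""
    return len(dynamic) > len(static)
```

Under these assumptions, a SLAM front end would keep only the `static` list for pose estimation, and `is_consensus_inversion` could flag the frames that the abstract describes as challenging.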

Resource-Efficient RGBD Aerial Tracking

Authors: Gao Shang, Leonardis Ales, Li Zhe, Yang Jinyu, Zheng Feng
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 22/06/2023

University of Birmingham Research Portal