452 research outputs found

Dense Feature Aggregation and Pruning for RGBT Tracking

Author: Bertinetto Luca
Boyu Chen
Chenglong Li
Choi Jongwon
Conaire Ciaran O
Conaire Ciarán
Danelljan M.
Galoogahi Hamed Kiani
Jianhao Luo
Jianming Zhang
Jung Ilchae
Kim Han Ul
Lan Xiangyuan
Li Chenglong
Liu Huaping
Lukezic A.
Nam Hyeonseob
Saihui Hou
Shiyi Hu
Simonyan Karen
Tao Kong
Valmadre Jack
Wu Yi
Yang Li
Yuankai Qi
Zhipeng Zhang
Zhou Bolei
Publication venue: Association for Computing Machinery (ACM)
Publication date: 24/07/2019

Effective fusion of information from different modalities is a core factor in boosting the performance of RGBT tracking. This paper presents a novel deep fusion algorithm based on the representations from an end-to-end trained convolutional neural network. To exploit the complementarity of features from all layers, we propose a recursive strategy that densely aggregates these features, yielding robust representations of target objects in each modality. Across modalities, we propose to prune the densely aggregated features of all modalities in a collaborative way. Specifically, we employ global average pooling and weighted random selection to perform channel scoring and selection, which can remove redundant and noisy features to achieve a more robust feature representation. Experimental results on two RGBT tracking benchmark datasets suggest that our tracker clearly achieves state-of-the-art performance against other RGB and RGBT tracking methods. Comment: arXiv admin note: text overlap with arXiv:1811.0985

arXiv.org e-Print Archive
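
The channel scoring and selection step named in the abstract above (global average pooling followed by weighted random selection) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the nested-list feature layout, the `keep` and `seed` parameters, the clamping constant, and the choice of the Efraimidis-Spirakis sampling rule are all assumptions made for illustration.

```python
import random


def global_average_pool(feature_maps):
    """Return one score per channel: the mean activation of that channel.

    feature_maps is a list of channels, each a 2-D list (rows of floats).
    """
    scores = []
    for channel in feature_maps:
        values = [v for row in channel for v in row]
        scores.append(sum(values) / len(values))
    return scores


def weighted_random_select(scores, keep, rng):
    """Pick `keep` distinct channel indices, favouring high scores.

    Uses the Efraimidis-Spirakis key u ** (1 / w): channels with a larger
    weight w tend to receive larger keys, so they are kept more often.
    """
    eps = 1e-12  # assumption: clamp so non-positive scores stay valid weights
    keys = []
    for index, score in enumerate(scores):
        weight = max(score, eps)
        keys.append((rng.random() ** (1.0 / weight), index))
    keys.sort(reverse=True)
    return sorted(index for _, index in keys[:keep])


def prune_channels(feature_maps, keep, seed=0):
    """Zero out every channel that was not selected (hypothetical helper)."""
    rng = random.Random(seed)
    scores = global_average_pool(feature_maps)
    kept = set(weighted_random_select(scores, keep, rng))
    return [
        channel if i in kept else [[0.0] * len(row) for row in channel]
        for i, channel in enumerate(feature_maps)
    ]
```

Because the selection is random rather than a plain top-k, a low-scoring channel still has a small chance of surviving, which is one plausible reading of why the abstract pairs scoring with weighted random selection.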

RGB-T salient object detection via fusing multi-level CNN features

Author: Han Jungong
Huang Nianchang
Shan Caifeng
Yao Lin
Zhang Dingwen
Zhang Qiang
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 17/12/2019

RGB-induced salient object detection has recently witnessed substantial progress, which is attributed to the superior feature learning capability of deep convolutional neural networks (CNNs). However, such detection suffers in challenging scenarios characterized by cluttered backgrounds, low-light conditions and variations in illumination. Instead of improving RGB-based saliency detection, this paper takes advantage of the complementary benefits of RGB and thermal infrared images. Specifically, we propose a novel end-to-end network for multi-modal salient object detection, which turns the challenge of RGB-T saliency detection into a CNN feature fusion problem. To this end, a backbone network (e.g., VGG-16) is first adopted to extract coarse features from each RGB or thermal infrared image individually. Several adjacent-depth feature combination (ADFC) modules are then designed to extract multi-level refined features for each single-modal input image, considering that features captured at different depths differ in semantic information and visual details. Subsequently, a multi-branch group fusion (MGF) module is employed to capture the cross-modal features by fusing the features from the ADFC modules for an RGB-T image pair at each level. Finally, a joint attention guided bi-directional message passing (JABMP) module performs saliency prediction by integrating the multi-level fused features from the MGF modules. Experimental results on several public RGB-T salient object detection datasets demonstrate the superiority of our proposed algorithm over state-of-the-art approaches, especially under challenging conditions such as poor illumination, complex background and low contrast.

Warwick Research Archives Portal Repository

RGB-T Tracking Based on Mixed Attention

Author: Dong Mingtao
Guo Xiqing
Luo Yang
Yu Jin
Publication date: 10/04/2023

RGB-T tracking involves the use of images from both visible and thermal modalities. The primary objective is to adaptively leverage the relatively dominant modality in varying conditions to achieve more robust tracking than single-modality tracking. This paper proposes an RGB-T tracker based on a mixed attention mechanism that achieves complementary fusion of modalities (referred to as MACFT). In the feature extraction stage, we utilize different transformer backbone branches to extract specific and shared information from the different modalities. Mixed attention operations in the backbone enable information interaction and self-enhancement between the template and search images, constructing a robust feature representation that better captures the high-level semantic features of the target. Then, in the feature fusion stage, modality-adaptive fusion is achieved through a mixed attention-based modality fusion network, which suppresses noise from the low-quality modality while enhancing the information of the dominant modality. Evaluation on multiple public RGB-T datasets demonstrates that our proposed tracker outperforms other RGB-T trackers on general evaluation metrics while also being able to adapt to long-term tracking scenarios. Comment: 14 pages, 10 figures

arXiv.org e-Print Archive
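
The abstract above does not define its mixed attention operation. One common reading, assumed here and not confirmed by the source, is joint self-attention over the concatenated template and search tokens, so that each token attends both to its own image (self-enhancement) and to the other image (interaction). The sketch below uses a single head with no learned projections. The function names and the token layout (lists of equal-length float vectors) are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_attention(template_tokens, search_tokens):
    """Joint scaled dot-product attention over template + search tokens.

    Assumption: queries, keys and values are the raw tokens (identity
    projections). Returns (template_out, search_out) with the same shapes
    as the inputs.
    """
    tokens = template_tokens + search_tokens  # concatenate the two sequences
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    outputs = []
    for query in tokens:
        scores = [
            scale * sum(q * k for q, k in zip(query, key)) for key in tokens
        ]
        weights = softmax(scores)
        # Each output is a convex combination of all tokens from both images.
        outputs.append(
            [sum(w * value[d] for w, value in zip(weights, tokens))
             for d in range(dim)]
        )
    split = len(template_tokens)
    return outputs[:split], outputs[split:]
```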

Bounded PCA based Multi Sensor Image Fusion Employing Curvelet Transform Coefficients

Author: Chaudhuri Bidyarthi Baran
Chaudhuri Debasis
Mitra Shantanu
Singh Aniket Kumar
Singh Manish Pratap
Publication venue: Defence Scientific Information & Documentation Centre (DESIDOC), DRDO, India
Publication date: 01/11/2023

The fusion of thermal and visible images is an important tool for target detection. Wavelet-based image fusion improves the quality of the spectral content of the fused image. However, compared to PCA-based fusion, most wavelet-based methods produce results with a lower spatial resolution. Combining the two approaches improves the outcome, but the result can still be refined. Compared to wavelets, curvelet transforms depict the edges in an image more accurately. Because edges are crucial for interpreting images, enhancing them is an effective way to improve spatial resolution. A curvelet-based fusion technique can therefore provide additional information in the spectral and spatial domains simultaneously. In this paper, we combine the Curvelet Transform with a Bounded PCA method (CTBPCA) to fuse thermal and visible images. To demonstrate the improved performance of the proposed technique, we use multiple evaluation metrics and compare against existing image fusion methods. Our approach outperforms the others in both qualitative and quantitative analysis, except in runtime. Future enhancement: further study will use the fused image for target recognition. Future work should also focus on continued improvement of the method and its optimization for real-time video processing.
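
For orientation, the PCA-based fusion that the abstract above builds on can be sketched in its plain, textbook form: the fusion weights come from the principal eigenvector of the 2x2 covariance matrix of the two images. This sketch works directly on pixel values and is not the CTBPCA method. The curvelet decomposition and the bound on the PCA are omitted because the abstract does not specify them. The function names and the choice to take absolute values of the eigenvector components are assumptions.

```python
import math


def pca_fusion_weights(image_a, image_b):
    """Return (w_a, w_b): non-negative fusion weights that sum to 1."""
    a = [v for row in image_a for v in row]
    b = [v for row in image_b for v in row]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    # Entries of the 2x2 covariance matrix [[caa, cab], [cab, cbb]].
    caa = sum((x - mean_a) ** 2 for x in a) / n
    cbb = sum((y - mean_b) ** 2 for y in b) / n
    cab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    # Largest eigenvalue of a symmetric 2x2 matrix, in closed form.
    lam = 0.5 * (caa + cbb + math.sqrt((caa - cbb) ** 2 + 4.0 * cab ** 2))
    if abs(cab) > 1e-12:
        # (cab, lam - caa) is an eigenvector for eigenvalue lam.
        va, vb = cab, lam - caa
    else:
        # Uncorrelated images: the axis with the larger variance dominates.
        va, vb = (1.0, 0.0) if caa >= cbb else (0.0, 1.0)
    # Assumption: use magnitudes so weights stay non-negative even when
    # the two images are negatively correlated.
    va, vb = abs(va), abs(vb)
    total = va + vb
    return va / total, vb / total


def fuse(image_a, image_b):
    """Pixel-wise weighted sum of two equally sized grayscale images."""
    w_a, w_b = pca_fusion_weights(image_a, image_b)
    return [
        [w_a * x + w_b * y for x, y in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]
```

In a transform-domain scheme, the same weighting would typically be applied to corresponding coefficient bands rather than to raw pixels.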

SiamCDA: Complementarity- and distractor-aware RGB-T tracking based on Siamese network

Author: Han Jungong
Liu Xueru
Zhang Qiang
Zhang Tianlu
Publication date: 01/03/2022