Learning Target-oriented Dual Attention for Robust RGB-T Tracking
RGB-Thermal object tracking attempts to locate a target object using
complementary visible and thermal infrared data. Existing RGB-T trackers fuse
the two modalities through robust feature representation learning or adaptive
modal weighting. However, how to integrate dual attention mechanisms into
visual tracking has not yet been studied. In this paper, we propose two visual
attention mechanisms for robust RGB-T object tracking. Specifically, local
attention is implemented by exploiting the common visual attention of the RGB
and thermal data to train deep classifiers. We also introduce global attention,
a multi-modal target-driven attention estimation network that provides global
proposals for the classifier alongside the local proposals extracted from the
previous tracking result. Extensive experiments on two RGB-T benchmark datasets
validate the effectiveness of the proposed algorithm. Comment: Accepted by IEEE ICIP 201
Polar Fusion Technique Analysis for Evaluating the Performances of Image Fusion of Thermal and Visual Images for Human Face Recognition
This paper presents a comparative study of two methods based on fusion and
polar transformation of visual and thermal images. We investigate how to handle
the challenges of face recognition, including pose variation, changes in facial
expression, partial occlusion, variation in illumination, rotation through
different angles, and changes in scale. To overcome these obstacles we
implemented and thoroughly examined two fusion techniques through rigorous
experimentation. In the first method, the log-polar transformation is applied
to the image obtained by fusing the visual and thermal images; in the second,
fusion is applied to the log-polar transforms of the individual visual and
thermal images. Principal Component Analysis (PCA) is then applied to the fused
image, obtained in one form or the other, to reduce its dimensionality.
Log-polar transformed images can handle the complications introduced by scaling
and rotation. The main objective of fusion is to produce a fused image that
provides more detailed and reliable information, overcoming the drawbacks of
the individual visual and thermal face images. Finally, the reduced fused
images are classified using a multilayer perceptron neural network. The
experiments use thermal and visual face images from the Object Tracking and
Classification Beyond Visible Spectrum (OTCBVS) benchmark database. The second
method performs better, achieving a correct recognition rate of 95.71%
(maximum) and 93.81% on average. Comment: Proceedings of IEEE Workshop on Computational Intelligence in
Biometrics and Identity Management (IEEE CIBIM 2011), Paris, France, April 11
- 15, 201
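The two core operations of the pipeline above, the log-polar transform and PCA dimensionality reduction, can be sketched directly in NumPy. This is an illustrative sketch, not the paper's implementation; the output grid size, nearest-neighbour sampling, and SVD-based PCA are assumptions.

```python
import numpy as np

def log_polar(img, out_shape=(64, 64)):
    """Resample a grayscale image onto a log-polar grid centred on the
    image centre. Rotation of the input becomes a shift along the angle
    axis; scaling becomes a shift along the log-radius axis, which is
    why the representation tolerates rotation and scale changes.
    """
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    n_rho, n_theta = out_shape
    max_r = min(cx, cy)
    rho = np.exp(np.linspace(0.0, np.log(max_r), n_rho))        # log-spaced radii
    theta = np.linspace(0.0, 2 * np.pi, n_theta, endpoint=False)  # uniform angles
    rr, tt = np.meshgrid(rho, theta, indexing="ij")
    # Nearest-neighbour sampling back into the Cartesian image.
    ys = np.clip(np.round(cy + rr * np.sin(tt)).astype(int), 0, h - 1)
    xs = np.clip(np.round(cx + rr * np.cos(tt)).astype(int), 0, w - 1)
    return img[ys, xs]

def pca_reduce(X, k):
    """Project rows of X (n_samples, n_features) onto the top-k
    principal components via SVD of the centred data."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T
```

Under this sketch, the paper's second method would correspond to averaging `log_polar(visual)` and `log_polar(thermal)` pixelwise and feeding the flattened result to `pca_reduce` before classification; the first method would fuse first and transform afterwards.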
LSOTB-TIR: A Large-Scale High-Diversity Thermal Infrared Object Tracking Benchmark
In this paper, we present a Large-Scale and high-diversity general Thermal
InfraRed (TIR) Object Tracking Benchmark, called LSOTB-TIR, which consists of an
evaluation dataset and a training dataset with a total of 1,400 TIR sequences
and more than 600K frames. We annotate the bounding box of objects in every
frame of all sequences and generate over 730K bounding boxes in total. To the
best of our knowledge, LSOTB-TIR is the largest and most diverse TIR object
tracking benchmark to date. To evaluate a tracker on different attributes, we
define 4 scenario attributes and 12 challenge attributes in the evaluation
dataset. By releasing LSOTB-TIR, we encourage the community to develop deep
learning based TIR trackers and evaluate them fairly and comprehensively. We
evaluate and analyze more than 30 trackers on LSOTB-TIR to provide a series of
baselines, and the results show that deep trackers achieve promising
performance. Furthermore, we re-train several representative deep trackers on
LSOTB-TIR, and their results demonstrate that the proposed training dataset
significantly improves the performance of deep TIR trackers. Codes and dataset
are available at https://github.com/QiaoLiuHit/LSOTB-TIR. Comment: accepted by ACM Multimedia Conference, 202
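Evaluating "more than 30 trackers" on a benchmark like this typically follows the standard success-plot protocol: compute the bounding-box overlap (IoU) between prediction and ground truth in every frame, then report the area under the curve of success rate versus overlap threshold. A minimal sketch of that protocol, assuming `(x, y, w, h)` box format; the specific thresholds and AUC approximation are assumptions, not LSOTB-TIR's toolkit code.

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    ix1, iy1 = max(ax, bx), max(ay, by)
    ix2, iy2 = min(ax + aw, bx + bw), min(ay + ah, by + bh)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def success_auc(pred_boxes, gt_boxes, thresholds=np.linspace(0, 1, 21)):
    """Success curve: fraction of frames whose IoU exceeds each
    threshold. The mean over thresholds approximates the AUC score
    commonly reported on tracking benchmarks."""
    overlaps = np.array([iou(p, g) for p, g in zip(pred_boxes, gt_boxes)])
    curve = [(overlaps > t).mean() for t in thresholds]
    return float(np.mean(curve))
```

A per-attribute breakdown (the 4 scenario and 12 challenge attributes mentioned above) is then just this same score computed over the subset of sequences tagged with each attribute.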