Does Thermal Really Always Matter for RGB-T Salient Object Detection?
In recent years, RGB-T salient object detection (SOD) has attracted
continuous attention, which makes it possible to identify salient objects in
environments such as low light by introducing thermal images. However, most of
the existing RGB-T SOD models focus on how to perform cross-modality feature
fusion, ignoring whether the thermal image really always matters in the SOD task.
Starting from the definition and nature of this task, this paper rethinks the
connotation of thermal modality, and proposes a network named TNet to solve the
RGB-T SOD task. In this paper, we introduce a global illumination estimation
module to predict the global illuminance score of the image, so as to regulate
the role played by the two modalities. In addition, considering the role of
thermal modality, we set up different cross-modality interaction mechanisms in
the encoding phase and the decoding phase. On the one hand, we introduce a
semantic constraint provider to enrich the semantics of thermal images in the
encoding phase, which makes thermal modality more suitable for the SOD task. On
the other hand, we introduce a two-stage localization and complementation
module in the decoding phase to transfer object localization cues and internal
integrity cues from thermal features to the RGB modality. Extensive experiments on
three datasets show that the proposed TNet achieves competitive performance
compared with 20 state-of-the-art methods.
Comment: Accepted by IEEE Trans. Multimedia 2022, 13 pages, 9 figures
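As a rough illustration of how a global illuminance score could regulate the role of the two modalities, the sketch below blends RGB and thermal features with a single scalar weight. This is not the TNet implementation: the function names are hypothetical, mean pixel brightness stands in for the learned illumination estimation module, and the convex blend is only one plausible gating rule.

```python
def illumination_score(rgb_pixels):
    """Hypothetical stand-in for a learned estimator.

    Returns the mean brightness of 8-bit RGB pixels, scaled to [0, 1].
    """
    values = [channel for pixel in rgb_pixels for channel in pixel]
    return sum(values) / (255.0 * len(values))


def fuse_features(rgb_feat, thermal_feat, score):
    """Assumed gating rule: a convex blend of the two feature vectors.

    A score near 1 (bright scene) favors RGB; a score near 0 favors thermal.
    """
    if len(rgb_feat) != len(thermal_feat):
        raise ValueError("feature vectors must have the same length")
    return [score * r + (1.0 - score) * t
            for r, t in zip(rgb_feat, thermal_feat)]


# Usage: a dark image pushes the fused features toward the thermal branch.
dark_image = [(0, 0, 0), (10, 10, 10)]
fused = fuse_features([1.0, 2.0], [3.0, 4.0], illumination_score(dark_image))
```

A scalar weight keeps the sketch simple; a learned module could just as well produce per-channel or per-pixel weights.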
Multimodal Sensor Fusion In Single Thermal image Super-Resolution
With the fast growth in the visual surveillance and security sectors, thermal
infrared images have become increasingly necessary in a large variety of
industrial applications. This is true even though IR sensors are still more
expensive than their RGB counterparts with the same resolution. In this paper,
we propose a deep learning solution to enhance the thermal image resolution.
The following results are given: (I) Introduction of a multimodal,
visual-thermal fusion model that addresses thermal image super-resolution by
integrating high-frequency information from the visual image. (II)
Investigation of different network architecture schemes in the literature,
their up-sampling methods, learning procedures, and their optimization functions,
showing their beneficial contribution to the super-resolution problem. (III)
A benchmark ULB17-VT dataset that contains thermal images and their visual
counterparts is presented. (IV) Presentation of a qualitative evaluation
of a large test set with 58 samples and 22 raters, which shows that our proposed
model outperforms state-of-the-art methods.
An Integrated Enhancement Solution for 24-hour Colorful Imaging
The current industry practice for 24-hour outdoor imaging is to use a silicon
camera supplemented with near-infrared (NIR) illumination. This results in
color images with poor contrast in the daytime and no chrominance at
night. To resolve this dilemma, all existing solutions try to capture RGB and NIR
images separately. However, they need additional hardware support and suffer
from various drawbacks, including short service life, high price, specific
usage scenario, etc. In this paper, we propose a novel and integrated
enhancement solution that produces clear color images, whether in abundant
daytime sunlight or at extremely low-light nighttime. Our key idea is to separate
the VIS and NIR information from mixed signals, and enhance the VIS signal
adaptively with the NIR signal as assistance. To this end, we build an optical
system to collect a new VIS-NIR-MIX dataset and present a physically meaningful
image processing algorithm based on CNN. Extensive experiments show outstanding
results, which demonstrate the effectiveness of our solution.
Comment: AAAI 2020 (Oral)
Flare-Aware Cross-modal Enhancement Network for Multi-spectral Vehicle Re-identification
Multi-spectral vehicle re-identification aims to address the challenge of
identifying vehicles in complex lighting conditions by incorporating
complementary visible and infrared information. However, in harsh environments,
the discriminative cues in RGB and NIR modalities are often lost due to strong
flares from vehicle lamps or sunlight, and existing multi-modal fusion methods
are limited in their ability to recover these important cues. To address this
problem, we propose a Flare-Aware Cross-modal Enhancement Network that
adaptively restores flare-corrupted RGB and NIR features with guidance from the
flare-immunized thermal infrared spectrum. First, to reduce the influence of
locally degraded appearance due to intense flare, we propose a Mutual Flare
Mask Prediction module to jointly obtain flare-corrupted masks in RGB and NIR
modalities in a self-supervised manner. Second, to use the flare-immunized TI
information to enhance the masked RGB and NIR, we propose a Flare-Aware
Cross-modal Enhancement module that adaptively guides feature extraction of
masked RGB and NIR spectra with prior flare-immunized knowledge from the TI
spectrum. Third, to extract common informative semantic information from RGB
and NIR, we propose an Inter-modality Consistency loss that enforces semantic
consistency between the two modalities. Finally, to evaluate the proposed
FACENet in handling intense flare, we introduce a new multi-spectral vehicle
re-ID dataset, called WMVEID863, with additional challenges such as motion
blur, significant background changes, and particularly intense flare
degradation. Comprehensive experiments on both the newly collected dataset and
public benchmark multi-spectral vehicle re-ID datasets demonstrate the superior
performance of the proposed FACENet compared to state-of-the-art methods,
especially in handling strong flares. The code and dataset will be released
soon.
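To make the idea of an inter-modality consistency loss concrete, the sketch below penalizes the mean squared difference between an RGB feature vector and an NIR feature vector. The abstract does not state the form of FACENet's loss, so the squared-error choice and the function name are assumptions made purely for illustration.

```python
def consistency_loss(rgb_feat, nir_feat):
    """Assumed form: mean squared difference between two feature vectors.

    The loss is zero when the two modalities produce identical features
    and grows as their representations diverge.
    """
    if len(rgb_feat) != len(nir_feat):
        raise ValueError("feature vectors must have the same length")
    squared_diffs = [(a - b) ** 2 for a, b in zip(rgb_feat, nir_feat)]
    return sum(squared_diffs) / len(squared_diffs)


# Usage: identical features give zero loss; diverging features are penalized.
aligned = consistency_loss([1.0, 2.0], [1.0, 2.0])
diverged = consistency_loss([0.0, 0.0], [1.0, 3.0])
```

Other distance measures, such as a cosine or divergence-based term, would fit the same description equally well.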