Multimodal Representation Learning and Set Attention for LWIR In-Scene Atmospheric Compensation

Abstract

A multimodal generative modeling approach combined with permutation-invariant set attention is investigated in this paper to support long-wave infrared (LWIR) in-scene atmospheric compensation. The generative model can produce realistic atmospheric state vectors (temperature, water vapor, and ozone; T, H2O, O3) and their corresponding transmittance, upwelling radiance, and downwelling radiance (TUD) vectors by sampling a low-dimensional latent space. A variational loss, an LWIR radiative transfer loss, and an atmospheric state loss constrain the latent space, resulting in lower reconstruction error than standard mean-squared-error approaches. A permutation-invariant network predicts the generative model's low-dimensional components from in-scene data, allowing simultaneous estimation of the atmospheric state and the TUD vector. Forward modeling the predicted atmospheric state vector yields a second atmospheric compensation estimate. Results are reported for collected LWIR data and compared to Fast Line-of-Sight Atmospheric Analysis of Hypercubes - Infrared (FLAASH-IR), demonstrating commensurate performance when applied to a target detection scenario. Additionally, an approximately eightfold reduction in detection time is realized with this neural-network-based algorithm compared to FLAASH-IR. Accelerating the target detection pipeline while providing multiple atmospheric estimates is necessary for many real-world, time-sensitive tasks.
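The following is a minimal sketch, not the authors' implementation, of the two components the abstract describes: (1) a variational autoencoder over concatenated atmospheric state and TUD vectors, trained with a composite loss combining a variational (KL) term, a TUD reconstruction term standing in for the LWIR radiative transfer loss, and an atmospheric state term; and (2) a permutation-invariant set attention pooling, in the style of the Set Transformer's pooling-by-multihead-attention, that maps a set of in-scene pixel spectra to the latent components. All layer sizes, class names, and the loss weighting are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AtmosVAE(nn.Module):
    """Encodes concatenated state (T, H2O, O3) and TUD vectors into a
    low-dimensional latent space and decodes both back (assumed structure)."""
    def __init__(self, state_dim, tud_dim, latent_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(state_dim + tud_dim, 256), nn.ReLU())
        self.to_mu = nn.Linear(256, latent_dim)
        self.to_logvar = nn.Linear(256, latent_dim)
        self.decode_state = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                          nn.Linear(256, state_dim))
        self.decode_tud = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                        nn.Linear(256, tud_dim))

    def forward(self, state, tud):
        h = self.encoder(torch.cat([state, tud], dim=-1))
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return self.decode_state(z), self.decode_tud(z), mu, logvar

def composite_loss(state, tud, state_hat, tud_hat, mu, logvar, beta=1e-3):
    # Variational (KL) + TUD reconstruction + atmospheric state terms;
    # beta is a hypothetical weighting, not a value from the paper.
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return (nn.functional.mse_loss(tud_hat, tud)
            + nn.functional.mse_loss(state_hat, state)
            + beta * kl)

class SetAttentionPredictor(nn.Module):
    """Permutation-invariant map from a set of in-scene pixel spectra,
    shaped (batch, n_pixels, n_bands), to the latent components, using
    attention over a single learned seed query (PMA-style pooling)."""
    def __init__(self, n_bands, latent_dim=16, d_model=128, n_heads=4):
        super().__init__()
        self.embed = nn.Linear(n_bands, d_model)
        self.seed = nn.Parameter(torch.randn(1, 1, d_model))
        self.pool = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.head = nn.Linear(d_model, latent_dim)

    def forward(self, pixels):
        x = self.embed(pixels)                   # (B, N, d_model)
        q = self.seed.expand(x.size(0), -1, -1)  # one query per scene
        pooled, _ = self.pool(q, x, x)           # attend over the pixel set
        return self.head(pooled.squeeze(1))      # latent-space estimate
```

Under this sketch, decoding the predicted latent vector yields the TUD estimate directly, while the decoded atmospheric state vector can be passed through a radiative transfer forward model to produce the second compensation estimate the abstract mentions.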
