Learning Motion Constraint-Based Spatio-Temporal Networks for Infrared Dim Target Detections

Abstract

Efficient detection of dim infrared targets is challenged by low signal-to-noise ratios (SNRs). Traditional methods rely on gradient differences and fixed-parameter models, and therefore fail to adapt to the complex and variable scenes encountered in the real world. To tackle this issue, a deep learning method based on a spatio-temporal network is proposed in this paper. The model is built from Convolutional Long Short-Term Memory (Conv-LSTM) cells and 3D convolution (3D-Conv) cells. It is trained to learn the motion constraints of moving targets (the spatio-temporal constraint module, STM) and to fuse multiscale local features of the target and background (the deep spatial features module, DFM). In addition, a variable-interval search module (the state-aware module, STAM) is added at inference time. This module triggers a global search over the image only when the target is lost due to fast motion, occlusion, or frame loss. Comprehensive experiments show that the proposed method outperforms all baseline methods. On the mid-wave infrared datasets collected by the authors, it achieves a 95.87% detection rate. The SNR of the dataset is around 1–3 dB, and the sequence backgrounds include sky, asphalt roads, and buildings.
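To make the architecture described above concrete, the following is a minimal sketch (not the authors' exact STM or DFM) of how a Conv-LSTM cell and a 3D convolution can be combined to extract spatio-temporal features from an infrared image sequence. All layer names, channel counts, and kernel sizes are illustrative assumptions.

```python
# Sketch of a Conv-LSTM + 3D-Conv spatio-temporal block (illustrative only;
# the paper's STM/DFM details are not reproduced here).
import torch
import torch.nn as nn


class ConvLSTMCell(nn.Module):
    """Convolutional LSTM cell: all gates are computed with 2D convolutions."""

    def __init__(self, in_ch: int, hid_ch: int, k: int = 3):
        super().__init__()
        self.hid_ch = hid_ch
        # One convolution produces all four gates (input, forget, output, candidate).
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c


class SpatioTemporalBlock(nn.Module):
    """Runs a Conv-LSTM over the frames, then fuses them with a 3D convolution."""

    def __init__(self, in_ch: int = 1, hid_ch: int = 16):
        super().__init__()
        self.cell = ConvLSTMCell(in_ch, hid_ch)
        self.conv3d = nn.Conv3d(hid_ch, hid_ch, kernel_size=3, padding=1)

    def forward(self, seq):               # seq: (B, T, C, H, W)
        b, t, _, h, w = seq.shape
        state = (seq.new_zeros(b, self.cell.hid_ch, h, w),
                 seq.new_zeros(b, self.cell.hid_ch, h, w))
        feats = []
        for step in range(t):             # recurrent pass over time
            hy, cy = self.cell(seq[:, step], state)
            state = (hy, cy)
            feats.append(hy)
        vol = torch.stack(feats, dim=2)   # (B, hid_ch, T, H, W)
        return self.conv3d(vol)           # spatio-temporal feature volume


if __name__ == "__main__":
    clip = torch.randn(2, 5, 1, 64, 64)   # dummy 5-frame mid-wave IR clip
    print(SpatioTemporalBlock()(clip).shape)  # torch.Size([2, 16, 5, 64, 64])
```

In this kind of design, the recurrent Conv-LSTM pass captures frame-to-frame motion cues while the 3D convolution fuses the resulting feature maps across time; a downstream detection head (not shown) would predict the dim target location from the fused volume.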
