69 research outputs found
There and Back Again: Self-supervised Multispectral Correspondence Estimation
Across a wide range of applications, from autonomous vehicles to medical
imaging, multi-spectral images provide an opportunity to extract additional
information not present in color images. One of the most important steps in
making this information readily available is the accurate estimation of dense
correspondences between different spectra.
Due to the nature of cross-spectral images, most correspondence solving
techniques for the visual domain are simply not applicable. Furthermore, most
cross-spectral techniques utilize spectra-specific characteristics to perform
the alignment. In this work, we aim to address the dense correspondence
estimation problem in a way that generalizes to more than one spectrum. We do
this by introducing a novel cycle-consistency metric that allows us to
self-supervise. This, combined with our spectra-agnostic loss functions, allows
us to train the same network across multiple spectra.
We demonstrate our approach on the challenging task of dense RGB-FIR
correspondence estimation. We also show the performance of our unmodified
network on the cases of RGB-NIR and RGB-RGB, where we achieve higher accuracy
than similar self-supervised approaches. Our work shows that cross-spectral
correspondence estimation can be solved in a common framework that learns to
generalize alignment across spectra
Converting Optical Videos to Infrared Videos Using Attention GAN and Its Impact on Target Detection and Classification Performance
To apply powerful deep-learning-based algorithms for object detection and classification in infrared videos, it is necessary to have more training data in order to build high-performance models. However, in many surveillance applications, one can have a lot more optical videos than infrared videos. This lack of IR video datasets can be mitigated if optical-to-infrared video conversion is possible. In this paper, we present a new approach for converting optical videos to infrared videos using deep learning. The basic idea is to focus on target areas using attention generative adversarial network (attention GAN), which will preserve the fidelity of target areas. The approach does not require paired images. The performance of the proposed attention GAN has been demonstrated using objective and subjective evaluations. Most importantly, the impact of attention GAN has been demonstrated in improved target detection and classification performance using real-infrared videos
Reducing the Burden of Aerial Image Labelling Through Human-in-the-Loop Machine Learning Methods
This dissertation presents an introduction to human-in-the-loop deep learning methods for remote sensing applications. It is motivated by the need to decrease the time spent by volunteers on semantic segmentation of remote sensing imagery. We look at two human-in-the-loop approaches of speeding up the labelling of the remote sensing data: interactive segmentation and active learning. We develop these methods specifically in response to the needs of the disaster relief organisations who require accurately labelled maps of disaster-stricken regions quickly, in order to respond to the needs of the affected communities. To begin, we survey the current approaches used within the field. We analyse the shortcomings of these models which include outputs ill-suited for uploading to mapping databases, and an inability to label new regions well, when the new regions differ from the regions trained on. The methods developed then look at addressing these shortcomings. We first develop an interactive segmentation algorithm. Interactive segmentation aims to segment objects with a supervisory signal from a user to assist the model. Work within interactive segmentation has focused largely on segmenting one or few objects within an image. We make a few adaptions to allow an existing method to scale to remote sensing applications where there are tens of objects within a single image that needs to be segmented. We show a quantitative improvements of up to 18% in mean intersection over union, as well as qualitative improvements. The algorithm works well when labelling new regions, and the qualitative improvements show outputs more suitable for uploading to mapping databases. We then investigate active learning in the context of remote sensing. Active learning looks at reducing the number of labelled samples required by a model to achieve an acceptable performance level. Within the context of deep learning, the utility of the various active learning strategies developed is uncertain, with conflicting results within the literature. We evaluate and compare a variety of sample acquisition strategies on the semantic segmentation tasks in scenarios relevant to disaster relief mapping. Our results show that all active learning strategies evaluated provide minimal performance increases over a simple random sample acquisition strategy. However, we present analysis of the results illustrating how the various strategies work and intuition of when certain active learning strategies might be preferred. This analysis could be used to inform future research. We conclude by providing examples of the synergies of these two approaches, and indicate how this work, on reducing the burden of aerial image labelling for the disaster relief mapping community, can be further extended
- …