62 research outputs found
Deep Blind Super-Resolution for Satellite Video
Recent efforts have brought remarkable progress in Satellite Video
Super-Resolution (SVSR). However, most SVSR methods assume the degradation is
fixed and known, e.g., bicubic downsampling, which makes them fragile in
real-world scenes with multiple, unknown degradations. To alleviate this
issue, blind SR has become a research hotspot. Nevertheless, existing
approaches mainly focus on blur kernel estimation while losing sight of
another aspect critical to VSR tasks: temporal compensation, especially
compensating blurry, smooth pixels with vital sharpness drawn from severely
degraded satellite videos. This paper therefore proposes a practical Blind
SVSR algorithm (BSVSR) that exploits sharper cues by considering pixel-wise
blur levels in a coarse-to-fine manner.
Specifically, we employ multi-scale deformable convolution to coarsely
aggregate temporal redundancy from adjacent frames via sliding-window
progressive fusion. The adjacent features are then finely merged into a
mid-feature using deformable attention, which measures the blur level of each
pixel and assigns larger weights to informative pixels, thereby strengthening
the representation of sharpness. Moreover, we devise a pyramid spatial
transformation module to adjust the solution space of the sharp mid-feature,
yielding flexible feature adaptation across multi-level domains. Quantitative
and qualitative evaluations on both simulated and real-world satellite videos
demonstrate that our BSVSR performs favorably against state-of-the-art
non-blind and blind SR models. Code will be available at
https://github.com/XY-boy/Blind-Satellite-VSR
Comment: Published in IEEE TGR
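The blur-aware temporal merging described in this abstract can be illustrated with a toy sketch (not the authors' code): per-pixel sharpness is estimated with a Laplacian proxy and used to softmax-weight the fusion across frames, so sharper pixels contribute more to the merged mid-feature. The function names and the Laplacian score are assumptions made purely for illustration.

```python
import numpy as np

def laplacian_sharpness(frame):
    """Per-pixel sharpness proxy: magnitude of a discrete Laplacian."""
    lap = (-4 * frame
           + np.roll(frame, 1, 0) + np.roll(frame, -1, 0)
           + np.roll(frame, 1, 1) + np.roll(frame, -1, 1))
    return np.abs(lap)

def blur_aware_fusion(frames, temperature=1.0):
    """Merge aligned frames into one feature map, weighting each pixel
    by its estimated sharpness (softmax over the temporal axis)."""
    frames = np.stack(frames)                      # (T, H, W)
    scores = np.stack([laplacian_sharpness(f) for f in frames])
    w = np.exp(scores / temperature)
    w /= w.sum(axis=0, keepdims=True)              # per-pixel temporal weights
    return (w * frames).sum(axis=0)

# toy example: a sharp frame and a vertically blurred copy
rng = np.random.default_rng(0)
sharp = rng.random((8, 8))
blurred = (sharp + np.roll(sharp, 1, 0) + np.roll(sharp, -1, 0)) / 3.0
fused = blur_aware_fusion([sharp, blurred])
print(fused.shape)  # (8, 8)
```

The real BSVSR performs this weighting inside deformable attention over learned features rather than raw pixels; the sketch only shows the "more weight to informative pixels" idea.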
DeepSUM: Deep Neural Network for Super-Resolution of Unregistered Multitemporal Images
Recently, convolutional neural networks (CNNs) have been successfully applied to many remote sensing problems. However, deep learning techniques for multi-image super-resolution (SR) from multitemporal unregistered imagery have received little attention so far. This article proposes a novel CNN-based technique that exploits both spatial and temporal correlations to combine multiple images. This novel framework integrates the spatial registration task directly inside the CNN, and allows one to exploit the representation learning capabilities of the network to enhance registration accuracy. The entire SR process relies on a single CNN with three main stages: shared 2-D convolutions to extract high-dimensional features from the input images; a subnetwork proposing registration filters derived from the high-dimensional feature representations; 3-D convolutions for slow fusion of the features from multiple images. The whole network can be trained end-to-end to recover a single high-resolution image from multiple unregistered low-resolution images. The method presented in this article is the winner of the PROBA-V SR challenge issued by the European Space Agency (ESA)
Recent Advances in Image Restoration with Applications to Real World Problems
In the past few decades, imaging hardware has improved tremendously in terms of resolution, enabling the widespread use of images in many diverse applications in Earth and planetary missions. However, practical issues associated with image acquisition still affect image quality. Some of these issues, such as blurring, measurement noise, mosaicing artifacts, and low spatial or spectral resolution, can seriously affect the accuracy of the aforementioned applications. This book intends to provide the reader with a glimpse of the latest developments and recent advances in image restoration, which include image super-resolution, image fusion to enhance spatial, spectral, and temporal resolution, and the generation of synthetic images using deep learning techniques. Some practical applications are also included
SEG-ESRGAN: A multi-task network for super-resolution and semantic segmentation of remote sensing images
The production of highly accurate land cover maps is one of the primary challenges in remote sensing, and it depends on the spatial resolution of the input images. Sometimes, high-resolution imagery is not available or is too expensive to cover large areas or to perform multitemporal analysis. In this context, we propose a multi-task network that takes advantage of freely available Sentinel-2 imagery to produce a super-resolved image, with a scaling factor of 5, together with the corresponding high-resolution land cover map. Our proposal, named SEG-ESRGAN, consists of two branches: a super-resolution branch, which produces Sentinel-2 multispectral images at 2 m resolution, and an encoder–decoder semantic segmentation branch, which generates the enhanced land cover map. From the super-resolution branch, several skip connections are retrieved and concatenated with features from the different stages of the encoder part of the segmentation branch, promoting the flow of meaningful information to boost accuracy in the segmentation task. Our model is trained with a multi-loss approach on a novel dataset, developed from Sentinel-2 and WorldView-2 image pairs, for the super-resolution stage. In addition, we generated a dataset with ground-truth labels for the segmentation task. To assess the super-resolution improvement, the PSNR, SSIM, ERGAS, and SAM metrics were considered, while classification performance was measured with the IoU, the confusion matrix, and the F1-score.
Experimental results demonstrate that the SEG-ESRGAN model outperforms different full segmentation and dual-network models (U-Net, DeepLabV3+, HRNet, and Dual_DeepLab), enabling the generation of high-resolution land cover maps in challenging scenarios using Sentinel-2 10 m bands.
This work was funded by the Spanish Agencia Estatal de Investigación (AEI) under projects ARTEMISAT-2 (CTM 2016-77733-R), PID2020-117142GB-I00, and PID2020-116907RB-I00 (MCIN/AEI call 10.13039/501100011033). L.S. would like to acknowledge the BECAL (Becas Carlos Antonio López) scholarship for the financial support.
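Two of the evaluation metrics named in this abstract, PSNR for the super-resolution branch and per-class IoU for the segmentation branch, are simple to compute; a minimal sketch with hypothetical helper names:

```python
import numpy as np

def psnr(ref, test, peak=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, peak]."""
    mse = np.mean((ref - test) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(peak ** 2 / mse)

def iou(pred, target, n_classes):
    """Per-class intersection-over-union for integer label maps."""
    ious = []
    for c in range(n_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        ious.append(float(inter / union) if union else float("nan"))
    return ious

ref = np.zeros((4, 4)); noisy = ref + 0.1
print(round(psnr(ref, noisy), 2))   # 20.0
labels = np.array([[0, 0], [1, 1]])
pred = np.array([[0, 1], [1, 1]])
print(iou(pred, labels, 2))         # [0.5, 0.6666666666666666]
```

ERGAS and SAM follow the same per-pixel pattern but additionally account for band-wise means and spectral angles, respectively.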
Deep learning for inverse problems in remote sensing: super-resolution and SAR despeckling
The abstract is in the attachment.
FedDiff: Diffusion Model Driven Federated Learning for Multi-Modal and Multi-Clients
With the rapid development of imaging sensor technology in the field of
remote sensing, multi-modal remote sensing data fusion has emerged as a crucial
research direction for land cover classification tasks. While diffusion models
have made great progress in generative models and image classification tasks,
existing models primarily focus on single-modality, single-client control;
that is, the diffusion process is driven by a single modality on a single
computing node. To facilitate the secure fusion of heterogeneous data from
clients, it is necessary to enable distributed multi-modal control, such as
merging the hyperspectral data of organization A and the LiDAR data of
organization B privately on each base station client. In this study, we propose
a multi-modal collaborative diffusion federated learning framework called
FedDiff. Our framework establishes a dual-branch diffusion feature extraction
setup, in which the two modalities are fed into separate encoder branches. Our
key insight is that diffusion models driven by different modalities are
inherently complementary in their latent denoising steps, across which
bilateral connections can be built. Considering the challenge of
private and efficient communication between multiple clients, we embed the
diffusion model into the federated learning communication structure, and
introduce a lightweight communication module. Qualitative and quantitative
experiments validate the superiority of our framework in terms of image quality
and conditional consistency
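FedDiff's lightweight communication module is diffusion-specific, but the underlying federated aggregation it builds on can be illustrated with plain FedAvg. This is a generic stand-in, not the paper's method; the client names and sizes are invented for the example.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Weighted average of per-client model parameters (classic FedAvg).
    Each client's contribution is proportional to its local dataset size."""
    total = sum(client_sizes)
    return [
        sum(w[i] * (n / total) for w, n in zip(client_weights, client_sizes))
        for i in range(len(client_weights[0]))
    ]

# two hypothetical clients (e.g., a hyperspectral station and a LiDAR
# station), each holding a one-parameter-tensor model
a = [np.array([1.0, 2.0])]
b = [np.array([3.0, 4.0])]
merged = fedavg([a, b], client_sizes=[1, 3])
print(merged[0])  # [2.5 3.5]
```

In a setting like FedDiff, only compact intermediate representations would cross the network rather than full parameter tensors, which is what the lightweight communication module addresses.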
Continuous Remote Sensing Image Super-Resolution based on Context Interaction in Implicit Function Space
Despite its fruitful applications in remote sensing, image super-resolution
is cumbersome to train and deploy because it handles different magnifications
with separate models. Accordingly, we propose a highly applicable
super-resolution framework called FunSR, which handles different
magnifications with a unified model by exploiting context interaction within
an implicit function space. FunSR comprises a functional representor, a
functional interactor, and a functional parser. Specifically, the representor
transforms the low-resolution image from Euclidean space to multi-scale
pixel-wise function maps; the interactor enables pixel-wise function expression
with global dependencies; and the parser, which is parameterized by the
interactor's output, converts the discrete coordinates with additional
attributes to RGB values. Extensive experiments demonstrate that FunSR
achieves state-of-the-art performance under both fixed-magnification and
continuous-magnification settings; moreover, its unified design enables many
convenient applications
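The parser's coordinate-to-RGB decoding at arbitrary magnification can be sketched with bilinear feature sampling plus an untrained linear head standing in for the MLP parser. All names below are illustrative assumptions; the real parser is parameterized by the interactor's output rather than fixed random weights.

```python
import numpy as np

def bilinear_sample(feat, ys, xs):
    """Sample an (H, W, C) feature map at continuous coordinates."""
    h, w, _ = feat.shape
    y0 = np.clip(np.floor(ys).astype(int), 0, h - 2)
    x0 = np.clip(np.floor(xs).astype(int), 0, w - 2)
    dy, dx = ys - y0, xs - x0
    return ((1 - dy)[:, None] * (1 - dx)[:, None] * feat[y0, x0]
            + (1 - dy)[:, None] * dx[:, None] * feat[y0, x0 + 1]
            + dy[:, None] * (1 - dx)[:, None] * feat[y0 + 1, x0]
            + dy[:, None] * dx[:, None] * feat[y0 + 1, x0 + 1])

def decode_any_scale(feat, scale, head_w):
    """Query the implicit function on a grid of any density, then map the
    sampled features to RGB with a linear head (MLP-parser stand-in)."""
    h, w, _ = feat.shape
    ys = np.linspace(0, h - 1, int(h * scale))
    xs = np.linspace(0, w - 1, int(w * scale))
    gy, gx = np.meshgrid(ys, xs, indexing="ij")
    samples = bilinear_sample(feat, gy.ravel(), gx.ravel())
    return (samples @ head_w).reshape(len(ys), len(xs), 3)

rng = np.random.default_rng(2)
feat = rng.random((8, 8, 16))                     # low-res feature map
head = rng.random((16, 3))
print(decode_any_scale(feat, 2.5, head).shape)    # (20, 20, 3)
```

Because the output grid density is a free parameter, one trained model can serve any magnification, including non-integer ones, which is the unified-model property the abstract emphasizes.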
- …