Learning Task-Oriented Flows to Mutually Guide Feature Alignment in Synthesized and Real Video Denoising
Video denoising aims at removing noise from videos to recover clean ones.
Some existing works show that optical flow can help the denoising by exploiting
the additional spatial-temporal clues from nearby frames. However, the flow
estimation itself is also sensitive to noise, and can be unusable under large
noise levels. To this end, we propose a new multi-scale refined optical
flow-guided video denoising method, which is more robust to different noise
levels. Our method mainly consists of a denoising-oriented flow refinement
(DFR) module and a flow-guided mutual denoising propagation (FMDP) module.
Unlike previous works that directly use off-the-shelf flow solutions, DFR first
learns robust multi-scale optical flows, and FMDP makes use of the flow
guidance by progressively introducing and refining more flow information from
low resolution to high resolution. Together with real noise degradation
synthesis, the proposed multi-scale flow-guided denoising network achieves
state-of-the-art performance on both synthetic Gaussian denoising and real
video denoising. The code will be made publicly available.
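The flow-guided alignment idea in this abstract can be illustrated with a minimal sketch. It is not the DFR or FMDP module: it assumes an integer-valued flow field that is already known, uses nearest-neighbour sampling with border clamping, and fuses frames by plain averaging. The names `warp` and `fuse_frames` are hypothetical.

```python
def warp(frame, flow):
    """Backward-warp `frame` (list of rows) using per-pixel integer flow (dx, dy).

    Output pixel (x, y) is read from frame[y + dy][x + dx], clamped to the border.
    """
    h, w = len(frame), len(frame[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            dx, dy = flow[y][x]
            sx = min(max(x + dx, 0), w - 1)  # clamp source column
            sy = min(max(y + dy, 0), h - 1)  # clamp source row
            row.append(frame[sy][sx])
        out.append(row)
    return out


def fuse_frames(reference, neighbor, flow):
    """Align `neighbor` to `reference` with `flow`, then average the two frames."""
    aligned = warp(neighbor, flow)
    return [[(r + a) / 2 for r, a in zip(ref_row, al_row)]
            for ref_row, al_row in zip(reference, aligned)]


if __name__ == "__main__":
    reference = [[0, 10, 20, 30]]
    neighbor = [[0, 0, 10, 20]]          # same content, shifted right by one pixel
    flow = [[(1, 0)] * 4]                # assumed known: look one pixel to the right
    print(warp(neighbor, flow))
    print(fuse_frames(reference, neighbor, flow))
```

Averaging only helps when the flow is accurate, which is why noise-robust flow estimation matters in the abstract's argument.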
Image Denoising: Invertible and General Denoising Frameworks
The widespread use of digital cameras has resulted in a massive number of images being taken every day. However, due to the limitations of sensors and environments such as light conditions, the images are usually contaminated by noise. Obtaining visually clean images is essential for the accuracy of downstream high-level vision tasks. Thus, denoising is a crucial preprocessing step.
A fundamental challenge in image denoising is to restore recognizable frequencies in edge and fine-scaled texture regions. Traditional methods usually employ hand-crafted priors to enhance the restoration of these high frequency regions, which seem to be omitted in current deep learning models. We explored whether the clean gradients can be utilized in deep networks as a prior as well as how to incorporate this prior in the networks to boost recovery of missing or obscured picture elements. We present results showing that fusing the pre-denoised images' gradient in the shallow layer contributes to recovering better edges and textures. We also propose a regularization loss term to ensure that the reconstructed images' gradients are close to the clean gradients. Both techniques are indispensable for enhancing the restored image frequencies.
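A gradient regularization term of the kind described above can be sketched as follows. This is an illustrative assumption, not the thesis's exact loss: it uses forward differences and a mean absolute error between the gradients of the restored and clean images, and the function names are hypothetical.

```python
def gradients(img):
    """Forward-difference horizontal and vertical gradients of a 2D image (list of rows)."""
    h, w = len(img), len(img[0])
    gx = [[img[y][x + 1] - img[y][x] for x in range(w - 1)] for y in range(h)]
    gy = [[img[y + 1][x] - img[y][x] for x in range(w)] for y in range(h - 1)]
    return gx, gy


def mean_abs_diff(a, b):
    """Mean absolute difference between two equally shaped 2D lists."""
    total, count = 0.0, 0
    for row_a, row_b in zip(a, b):
        for va, vb in zip(row_a, row_b):
            total += abs(va - vb)
            count += 1
    return total / count


def gradient_loss(restored, clean):
    """Penalize restored-image gradients that deviate from the clean gradients."""
    gx_r, gy_r = gradients(restored)
    gx_c, gy_c = gradients(clean)
    return mean_abs_diff(gx_r, gx_c) + mean_abs_diff(gy_r, gy_c)


if __name__ == "__main__":
    clean = [[0, 0, 1, 1], [0, 0, 1, 1]]                  # sharp vertical edge
    blurred = [[0, 0.25, 0.75, 1], [0, 0.25, 0.75, 1]]    # same edge, smoothed
    print(gradient_loss(clean, clean))
    print(gradient_loss(blurred, clean))
```

The loss is zero for an exact match and grows as edges are smoothed, which is the behaviour such a term is meant to discourage. In training it would be added to a pixel-wise reconstruction loss with a weighting factor.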
We also studied how to make the network preserve input information for better restoration of the high-frequency details. Using the definition of mutual information, we showed that invertibility is indispensable for lossless information preservation. We then proposed the Invertible Restoring Autoencoder (IRAE) network, a multiscale invertible encoder-decoder network. The superiority of this network was verified on three different low-level tasks: image denoising, JPEG image decompression, and image inpainting. IRAE points to a promising direction for exploring more invertible architectures for image restoration.
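To make the invertibility claim concrete, the sketch below shows an additive coupling layer, a standard building block of invertible networks. Whether IRAE uses exactly this block is an assumption; the point is only that the input can be recovered exactly from the output, so no information is lost, regardless of what `shift_fn` computes.

```python
def coupling_forward(x1, x2, shift_fn):
    """Additive coupling: keep x1, shift x2 by a function of x1."""
    y1 = list(x1)
    y2 = [b + s for b, s in zip(x2, shift_fn(x1))]
    return y1, y2


def coupling_inverse(y1, y2, shift_fn):
    """Exact inverse: recompute the same shift from y1 (= x1) and subtract it."""
    x1 = list(y1)
    x2 = [b - s for b, s in zip(y2, shift_fn(y1))]
    return x1, x2


if __name__ == "__main__":
    # shift_fn stands in for a learned sub-network; it need not be invertible itself.
    shift_fn = lambda v: [3 * a * a - 1 for a in v]
    y1, y2 = coupling_forward([1, 2], [3, 4], shift_fn)
    print(y1, y2)
    print(coupling_inverse(y1, y2, shift_fn))
```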
We attempted to further reduce the model size of invertible restoration networks. Our intuition was to use the same learned parameters to encode the noisy images in the forward pass and reconstruct the clean images in the backward pass. However, existing invertible networks use the same distribution for both the input and the output obtained in the reverse pass. For noise removal, the input is noisy but the reversed output is clean, so the two follow different distributions. This made it challenging to design lightweight invertible architectures for denoising. We presented InvDN, which converts the noisy input into a clean low-resolution image and a noisy latent representation. To address the challenge mentioned above, we replaced the noisy representation with a clean one randomly sampled from a Gaussian distribution during the reverse pass. InvDN achieved state-of-the-art results on real image denoising with far fewer parameters and less run time than existing state-of-the-art models. In addition, InvDN could also generate new noisy images for data augmentation.
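The mechanics of splitting an input into a low-resolution part and a latent part, replacing the latent, and inverting can be sketched with a fixed 1D Haar transform. This is only an analogy under stated assumptions: InvDN learns its invertible transform, whereas the Haar split here is fixed, so the resampled latent does not produce a genuinely denoised result. All function names are hypothetical.

```python
import random


def haar_split(signal):
    """Invertible split of an even-length signal into pairwise means and half-differences."""
    low = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return low, high


def haar_merge(low, high):
    """Exact inverse of haar_split."""
    out = []
    for l, h in zip(low, high):
        out.extend([l + h, l - h])
    return out


def resample_latent(signal, sigma, seed=0):
    """Keep the low-resolution part, replace the latent with Gaussian samples, invert."""
    rng = random.Random(seed)
    low, high = haar_split(signal)
    new_high = [rng.gauss(0.0, sigma) for _ in high]
    return haar_merge(low, new_high)


if __name__ == "__main__":
    noisy = [1.0, 3.0, 6.0, 2.0]
    low, high = haar_split(noisy)
    print(haar_merge(low, high))               # round trip recovers the input
    print(resample_latent(noisy, sigma=0.0))   # latent replaced by zeros
```

With `sigma=0.0` the output is just the pairwise means repeated. In a trained network, the reverse pass would instead map the sampled latent to plausible clean detail.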
We also rethought image denoising from a novel aspect and introduced a more general denoising framework. Our framework utilized invertible networks to learn a noisy image distribution, which could be considered as the joint distribution of clean content and noise. The noisy input was mapped to representations in the latent space. A novel disentanglement strategy was applied to the latent representations to obtain the representations for the clean content, which were passed to the reversed network to get the clean image. Since this concept was a novel attempt, we also explored different data augmentation and training strategies for this framework. The proposed FDN was trained and tested from simple to complex tasks on distribution-clear class-specific synthetic noisy datasets, more general remote sensing datasets, and real noisy datasets, and achieved competitive results with fewer parameters and faster speed. This work contributed a novel perspective and a potential direction for designing low-level task models in the future.
Multi-scale Adaptive Fusion Network for Hyperspectral Image Denoising
Removing the noise and improving the visual quality of hyperspectral images
(HSIs) is challenging in academia and industry. Great efforts have been made to
leverage local, global or spectral context information for HSI denoising.
However, existing methods still have limitations in feature interaction
exploitation among multiple scales and rich spectral structure preservation. In
view of this, we propose a novel solution to investigate the HSI denoising
using a Multi-scale Adaptive Fusion Network (MAFNet), which can learn the
complex nonlinear mapping between clean and noisy HSI. Two key components
contribute to improving the hyperspectral image denoising: a progressive
multiscale information aggregation network and a co-attention fusion module.
Specifically, we first generate a set of multiscale images and feed them into a
coarse-fusion network to exploit the contextual texture correlation.
Thereafter, a fine fusion network follows to exchange the information
across the parallel multiscale subnetworks. Furthermore, we design a
co-attention fusion module to adaptively emphasize informative features from
different scales, and thereby enhance the discriminative learning capability
for denoising. Extensive experiments on synthetic and real HSI datasets
demonstrate that the proposed MAFNet has achieved better denoising performance
than other state-of-the-art techniques. Our codes are available at
https://github.com/summitgao/MAFNet. Comment: IEEE JSTARS 2023.
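Adaptive emphasis of features from different scales can be sketched as a softmax-weighted sum. This is a simplified assumption and not MAFNet's actual co-attention module: here each scale gets a single scalar score (hypothetically produced by a learned layer) and the features are already resized to a common length.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scalar scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scales(features, scores):
    """Weighted sum of per-scale feature vectors, with weights = softmax(scores)."""
    weights = softmax(scores)
    n = len(features[0])
    return [sum(w * f[i] for w, f in zip(weights, features)) for i in range(n)]


if __name__ == "__main__":
    features = [[1.0, 2.0], [3.0, 4.0]]   # two scales, two channels each
    print(fuse_scales(features, [0.0, 0.0]))    # equal scores: plain average
    print(fuse_scales(features, [0.0, 5.0]))    # second scale dominates
```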