Rethinking the Pipeline of Demosaicing, Denoising and Super-Resolution
Incomplete color sampling, noise degradation, and limited resolution are the
three key problems that are unavoidable in modern camera systems. Demosaicing
(DM), denoising (DN), and super-resolution (SR) are core components in a
digital image processing pipeline to overcome the three problems above,
respectively. Although each of these problems has been studied actively, the
mixture problem of DM, DN, and SR, which is of higher practical value, has
received insufficient attention. Such a mixture problem is usually solved by a sequential
solution (applying each method independently in a fixed order: DM → DN →
SR), or is simply tackled by an end-to-end network without sufficient
analysis of the interactions among tasks, resulting in an undesired performance
drop in the final image quality. In this paper, we rethink the mixture problem
from a holistic perspective and propose a new image processing pipeline: DN →
SR → DM. Extensive experiments show that simply modifying the usual
sequential solution by leveraging our proposed pipeline could enhance the image
quality by a large margin. We further adopt the proposed pipeline into an
end-to-end network, and present Trinity Enhancement Network (TENet).
Quantitative and qualitative experiments demonstrate the superiority of our
TENet to the state-of-the-art. Besides, we notice the literature lacks a full
color sampled dataset. To this end, we contribute a new high-quality full color
sampled real-world dataset, namely PixelShift200. Our experiments show the
benefit of the proposed PixelShift200 dataset for raw image processing.
Comment: Code is available at: https://github.com/guochengqian/TENe
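The reordering the abstract argues for can be sketched as function composition. The stage functions below are hypothetical stand-ins (not the TENet sub-networks); they merely record the order in which the stages run, to make the two pipelines explicit:

```python
# Hypothetical stage implementations: each just appends its tag, so the
# composition order of the two pipelines is visible. A real pipeline would
# apply learned models (e.g. TENet's sub-networks) at each stage.
def denoise(x):       return x + ["DN"]
def super_resolve(x): return x + ["SR"]
def demosaic(x):      return x + ["DM"]

# Usual sequential solution: DM -> DN -> SR
usual = super_resolve(denoise(demosaic([])))

# Pipeline proposed in the paper: DN -> SR -> DM
proposed = demosaic(super_resolve(denoise([])))

print(usual)     # ['DM', 'DN', 'SR']
print(proposed)  # ['DN', 'SR', 'DM']
```

The point of the reordering is that denoising first avoids amplifying noise during upsampling, and demosaicing last avoids super-resolving interpolation artifacts.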
Recursive Generalization Transformer for Image Super-Resolution
Transformer architectures have exhibited remarkable performance in image
super-resolution (SR). Owing to the quadratic computational complexity of
self-attention (SA), existing methods tend to apply SA within
local regions to reduce overhead. However, the local design restricts the
global context exploitation, which is crucial for accurate image
reconstruction. In this work, we propose the Recursive Generalization
Transformer (RGT) for image SR, which can capture global spatial information
and is suitable for high-resolution images. Specifically, we propose the
recursive-generalization self-attention (RG-SA). It recursively aggregates
input features into representative feature maps, and then utilizes
cross-attention to extract global information. Meanwhile, the channel
dimensions of attention matrices (query, key, and value) are further scaled to
mitigate the redundancy in the channel domain. Furthermore, we combine the
RG-SA with local self-attention to enhance the exploitation of the global
context, and propose the hybrid adaptive integration (HAI) for module
integration. The HAI allows direct and effective fusion between features at
different levels (local or global). Extensive experiments demonstrate that our
RGT outperforms recent state-of-the-art methods quantitatively and
qualitatively. Code is released at https://github.com/zhengchen1999/RGT.
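The core idea of RG-SA, as described in the abstract, can be illustrated with a minimal numpy sketch: aggregate the input tokens recursively into a few representative tokens, then cross-attend from all tokens to the representatives with a reduced attention channel dimension. This is a toy illustration under stated assumptions, not the paper's implementation; the pairwise averaging, the target of 8 representatives, the halved channel dimension, and the random projection matrices are all stand-ins for learned components:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)

# Input features: N tokens (flattened H*W spatial positions), C channels.
N, C = 64, 32
x = rng.standard_normal((N, C))

# Step 1 (recursive aggregation, simplified): repeatedly average adjacent
# token pairs until only M representative tokens remain.
rep = x
while rep.shape[0] > 8:
    rep = 0.5 * (rep[0::2] + rep[1::2])   # halve the token count each pass
M = rep.shape[0]                          # M = 8 here

# Step 2: cross-attention from all N tokens to the M representatives, with
# the attention channel dimension scaled down (C -> C // 2) to reduce
# channel redundancy. Random projections stand in for learned weights.
Cr = C // 2
Wq, Wk, Wv = (rng.standard_normal((C, Cr)) for _ in range(3))
q, k, v = x @ Wq, rep @ Wk, rep @ Wv

attn = softmax(q @ k.T / np.sqrt(Cr))     # (N, M): each token sees global context
out = attn @ v                            # (N, Cr): globally aggregated features

print(out.shape)  # (64, 16)
```

The attention matrix is N×M rather than N×N, so the cost grows linearly with the number of tokens once M is fixed, which is what makes the scheme suitable for high-resolution inputs.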
Networks are Slacking Off: Understanding Generalization Problem in Image Deraining
Deep deraining networks, while successful in laboratory benchmarks,
consistently encounter substantial generalization issues when deployed in
real-world applications. A prevailing perspective in deep learning encourages
the use of highly complex training data, with the expectation that a richer
image content knowledge will facilitate overcoming the generalization problem.
However, through comprehensive and systematic experimentation, we discovered
that this strategy does not enhance the generalization capability of these
networks. On the contrary, it exacerbates the tendency of networks to overfit
to specific degradations. Our experiments reveal that better generalization in
a deraining network can be achieved by simplifying the complexity of the
training data. This is because networks slack off during training:
they learn the least complex elements in the image content and
degradation in order to minimize the training loss. When the complexity of the background
image is less than that of the rain streaks, the network will prioritize the
reconstruction of the background, thereby avoiding overfitting to the rain
patterns and resulting in improved generalization performance. Our research not
only offers a valuable perspective and methodology for better understanding the
generalization problem in low-level vision tasks, but also shows promising
practical potential.