Depth Estimation and Image Restoration by Deep Learning from Defocused Images
Monocular depth estimation and image deblurring are two fundamental tasks in
computer vision, given their crucial role in understanding 3D scenes.
Performing either of them from a single image is an ill-posed problem.
Recent advances in deep neural networks (DNNs) have revolutionized many tasks
in computer vision, including depth estimation
and image deblurring. When it comes to using defocused images, the depth
estimation and the recovery of the All-in-Focus (AiF) image become related
problems due to defocus physics. Despite this, most of the existing models
treat them separately. There are, however, recent models that solve these
problems simultaneously by concatenating two networks in a sequence to first
estimate the depth or defocus map and then reconstruct the focused image based
on it. We propose a DNN that solves depth estimation and image deblurring in
parallel. Our Two-headed Depth Estimation and Deblurring Network (2HDED:NET)
extends a conventional Depth from Defocus (DFD) network with a deblurring
branch that shares the same encoder as the depth branch. The proposed method
has been successfully tested on two benchmarks, one indoor and one outdoor:
NYU-v2 and Make3D. Extensive experiments with 2HDED:NET on these benchmarks
demonstrate performance superior or close to that of state-of-the-art models
for depth estimation and image deblurring.
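To make the shared-encoder, two-decoder idea concrete, here is a minimal
PyTorch sketch. The layer widths, depths, and loss weighting are illustrative
assumptions, not the architecture of 2HDED:NET itself.

```python
import torch
import torch.nn as nn

class TwoHeadedNet(nn.Module):
    """Shared encoder with parallel depth and deblurring decoder heads."""
    def __init__(self, ch=32):
        super().__init__()
        # Shared encoder: two stride-2 convolutions over the defocused input.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, 2 * ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        def head(out_ch):
            # Decoder head: upsamples features back to input resolution.
            return nn.Sequential(
                nn.ConvTranspose2d(2 * ch, ch, 4, stride=2, padding=1),
                nn.ReLU(inplace=True),
                nn.ConvTranspose2d(ch, out_ch, 4, stride=2, padding=1),
            )
        self.depth_head = head(1)  # depth / defocus map
        self.aif_head = head(3)    # all-in-focus RGB image

    def forward(self, x):
        feats = self.encoder(x)
        return self.depth_head(feats), self.aif_head(feats)

net = TwoHeadedNet()
depth, aif = net(torch.randn(1, 3, 64, 64))
# Joint training would combine both objectives, e.g.
# loss = depth_loss(depth, gt_depth) + lam * deblur_loss(aif, gt_aif)
print(depth.shape, aif.shape)  # (1, 1, 64, 64) and (1, 3, 64, 64)
```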
End-to-end Alternating Optimization for Real-World Blind Super Resolution
Blind Super-Resolution (SR) usually involves two sub-problems: 1) estimating
the degradation of the given low-resolution (LR) image; 2) super-resolving the
LR image to its high-resolution (HR) counterpart. Both problems are ill-posed
due to the information loss in the degradation process. Most previous methods
try to solve the two problems independently, but they often fall into a
dilemma: a good super-resolved HR result requires an accurate degradation
estimate, which, however, is difficult to obtain without the original HR
information. To address this issue, instead of considering these two problems
independently, we adopt an alternating optimization algorithm, which can
estimate the degradation and restore the SR image in a single model.
Specifically, we design two convolutional neural modules, namely
\textit{Restorer} and \textit{Estimator}. \textit{Restorer} restores the SR
image based on the estimated degradation, and \textit{Estimator} estimates the
degradation with the help of the restored SR image. We alternate these two
modules repeatedly and unfold this process to form an end-to-end trainable
network. In this way, both \textit{Restorer} and \textit{Estimator} can
benefit from each other's intermediate results, which makes each sub-problem
easier. Moreover, since \textit{Restorer} and \textit{Estimator} are optimized
end to end, they become more tolerant of each other's estimation errors and
cooperate better to achieve more robust
and accurate final results. Extensive experiments on both synthetic datasets
and real-world images show that the proposed method can largely outperform
state-of-the-art methods and produce more visually favorable results. The code
is released at \url{https://github.com/greatlog/RealDAN.git}.
Comment: Extension of our previous NeurIPS paper. Accepted to IJC
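As a rough illustration of the unfolded alternation, the sketch below swaps a
placeholder Restorer and Estimator for a fixed number of steps inside one
differentiable graph. The single-convolution modules, the 8-dimensional
degradation vector, and the step count are assumptions for brevity, and the
sketch restores at the input resolution rather than super-resolving; the
paper's networks are far more elaborate.

```python
import torch
import torch.nn as nn

class Restorer(nn.Module):
    """Placeholder: maps (LR image, degradation vector) -> restored image."""
    def __init__(self, deg_dim=8):
        super().__init__()
        self.net = nn.Conv2d(3 + deg_dim, 3, 3, padding=1)

    def forward(self, lr, deg):
        # Broadcast the degradation vector across the spatial grid.
        d = deg[:, :, None, None].expand(-1, -1, *lr.shape[-2:])
        return self.net(torch.cat([lr, d], dim=1))

class Estimator(nn.Module):
    """Placeholder: maps (LR image, current restoration) -> degradation vector."""
    def __init__(self, deg_dim=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, deg_dim, 3, padding=1),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )

    def forward(self, lr, sr):
        return self.net(torch.cat([lr, sr], dim=1))

def unfolded_alternation(lr, restorer, estimator, steps=4, deg_dim=8):
    deg = lr.new_zeros(lr.shape[0], deg_dim)  # neutral initial guess
    sr = lr
    for _ in range(steps):       # unfolded: gradients flow through all steps
        sr = restorer(lr, deg)   # restore under the current degradation
        deg = estimator(lr, sr)  # re-estimate degradation from the result
    return sr, deg

sr, deg = unfolded_alternation(torch.rand(1, 3, 32, 32), Restorer(), Estimator())
print(sr.shape, deg.shape)  # (1, 3, 32, 32) and (1, 8)
```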
Learning A Coarse-to-Fine Diffusion Transformer for Image Restoration
Recent years have witnessed the remarkable performance of diffusion models in
various vision tasks. However, for image restoration, which aims to recover
clear images with sharper details from degraded observations, diffusion-based
methods may fail to produce promising results due to inaccurate noise
estimation. Moreover, simply constraining the noise cannot effectively capture
complex degradation information, which in turn limits model capacity.
To solve the above problems, we propose a coarse-to-fine diffusion Transformer
(C2F-DFT) for image restoration. Specifically, our C2F-DFT contains diffusion
self-attention (DFSA) and diffusion feed-forward network (DFN) within a new
coarse-to-fine training scheme. The DFSA and DFN capture long-range diffusion
dependencies and learn hierarchical diffusion representations, respectively,
to facilitate better restoration. In the coarse training stage, our C2F-DFT
estimates noises and then generates the final clean image by a sampling
algorithm. To further improve the restoration quality, we propose a simple yet
effective fine training scheme. It first runs the coarse-trained diffusion
model for a fixed number of sampling steps to generate restoration results,
which are then constrained against the corresponding ground truth to optimize
the model and remedy unsatisfactory results caused by inaccurate noise
estimation.
Extensive experiments show that C2F-DFT significantly outperforms
diffusion-based restoration method IR-SDE and achieves competitive performance
compared with Transformer-based state-of-the-art methods on several tasks,
including deraining, deblurring, and real denoising.
Comment: 9 pages, 8 figures
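The fine training scheme can be pictured as follows: sample with the
coarse-trained model for a fixed, small number of steps, then penalize the
sampled image directly against the ground truth. In this sketch, TinyDenoiser
and the update rule inside the loop are illustrative stand-ins, not the
paper's C2F-DFT sampler.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyDenoiser(nn.Module):
    """Stand-in noise estimator conditioned on the degraded image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Conv2d(6, 3, 3, padding=1)

    def forward(self, x, cond, t):  # t is ignored by this toy model
        return self.net(torch.cat([x, cond], dim=1))

def fine_training_step(model, degraded, clean, optimizer, steps=4):
    x = torch.randn_like(clean)       # start sampling from noise
    for t in reversed(range(steps)):  # fixed-step sampling, kept in the graph
        noise_pred = model(x, degraded, t)
        x = x - noise_pred / steps    # simplified update, not a real sampler
    loss = F.l1_loss(x, clean)        # constrain the sample with ground truth
    optimizer.zero_grad()
    loss.backward()                   # gradients flow through all sampling steps
    optimizer.step()
    return loss.item()

model = TinyDenoiser()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
degraded, clean = torch.rand(1, 3, 32, 32), torch.rand(1, 3, 32, 32)
print(fine_training_step(model, degraded, clean, opt))
```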
Reconstruct-and-Generate Diffusion Model for Detail-Preserving Image Denoising
Image denoising is a fundamental and challenging task in the field of
computer vision. Most supervised denoising methods learn to reconstruct clean
images from noisy inputs; such methods have an intrinsic spectral bias and
tend to produce over-smoothed, blurry images. Recently, researchers have explored
diffusion models to generate high-frequency details in image restoration tasks,
but these models do not guarantee that the generated texture aligns with real
images, leading to undesirable artifacts. To address the trade-off between
visual appeal and fidelity of high-frequency details in denoising tasks, we
propose a novel approach called the Reconstruct-and-Generate Diffusion Model
(RnG). Our method leverages a reconstructive denoising network to recover the
majority of the underlying clean signal, which serves as the initial estimation
for subsequent steps to maintain fidelity. Additionally, it employs a diffusion
algorithm to generate residual high-frequency details, thereby enhancing visual
quality. We further introduce a two-stage training scheme to ensure effective
collaboration between the reconstructive and generative modules of RnG. To
reduce undesirable texture introduced by the diffusion model, we also propose
an adaptive step controller that regulates the number of inverse steps applied
by the diffusion model, allowing control over the level of high-frequency
detail added to each patch while also reducing inference cost.
Through our proposed RnG, we achieve a better balance between perception and
distortion. Extensive experiments on both synthetic and real denoising
datasets validate the superiority of the proposed approach.
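A compact sketch of the reconstruct-then-generate split, including a step
controller, follows. The controller policy (more generative steps where the
reconstruction is smooth) and both stand-in networks are assumptions for
illustration; the paper's controller operates per patch with its own
criterion.

```python
import torch
import torch.nn as nn

def restore(noisy, denoiser, diffusion_step, max_steps=8):
    # Stage 1: reconstructive network recovers most of the clean signal.
    base = denoiser(noisy)
    # Adaptive step controller (illustrative): smooth estimates get more
    # generative steps to synthesize texture, detailed ones fewer, which
    # also caps inference cost.
    grad_energy = base.diff(dim=-1).abs().mean() + base.diff(dim=-2).abs().mean()
    steps = int(1 + torch.sigmoid(-10.0 * (grad_energy - 0.1)).item() * (max_steps - 1))
    # Stage 2: diffusion generates residual high-frequency detail on top.
    residual = torch.randn_like(base) * 0.1
    for t in reversed(range(steps)):
        residual = diffusion_step(residual, base, t)
    return base + residual

denoiser = nn.Conv2d(3, 3, 3, padding=1)  # stand-in reconstructor
refiner = nn.Conv2d(6, 3, 3, padding=1)   # stand-in diffusion network
step = lambda r, b, t: r - refiner(torch.cat([r, b], dim=1)) / 8
out = restore(torch.rand(1, 3, 32, 32), denoiser, step)
print(out.shape)  # (1, 3, 32, 32)
```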
Enhancing Image Quality: A Comparative Study of Spatial, Frequency Domain, and Deep Learning Methods
Image restoration and noise reduction methods have been developed to restore deteriorated images and improve their quality. These methods have gained substantial importance in recent years, mainly due to the growing use of digital imaging across diverse domains, including medical imaging, surveillance, and satellite imaging.
In this paper, we conduct a comparative analysis of three distinct approaches to image restoration: the spatial method, the frequency domain method, and the deep learning method. The study was conducted on a dataset of 10,000 images, and the performance of each method was evaluated using accuracy and loss metrics. The results show that the deep learning method outperformed the other two, achieving a validation accuracy of 72.68% after 10 epochs. The spatial method achieved a validation accuracy of 69.98% after 10 epochs, while the FFT frequency domain method had the lowest accuracy of the three at 52.87%, significantly below the other two methods. The study demonstrates that deep learning is a promising approach for image classification tasks and outperforms traditional spatial and frequency domain techniques.
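For reference, the frequency-domain approach compared above typically amounts to filtering in the Fourier domain. The snippet below is a generic low-pass FFT denoiser with an assumed cutoff, not the exact pipeline evaluated in the paper.

```python
import numpy as np

def fft_lowpass_denoise(img, cutoff=0.1):
    """img: 2-D grayscale array; cutoff: kept radius as a fraction of size."""
    spec = np.fft.fftshift(np.fft.fft2(img))  # centre the zero frequency
    h, w = img.shape
    yy, xx = np.ogrid[:h, :w]
    radius = np.hypot(yy - h / 2, xx - w / 2)
    mask = radius <= cutoff * min(h, w)       # keep only low frequencies
    return np.real(np.fft.ifft2(np.fft.ifftshift(spec * mask)))

noisy = np.random.rand(128, 128)
print(fft_lowpass_denoise(noisy).shape)  # (128, 128)
```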
PromptIR: Prompting for All-in-One Blind Image Restoration
Image restoration involves recovering a high-quality clean image from its
degraded version. Deep learning-based methods have significantly improved image
restoration performance; however, they generalize poorly to different
degradation types and levels. This restricts their real-world
application since it requires training individual models for each specific
degradation and knowing the input degradation type to apply the relevant model.
We present a prompt-based learning approach, PromptIR, for All-In-One image
restoration that can effectively restore images from various types and levels
of degradation. In particular, our method uses prompts to encode
degradation-specific information, which is then used to dynamically guide the
restoration network. This allows our method to generalize to different
degradation types and levels, while still achieving state-of-the-art results on
image denoising, deraining, and dehazing. Overall, PromptIR offers a generic
and efficient plugin module with a few lightweight prompts that can be used to
restore images of various types and levels of degradation with no prior
information on the corruptions present in the image. Our code and pretrained
models are available here: https://github.com/va1shn9v/PromptIR
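The prompt mechanism can be sketched as a small bank of learned tensors, mixed
by weights predicted from the incoming features and injected back into them.
The channel counts, the pooling-based weight predictor, and the fusion
convolution below are assumptions, not the exact PromptIR module.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PromptBlock(nn.Module):
    """Mixes learned prompt components into decoder features."""
    def __init__(self, channels=64, num_prompts=5, prompt_size=16):
        super().__init__()
        # Learnable prompt components meant to encode degradation cues.
        self.prompts = nn.Parameter(
            torch.randn(num_prompts, channels, prompt_size, prompt_size))
        self.to_weights = nn.Linear(channels, num_prompts)
        self.fuse = nn.Conv2d(2 * channels, channels, 3, padding=1)

    def forward(self, feats):
        b, c, h, w = feats.shape
        # Soft mixture over prompt components from the global feature context.
        weights = self.to_weights(feats.mean(dim=(2, 3))).softmax(dim=-1)
        prompt = torch.einsum("bn,nchw->bchw", weights, self.prompts)
        prompt = F.interpolate(prompt, size=(h, w), mode="bilinear",
                               align_corners=False)
        # Inject the prompt by concatenation followed by a fusion convolution.
        return self.fuse(torch.cat([feats, prompt], dim=1))

block = PromptBlock()
print(block(torch.randn(2, 64, 32, 32)).shape)  # torch.Size([2, 64, 32, 32])
```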
Panchromatic and multispectral image fusion for remote sensing and earth observation: Concepts, taxonomy, literature review, evaluation methodologies and challenges ahead
Panchromatic and multispectral image fusion, termed pan-sharpening, aims to merge the spatial and spectral information of the source images into a fused one that has higher spatial and spectral resolution and is more reliable for downstream tasks than any of the source images. It has been widely applied to image interpretation and pre-processing in various applications. A large number of methods have been proposed to achieve better fusion results by considering the spatial and spectral relationships among panchromatic and multispectral images. In recent years, the fast development of artificial intelligence (AI) and deep learning (DL) has significantly advanced pan-sharpening techniques. However, the field lacks a comprehensive overview of recent advances driven by the rise of AI and DL. This paper provides a comprehensive review of pan-sharpening methods across four paradigms, i.e., component substitution, multiresolution analysis, degradation models, and deep neural networks. As an important aspect of pan-sharpening, the evaluation of the fused image is also outlined, covering assessment methods for both reduced-resolution and full-resolution quality measurement. We then conclude by discussing the existing limitations, difficulties, and challenges of pan-sharpening techniques, datasets, and quality assessment, and by summarizing the development trends in these areas, which provide useful methodological guidance for researchers and professionals. The aim of the survey is to serve as a referential starting point for newcomers and a common point of agreement on the research directions to be followed in this exciting area.
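As a toy example of the component-substitution paradigm mentioned above, the snippet below injects the panchromatic band's spatial detail by swapping it for a crude intensity component of the upsampled multispectral image (an IHS-style scheme; treating the channel mean as intensity is a simplification).

```python
import numpy as np

def ihs_pansharpen(ms_up, pan):
    """ms_up: (H, W, 3) multispectral image upsampled to pan resolution,
    pan: (H, W) panchromatic band; both scaled to [0, 1]."""
    intensity = ms_up.mean(axis=2)  # crude intensity component
    detail = pan - intensity        # spatial detail missing from MS
    return np.clip(ms_up + detail[..., None], 0.0, 1.0)

ms = np.random.rand(64, 64, 3)
pan = np.random.rand(64, 64)
print(ihs_pansharpen(ms, pan).shape)  # (64, 64, 3)
```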
Robotic Burst Imaging for Light-Constrained 3D Reconstruction
This thesis proposes a novel input scheme, the robotic burst, to improve vision-based 3D reconstruction for robots operating in low-light conditions, where existing state-of-the-art robotic vision algorithms struggle due to the low signal-to-noise ratio of low-light images. We aim to improve the correspondence-search stage of feature-based reconstruction using robotic burst imaging, including burst-merged images, a burst feature finder, and an end-to-end learning-based feature extractor. First, we establish the use of robotic burst imaging to compute burst-merged images for feature-based reconstruction. We then develop a burst feature finder that locates features with well-defined scale and apparent motion within a burst, addressing limitations of burst-merged images such as misalignment under strong noise. To improve feature matching in burst-based reconstruction, we also present an end-to-end learning-based feature extractor that finds well-defined scale features directly on light-constrained bursts.
We evaluate our methods against state-of-the-art reconstruction methods for conventional imaging that use both classical and learning-based feature extractors. We validate our novel input scheme using burst imagery captured on a robotic arm and on drones. We demonstrate progressive improvements in low-light reconstruction using our burst-based methods over conventional approaches; overall, 90% of all scenes captured in millilux conditions converge with our methods, versus a 10% success rate using conventional methods. This work opens up new avenues for applications, including autonomous driving and drone delivery at night, mining, and behavioral studies of nocturnal animals.
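A minimal sketch of the burst-merging step: align each frame to the reference with a global translation estimated by phase correlation, then average. Real robotic-burst pipelines use far more robust alignment; this only illustrates the roughly sqrt(N) noise reduction that merging N frames provides.

```python
import numpy as np

def merge_burst(frames):
    """frames: list of 2-D grayscale arrays; the first frame is the reference."""
    ref = frames[0].astype(np.float64)
    acc = ref.copy()
    F_ref = np.fft.fft2(ref)
    for frame in frames[1:]:
        # Phase correlation: the cross-power spectrum peaks at the shift
        # that best aligns the frame with the reference.
        cross = F_ref * np.conj(np.fft.fft2(frame))
        corr = np.fft.ifft2(cross / (np.abs(cross) + 1e-8))
        dy, dx = np.unravel_index(np.argmax(np.abs(corr)), corr.shape)
        acc += np.roll(frame, shift=(dy, dx), axis=(0, 1))
    return acc / len(frames)

burst = [np.random.rand(64, 64) for _ in range(8)]
print(merge_burst(burst).shape)  # (64, 64)
```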