738 research outputs found
Recent Progress in Image Deblurring
This paper comprehensively reviews the recent development of image
deblurring, including non-blind/blind, spatially invariant/variant deblurring
techniques. Indeed, these techniques share the same objective of inferring a
latent sharp image from one or several corresponding blurry images, while the
blind deblurring techniques are also required to derive an accurate blur
kernel. Considering the critical role of image restoration in modern imaging
systems to provide high-quality images under complex environments such as
motion, undesirable lighting conditions, and imperfect system components, image
deblurring has attracted growing attention in recent years. From the viewpoint
of how to handle the ill-posedness which is a crucial issue in deblurring
tasks, existing methods can be grouped into five categories: Bayesian inference
framework, variational methods, sparse representation-based methods,
homography-based modeling, and region-based methods. Despite considerable
progress, image deblurring, especially the blind case, remains limited by
complex application conditions that make the blur kernel difficult to
estimate and often spatially variant. We provide a holistic
understanding and deep insight into image deblurring in this review. An
analysis of the empirical evidence for representative methods and practical
issues, as well as a discussion of promising future directions, is also
presented.

Comment: 53 pages, 17 figures
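None of the surveyed methods is reproduced here, but the ill-posedness the review centres on is easy to demonstrate in the simplest non-blind, spatially invariant setting. The sketch below is a generic Tikhonov/Wiener-regularized inverse filter in the Fourier domain (not code from any cited paper); the box-blur PSF and the `reg` weight are arbitrary choices for the demo.

```python
import numpy as np
from numpy.fft import fft2, ifft2

def psf_to_otf(kernel, shape):
    """Zero-pad and centre a PSF so its FFT matches circular convolution."""
    psf = np.zeros(shape)
    kh, kw = kernel.shape
    psf[:kh, :kw] = kernel
    psf = np.roll(psf, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return fft2(psf)

def wiener_deblur(blurry, kernel, reg=1e-2):
    """Non-blind deconvolution with a Tikhonov/Wiener-style regularizer.

    The `reg` term damps frequencies where the blur kernel is nearly zero,
    which is exactly where the ill-posedness of deblurring shows up.
    """
    K = psf_to_otf(kernel, blurry.shape)
    X = np.conj(K) * fft2(blurry) / (np.abs(K) ** 2 + reg)
    return np.real(ifft2(X))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sharp = rng.random((64, 64))              # stand-in latent image
    kernel = np.ones((5, 5)) / 25.0           # simple box-blur PSF
    blurry = np.real(ifft2(psf_to_otf(kernel, sharp.shape) * fft2(sharp)))
    restored = wiener_deblur(blurry, kernel, reg=1e-3)
```

Without the regularizer (reg = 0), the division amplifies noise at frequencies the kernel suppresses, which is the failure mode the Bayesian, variational, and sparsity-based families of methods are all designed to control.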
Focusing on out-of-focus: assessing defocus estimation algorithms for the benefit of automated image masking
Acquiring photographs as input for an image-based modelling pipeline is less trivial than often assumed. Photographs should be correctly exposed, cover the subject sufficiently from all possible angles, have the required spatial resolution, be devoid of any motion blur, exhibit accurate focus and feature an adequate depth of field. The last four characteristics all determine the "sharpness" of an image, and the photogrammetric, computer vision and hybrid photogrammetric computer vision communities all assume that the object to be modelled is depicted "acceptably" sharp throughout the whole image collection. Although none of these three fields has ever properly quantified "acceptably sharp", it is more or less standard practice to mask those image portions that appear to be unsharp due to the limited depth of field around the plane of focus (whether this means blurry object parts or completely out-of-focus backgrounds). This paper assesses how well- or ill-suited defocus estimation algorithms are for automatically masking a series of photographs, since this could speed up modelling pipelines with many hundreds or thousands of photographs. To that end, the paper uses five different real-world datasets and compares the output of three state-of-the-art edge-based defocus estimators. Afterwards, critical comments and plans for the future finalise this paper.
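The three edge-based defocus estimators compared in the paper are not reproduced here. As a rough, hypothetical baseline for what "automated image masking" can look like, the sketch below thresholds local Laplacian energy to flag defocused regions; the window size and threshold are arbitrary demo values, not the paper's settings.

```python
import numpy as np
from scipy.ndimage import laplace, uniform_filter

def sharpness_mask(gray, window=31, threshold=1e-3):
    """Crude local-sharpness mask: high local Laplacian energy ~ in focus.

    gray      : 2-D float image in [0, 1]
    window    : size of the local averaging window (pixels)
    threshold : energy level below which a pixel is masked as defocused
    """
    energy = uniform_filter(laplace(gray) ** 2, size=window)
    return energy > threshold   # True = keep, False = mask out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((128, 128))          # stand-in grayscale photograph
    mask = sharpness_mask(img, window=31, threshold=1e-3)
    print(mask.mean())                    # fraction deemed "acceptably sharp"
```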
DMTNet: Dynamic Multi-scale Network for Dual-pixel Images Defocus Deblurring with Transformer
Recent works achieve excellent results in the defocus deblurring task based on
dual-pixel data using convolutional neural networks (CNN), while the scarcity of
data limits the exploration of vision transformers in this task. In
addition, existing works use fixed parameters and a fixed network architecture
to deblur images with different distributions and content, which limits the
generalization ability of the model. In this paper, we propose a
dynamic multi-scale network, named DMTNet, for dual-pixel image defocus
deblurring. DMTNet mainly contains two modules: a feature extraction module and
a reconstruction module. The feature extraction module is composed of several
vision transformer blocks, whose strong feature extraction capability yields
richer features and improves the robustness of the model.
The reconstruction module is composed of several Dynamic Multi-scale
Sub-reconstruction Modules (DMSSRM). DMSSRM restores images by adaptively
assigning weights to features from different scales according to the blur
distribution and content information of the input images. DMTNet combines the
advantages of transformer and CNN, in which the vision transformer improves the
performance ceiling of CNN, and the inductive bias of CNN enables transformer
to extract more robust features without relying on a large amount of data.
DMTNet might be the first attempt to use a vision transformer to restore
blurred images to clarity. By combining with a CNN, the vision transformer can
achieve better performance on small datasets. Experimental results on the
popular benchmarks demonstrate that our DMTNet significantly outperforms
state-of-the-art methods.
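The abstract does not spell out the internals of DMSSRM, so the following PyTorch snippet is only a speculative illustration of what "adaptively assigning weights to features from different scales" could mean: a pooled, input-dependent softmax weighting over per-scale features. The class name, weight head, and fusion rule are assumptions, not the paper's design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveMultiScaleFusion(nn.Module):
    """Toy stand-in for a dynamic multi-scale sub-reconstruction block:
    features from several scales are fused with input-dependent weights,
    so the blending adapts to the blur and content of each image."""

    def __init__(self, channels, num_scales=3):
        super().__init__()
        self.num_scales = num_scales
        # Predict one weight per scale from globally pooled features.
        self.weight_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels * num_scales, num_scales, kernel_size=1),
        )

    def forward(self, feats):
        # feats: list of num_scales tensors, each (B, C, H, W) at full resolution.
        stacked = torch.cat(feats, dim=1)                  # (B, C*S, H, W)
        w = F.softmax(self.weight_head(stacked), dim=1)    # (B, S, 1, 1)
        fused = sum(w[:, i:i + 1] * feats[i] for i in range(self.num_scales))
        return fused

if __name__ == "__main__":
    block = AdaptiveMultiScaleFusion(channels=32, num_scales=3)
    feats = [torch.randn(2, 32, 64, 64) for _ in range(3)]
    print(block(feats).shape)   # torch.Size([2, 32, 64, 64])
```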
LDP: Language-driven Dual-Pixel Image Defocus Deblurring Network
Recovering sharp images from dual-pixel (DP) pairs with disparity-dependent
blur is a challenging task. Existing blur-map-based deblurring methods have
demonstrated promising results. In this paper, we propose, to the best of our
knowledge, the first framework to introduce the contrastive language-image
pre-training framework (CLIP) to achieve accurate blur map estimation from DP
pairs in an unsupervised manner. To this end, we first carefully design text prompts to
enable CLIP to understand blur-related geometric prior knowledge from the DP
pair. Then, we propose a format for feeding the stereo DP pair into CLIP, which
is pre-trained on monocular images, without any fine-tuning. Given the
estimated blur map, we introduce a blur-prior attention block, a blur-weighting
loss and a blur-aware loss to recover the all-in-focus image. Our method
achieves state-of-the-art performance in extensive experiments.
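LDP's actual prompts, DP input format, and blur-prior attention are not shown here. As a rough illustration of the underlying idea of querying CLIP for blur cues, the sketch below scores a single patch against two hand-written sharp/blurry prompts with an off-the-shelf CLIP model from Hugging Face `transformers`; the prompts, checkpoint, and grey placeholder patch are assumptions for the demo.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Zero-shot "blurriness" scoring of an image patch with off-the-shelf CLIP.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

prompts = ["a sharp, in-focus photo", "a blurry, out-of-focus photo"]
patch = Image.new("RGB", (224, 224), color=(128, 128, 128))  # stand-in patch

inputs = processor(text=prompts, images=patch, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image        # shape (1, 2)
blur_prob = logits.softmax(dim=-1)[0, 1].item()      # P("blurry") for the patch
print(f"blur score: {blur_prob:.3f}")
```

Applying such a score patch-wise over a DP pair would yield a coarse blur map; the paper's contribution lies in how the stereo DP views are formatted for CLIP and how the resulting map drives the deblurring network, neither of which this toy snippet attempts.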
Learnable Blur Kernel for Single-Image Defocus Deblurring in the Wild
Recent research has shown that dual-pixel sensors enable great progress in
defocus map estimation and image defocus deblurring. However, extracting
dual-pixel views in real time is troublesome and complicates algorithm deployment.
Moreover, the deblurred image generated by the defocus deblurring network lacks
high-frequency details, which is unsatisfactory to human perception. To
overcome these issues, we propose a novel defocus deblurring method that uses the
guidance of the defocus map to perform image deblurring. The proposed method
consists of a learnable blur kernel that estimates the defocus map in an
unsupervised manner and, for the first time, a single-image defocus deblurring
generative adversarial network (DefocusGAN). The proposed network can
learn the deblurring of different regions and recover realistic details. We
propose a defocus adversarial loss to guide this training process. Competitive
experimental results confirm that with a learnable blur kernel, the generated
defocus map can achieve results comparable to supervised methods. In the
single-image defocus deblurring task, the proposed method achieves
state-of-the-art results, especially significant improvements in perceptual
quality, where PSNR reaches 25.56 dB and LPIPS reaches 0.111.

Comment: 9 pages, 7 figures
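The paper's learnable kernel itself is not specified in the abstract; the sketch below only illustrates the generic forward model such defocus-map-guided methods build on: a parametric disc (circle-of-confusion) PSF whose radius follows a defocus map, applied by blending a few discretely blurred copies. The blur levels and nearest-level blending are illustrative choices, not the paper's formulation.

```python
import numpy as np
from scipy.signal import fftconvolve

def disc_kernel(radius):
    """Disc (circle-of-confusion) PSF for defocus blur of a given radius."""
    r = max(int(np.ceil(radius)), 1)
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    k = (x ** 2 + y ** 2 <= radius ** 2).astype(float)
    return k / k.sum()

def reblur_with_defocus_map(sharp, defocus_map, levels=(1, 2, 4, 8)):
    """Spatially varying defocus blur: blur the image at a few discrete
    radii and pick, per pixel, the level closest to the defocus map value."""
    blurred = np.stack([fftconvolve(sharp, disc_kernel(r), mode="same")
                        for r in levels])
    idx = np.abs(defocus_map[None] - np.array(levels)[:, None, None]).argmin(0)
    return np.take_along_axis(blurred, idx[None], axis=0)[0]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sharp = rng.random((96, 96))
    # Toy defocus map: blur grows from left (sharp) to right (defocused).
    defocus = np.tile(np.linspace(1, 8, 96), (96, 1))
    out = reblur_with_defocus_map(sharp, defocus)
    print(out.shape)
```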
Zero-Shot Defocus Deblurring Based on Dual-Pixel Images
Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Interdisciplinary Program in Artificial Intelligence, August 2022.

Defocus deblurring in dual-pixel (DP) images is a challenging problem due to diverse camera optics and scene structures. Most of the existing algorithms rely on supervised learning approaches trained on the Canon DSLR dataset but often suffer from weak generalizability to out-of-distribution images, including those captured by smartphones. We propose a novel zero-shot defocus deblurring algorithm, which only requires a pair of DP images, without any training data or a pre-calibrated ground-truth blur kernel. Specifically, our approach first initializes a sharp latent map using a parametric blur kernel with a symmetry constraint. It then uses a convolutional neural network (CNN) to estimate the defocus map that best describes the observed DP image. Finally, it employs a generative model to learn scene-specific non-uniform blur kernels to compute the final enhanced images. We demonstrate that the proposed unsupervised technique outperforms counterparts based on supervised learning when training and testing run on different datasets. We also show that our model achieves competitive accuracy when tested on in-distribution data.

1. Introduction 6
1.1. Background 6
1.2. Overview 9
1.3. Contribution 11
2. Related Works 12
2.1. Defocus Deblurring 12
2.2. Defocus Map 13
2.3. Multiplane Image Representation 14
2.4. DP Blur Kernel 14
3. Proposed Methods 16
3.1. Latent Map Initialization 17
3.2. Defocus Map Estimation 20
3.3. Learning Blur Kernels 22
3.4. Implementation Details 25
4. Experiments 28
4.1. Dataset 28
4.2. Quantitative Results 29
4.3. Qualitative Results 31
5. Conclusions 37
5.1. Summary 37
5.2. Discussion 38
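The thesis abstract above outlines a pipeline built from a symmetric parametric DP blur kernel, a CNN-estimated defocus map, and learned scene-specific kernels. As a loose, assumption-laden illustration of the first ingredient only, the sketch below pairs mirrored half-disc dual-pixel kernels with a simple reblur-consistency data term; the half-disc shape, L2 loss, and fixed radius are toy choices, not the thesis's actual model.

```python
import numpy as np
from scipy.signal import fftconvolve

def dp_half_disc_kernels(radius):
    """Symmetric dual-pixel PSF pair: the left/right sub-aperture views see
    mirrored halves of a disc-shaped circle of confusion."""
    r = max(int(np.ceil(radius)), 1)
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    disc = (x ** 2 + y ** 2 <= radius ** 2).astype(float)
    left, right = disc * (x <= 0), disc * (x >= 0)
    return left / left.sum(), right / right.sum()

def dp_reblur_loss(latent, dp_left, dp_right, radius):
    """How well a candidate latent image explains the observed DP pair when
    re-blurred with the symmetric kernel pair (simple L2 data term)."""
    k_l, k_r = dp_half_disc_kernels(radius)
    pred_l = fftconvolve(latent, k_l, mode="same")
    pred_r = fftconvolve(latent, k_r, mode="same")
    return np.mean((pred_l - dp_left) ** 2) + np.mean((pred_r - dp_right) ** 2)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    latent = rng.random((64, 64))
    k_l, k_r = dp_half_disc_kernels(3.0)
    dp_l = fftconvolve(latent, k_l, mode="same")
    dp_r = fftconvolve(latent, k_r, mode="same")
    print(dp_reblur_loss(latent, dp_l, dp_r, radius=3.0))  # ~0 for the true latent
```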
…