630 research outputs found

    Depth Estimation and Image Restoration by Deep Learning from Defocused Images

    Full text link
    Monocular depth estimation and image deblurring are two fundamental tasks in computer vision, given their crucial role in understanding 3D scenes. Performing any of them by relying on a single image is an ill-posed problem. The recent advances in the field of Deep Convolutional Neural Networks (DNNs) have revolutionized many tasks in computer vision, including depth estimation and image deblurring. When it comes to using defocused images, the depth estimation and the recovery of the All-in-Focus (Aif) image become related problems due to defocus physics. Despite this, most of the existing models treat them separately. There are, however, recent models that solve these problems simultaneously by concatenating two networks in a sequence to first estimate the depth or defocus map and then reconstruct the focused image based on it. We propose a DNN that solves the depth estimation and image deblurring in parallel. Our Two-headed Depth Estimation and Deblurring Network (2HDED:NET) extends a conventional Depth from Defocus (DFD) networks with a deblurring branch that shares the same encoder as the depth branch. The proposed method has been successfully tested on two benchmarks, one for indoor and the other for outdoor scenes: NYU-v2 and Make3D. Extensive experiments with 2HDED:NET on these benchmarks have demonstrated superior or close performances to those of the state-of-the-art models for depth estimation and image deblurring

    End-to-end Alternating Optimization for Real-World Blind Super Resolution

    Full text link
    Blind Super-Resolution (SR) usually involves two sub-problems: 1) estimating the degradation of the given low-resolution (LR) image; 2) super-resolving the LR image to its high-resolution (HR) counterpart. Both problems are ill-posed due to the information loss in the degrading process. Most previous methods try to solve the two problems independently, but often fall into a dilemma: a good super-resolved HR result requires an accurate degradation estimation, which however, is difficult to be obtained without the help of original HR information. To address this issue, instead of considering these two problems independently, we adopt an alternating optimization algorithm, which can estimate the degradation and restore the SR image in a single model. Specifically, we design two convolutional neural modules, namely \textit{Restorer} and \textit{Estimator}. \textit{Restorer} restores the SR image based on the estimated degradation, and \textit{Estimator} estimates the degradation with the help of the restored SR image. We alternate these two modules repeatedly and unfold this process to form an end-to-end trainable network. In this way, both \textit{Restorer} and \textit{Estimator} could get benefited from the intermediate results of each other, and make each sub-problem easier. Moreover, \textit{Restorer} and \textit{Estimator} are optimized in an end-to-end manner, thus they could get more tolerant of the estimation deviations of each other and cooperate better to achieve more robust and accurate final results. Extensive experiments on both synthetic datasets and real-world images show that the proposed method can largely outperform state-of-the-art methods and produce more visually favorable results. The codes are rleased at \url{https://github.com/greatlog/RealDAN.git}.Comment: Extension of our previous NeurIPS paper. Accepted to IJC

    Learning A Coarse-to-Fine Diffusion Transformer for Image Restoration

    Full text link
    Recent years have witnessed the remarkable performance of diffusion models in various vision tasks. However, for image restoration that aims to recover clear images with sharper details from given degraded observations, diffusion-based methods may fail to recover promising results due to inaccurate noise estimation. Moreover, simple constraining noises cannot effectively learn complex degradation information, which subsequently hinders the model capacity. To solve the above problems, we propose a coarse-to-fine diffusion Transformer (C2F-DFT) for image restoration. Specifically, our C2F-DFT contains diffusion self-attention (DFSA) and diffusion feed-forward network (DFN) within a new coarse-to-fine training scheme. The DFSA and DFN respectively capture the long-range diffusion dependencies and learn hierarchy diffusion representation to facilitate better restoration. In the coarse training stage, our C2F-DFT estimates noises and then generates the final clean image by a sampling algorithm. To further improve the restoration quality, we propose a simple yet effective fine training scheme. It first exploits the coarse-trained diffusion model with fixed steps to generate restoration results, which then would be constrained with corresponding ground-truth ones to optimize the models to remedy the unsatisfactory results affected by inaccurate noise estimation. Extensive experiments show that C2F-DFT significantly outperforms diffusion-based restoration method IR-SDE and achieves competitive performance compared with Transformer-based state-of-the-art methods on 33 tasks, including deraining, deblurring, and real denoising.Comment: 9 pages, 8 figure

    Reconstruct-and-Generate Diffusion Model for Detail-Preserving Image Denoising

    Full text link
    Image denoising is a fundamental and challenging task in the field of computer vision. Most supervised denoising methods learn to reconstruct clean images from noisy inputs, which have intrinsic spectral bias and tend to produce over-smoothed and blurry images. Recently, researchers have explored diffusion models to generate high-frequency details in image restoration tasks, but these models do not guarantee that the generated texture aligns with real images, leading to undesirable artifacts. To address the trade-off between visual appeal and fidelity of high-frequency details in denoising tasks, we propose a novel approach called the Reconstruct-and-Generate Diffusion Model (RnG). Our method leverages a reconstructive denoising network to recover the majority of the underlying clean signal, which serves as the initial estimation for subsequent steps to maintain fidelity. Additionally, it employs a diffusion algorithm to generate residual high-frequency details, thereby enhancing visual quality. We further introduce a two-stage training scheme to ensure effective collaboration between the reconstructive and generative modules of RnG. To reduce undesirable texture introduced by the diffusion model, we also propose an adaptive step controller that regulates the number of inverse steps applied by the diffusion model, allowing control over the level of high-frequency details added to each patch as well as saving the inference computational cost. Through our proposed RnG, we achieve a better balance between perception and distortion. We conducted extensive experiments on both synthetic and real denoising datasets, validating the superiority of the proposed approach

    Enhancing Image Quality: A Comparative Study of Spatial, Frequency Domain, and Deep Learning Methods

    Get PDF
    Image restoration and noise reduction methods have been created to restore deteriorated images and improve their quality. These methods have garnered substantial significance in recent times, mainly due to the growing utilization of digital imaging across diverse domains, including but not limited to medical imaging, surveillance, satellite imaging, and numerous others. In this paper, we conduct a comparative analysis of three distinct approaches to image restoration: the spatial method, the frequency domain method, and the deep learning method. The study was conducted on a dataset of 10,000 images, and the performance of each method was evaluated using the accuracy and loss metrics. The results show that the deep learning method outperformed the other two methods, achieving a validation accuracy of 72.68% after 10 epochs. The spatial method had the lowest accuracy of the three, achieving a validation accuracy of 69.98% after 10 epochs. The FFT frequency domain method had a validation accuracy of 52.87% after 10 epochs, significantly lower than the other two methods. The study demonstrates that deep learning is a promising approach for image classification tasks and outperforms traditional methods such as spatial and frequency domain techniques

    PromptIR: Prompting for All-in-One Blind Image Restoration

    Full text link
    Image restoration involves recovering a high-quality clean image from its degraded version. Deep learning-based methods have significantly improved image restoration performance, however, they have limited generalization ability to different degradation types and levels. This restricts their real-world application since it requires training individual models for each specific degradation and knowing the input degradation type to apply the relevant model. We present a prompt-based learning approach, PromptIR, for All-In-One image restoration that can effectively restore images from various types and levels of degradation. In particular, our method uses prompts to encode degradation-specific information, which is then used to dynamically guide the restoration network. This allows our method to generalize to different degradation types and levels, while still achieving state-of-the-art results on image denoising, deraining, and dehazing. Overall, PromptIR offers a generic and efficient plugin module with few lightweight prompts that can be used to restore images of various types and levels of degradation with no prior information on the corruptions present in the image. Our code and pretrained models are available here: https://github.com/va1shn9v/PromptI

    Panchromatic and multispectral image fusion for remote sensing and earth observation: Concepts, taxonomy, literature review, evaluation methodologies and challenges ahead

    Get PDF
    Panchromatic and multispectral image fusion, termed pan-sharpening, is to merge the spatial and spectral information of the source images into a fused one, which has a higher spatial and spectral resolution and is more reliable for downstream tasks compared with any of the source images. It has been widely applied to image interpretation and pre-processing of various applications. A large number of methods have been proposed to achieve better fusion results by considering the spatial and spectral relationships among panchromatic and multispectral images. In recent years, the fast development of artificial intelligence (AI) and deep learning (DL) has significantly enhanced the development of pan-sharpening techniques. However, this field lacks a comprehensive overview of recent advances boosted by the rise of AI and DL. This paper provides a comprehensive review of a variety of pan-sharpening methods that adopt four different paradigms, i.e., component substitution, multiresolution analysis, degradation model, and deep neural networks. As an important aspect of pan-sharpening, the evaluation of the fused image is also outlined to present various assessment methods in terms of reduced-resolution and full-resolution quality measurement. Then, we conclude this paper by discussing the existing limitations, difficulties, and challenges of pan-sharpening techniques, datasets, and quality assessment. In addition, the survey summarizes the development trends in these areas, which provide useful methodological practices for researchers and professionals. Finally, the developments in pan-sharpening are summarized in the conclusion part. The aim of the survey is to serve as a referential starting point for newcomers and a common point of agreement around the research directions to be followed in this exciting area

    Robotic Burst Imaging for Light-Constrained 3D Reconstruction

    Get PDF
    This thesis proposes a novel input scheme, robotic burst, to improve vision-based 3D reconstruction for robots operating in low-light conditions, where existing state-of-the-art robotic vision algorithms struggle due to low signal-to-noise ratio in low-light images. We aim to improve the correspondence search stage of feature-based reconstruction using robotic burst imaging, including burst-merged images, a burst feature finder, and an end-to-end learning-based feature extractor. Firstly, we establish the use of robotic burst imaging to compute burst-merged images for feature-based reconstruction. We then develop a burst feature finder that locates features with well-defined scale and apparent motion on a burst to deal with limitations of burst-merged images such as misalignment at strong noise. To improve feature matches in burst-based reconstruction, we also present an end-to-end learning-based feature extractor that finds well-defined scale features directly on light-constrained bursts. We evaluate our methods against state-of-the-art reconstruction methods for conventional imaging that uses both classical and learning-based feature extractors. We validate our novel input scheme using burst imagery captured on a robotic arm and drones. We demonstrate progressive improvements in low-light reconstruction using our burst-based methods against conventional approaches and overall, converging 90% of all scenes captured in millilux conditions that otherwise converge with 10% success rate using conventional methods. This work opens up new avenues for applications, including autonomous driving and drone delivery at night, mining, and behavioral studies on nocturnal animals
    corecore