    An Enhancement in Single-Image Dehazing Employing Contrastive Attention over Variational Auto-Encoder (CA-VAE) Method

    Hazy images and videos suffer from low contrast and poor visibility. Fog, ice fog, steam fog, smoke, volcanic ash, dust, and snow all degrade captured images, distorting color and reducing contrast. Such degradation often causes computer vision applications to fail: skewed color contrast and low visibility hamper photometric analysis, object identification, and target tracking. Image haze reduction algorithms allow computer programs to classify and interpret degraded images, and deep learning approaches now dominate image dehazing. The proposed study is motivated by the observed negative correlation between scene depth and the difference between a hazy image's maximum and minimum color channels. Using a contrastive attention mechanism spanning sub-pixels and blocks, we offer a novel attention method to create high-quality, haze-free pictures, and we propose the L*a*b* color model as an effective color space for dehazing images. A variational auto-encoder-based dehazing network is used for training, since it compresses input images and attempts to reconstruct them, avoiding the explicit estimation of the hundreds of characteristics that can affect an image. In the variational auto-encoder, hazy input images are directly assigned a Gaussian probability distribution, and the network estimates the distribution's parameters. Quantitative and qualitative experiments on the RESIDE dataset demonstrate the accuracy and robustness of the proposed method; RESIDE's synthetic and real-world single-image dehazing subsets are used for training and evaluation, and the proposed method enhances the structural similarity index measure (SSIM) and peak signal-to-noise ratio (PSNR) metrics.
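
    The Gaussian modeling step described above corresponds to the standard variational auto-encoder formulation: the encoder predicts a mean and log-variance for each input, and a latent code is drawn via the reparameterization trick. The following is a minimal PyTorch sketch of that step; the layer sizes and module names are illustrative assumptions, not details taken from the paper.

        import torch
        import torch.nn as nn

        class HazyEncoder(nn.Module):
            """Toy encoder: maps an image to the mean and log-variance of a Gaussian latent."""
            def __init__(self, in_dim=3 * 64 * 64, latent_dim=128):
                super().__init__()
                self.backbone = nn.Sequential(nn.Flatten(), nn.Linear(in_dim, 512), nn.ReLU())
                self.fc_mu = nn.Linear(512, latent_dim)      # distribution mean
                self.fc_logvar = nn.Linear(512, latent_dim)  # distribution log-variance

            def forward(self, x):
                h = self.backbone(x)
                mu, logvar = self.fc_mu(h), self.fc_logvar(h)
                # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I)
                z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
                return z, mu, logvar

    The decoder then reconstructs a haze-free image from z, and the training loss combines a reconstruction term with a KL divergence that keeps the predicted distributions close to the standard Gaussian prior.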

    Deraining and Desnowing Algorithm on Adaptive Tolerance and Dual-tree Complex Wavelet Fusion

    Severe weather conditions such as rain and snow often reduce the visual quality of video imaging systems, and traditional deraining and desnowing methods rarely use adaptive parameters. To enhance video deraining and desnowing, this paper proposes an algorithm based on adaptive tolerance and the dual-tree complex wavelet transform. The algorithm can be widely applied in security surveillance, military defense, biological monitoring, remote sensing, and other fields. First, the paper introduces the adaptive tolerance method for video of dynamic scenes. Second, it analyzes the dual-tree complex wavelet fusion algorithm: low-frequency sub-bands are fused with a principal component analysis (PCA) rule, while high-frequency sub-bands are fused with a local energy matching rule. Finally, a variety of rain and snow videos are used to verify the validity and superiority of the image reconstruction. Experimental results show that the algorithm improves image clarity and restores image details obscured by raindrops and snowflakes.
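
    To make the two fusion rules concrete, here is a small NumPy sketch of how PCA weighting and local-energy matching are commonly applied to a pair of sub-bands. The wavelet decomposition itself (e.g., via a dual-tree complex wavelet package) is assumed to have already produced the sub-band arrays, and the window size is an illustrative choice rather than a value from the paper.

        import numpy as np
        from scipy.ndimage import uniform_filter

        def fuse_low_pca(low_a, low_b):
            """PCA fusion rule: weight the two low-frequency bands by the dominant eigenvector."""
            data = np.stack([low_a.ravel(), low_b.ravel()])
            eigvals, eigvecs = np.linalg.eigh(np.cov(data))
            w = np.abs(eigvecs[:, np.argmax(eigvals)])  # principal component
            w = w / w.sum()
            return w[0] * low_a + w[1] * low_b

        def fuse_high_energy(high_a, high_b, win=3):
            """Local energy matching: keep the coefficient whose neighborhood has more energy."""
            e_a = uniform_filter(np.abs(high_a) ** 2, size=win)
            e_b = uniform_filter(np.abs(high_b) ** 2, size=win)
            return np.where(e_a >= e_b, high_a, high_b)

    Fusing low frequencies by weighted averaging preserves the scene's overall brightness structure, while the energy-based selection on high frequencies keeps the sharper of the two detail coefficients at each location.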

    Image Enhancement via Deep Spatial and Temporal Networks

    Image enhancement is a classic problem in computer vision and has been studied for decades. It includes subtasks such as super-resolution, image deblurring, rain removal, and denoising. Among these, image deblurring and rain removal have become increasingly active, as they play an important role in areas such as autonomous driving, video surveillance, and mobile applications. The two tasks are also connected: blur and rain often degrade images simultaneously, and the performance of their removal relies on spatial and temporal learning. To help generate sharp images and videos, this thesis proposes efficient algorithms based on deep neural networks for image deblurring and rain removal.

    The first part of the thesis studies image deblurring and proposes four deep learning methods. First, for single-image deblurring, a new framework is presented that first learns to transfer sharp images to realistic blurry images via a learning-to-blur Generative Adversarial Network (GAN) module, and then trains a learning-to-deblur GAN module to generate sharp images from blurry versions. In contrast to prior work that focuses solely on learning to deblur, the proposed method learns to realistically synthesize blurring effects using unpaired sharp and blurry images. Second, for video deblurring, spatio-temporal learning and adversarial training are used to recover sharp, realistic frames from blurry inputs: 3D convolutional kernels on top of deep residual networks capture spatio-temporal features, and the network is trained with both a content loss and an adversarial loss to drive it toward realistic frames. Third, the problem of extracting a sharp image sequence from a single motion-blurred image is tackled with a detail-aware cascaded generator that handles ambiguity, subtle motion, and loss of detail. Finally, the thesis proposes a level-attention deblurring network and constructs a new large-scale dataset of images blurred by various factors, which is used to evaluate current deep deblurring methods alongside the proposed one.

    The second part of the thesis studies image deraining and proposes three deep learning methods. First, for single-image deraining, the joint removal of raindrops and rain streaks is tackled: in contrast to most prior work, which focuses on either raindrops or rain streaks alone, a dual attention-in-attention model removes both simultaneously. Second, for video deraining, a novel end-to-end framework obtains spatial representations and temporal correlations from ResNet-based and LSTM-based architectures, respectively; the proposed method generates multiple derained frames at a time and outperforms state-of-the-art methods in both quality and speed. Finally, for stereo image deraining, a deep stereo semantic-aware deraining network is proposed for the first time in computer vision. Unlike previous methods that learn only from pixel-level loss functions or monocular information, the proposed network advances image deraining by leveraging semantic information and the visual deviation between the two views.
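
    The content-plus-adversarial objective mentioned for the video deblurring model is a common GAN training recipe. Below is a minimal PyTorch sketch of such a combined loss; the loss weighting and the generator/discriminator stubs are illustrative assumptions, not the thesis's exact configuration.

        import torch
        import torch.nn as nn

        mse = nn.MSELoss()            # content loss on pixels
        bce = nn.BCEWithLogitsLoss()  # adversarial loss on discriminator logits

        def generator_loss(generator, discriminator, blurry, sharp, adv_weight=0.01):
            """Combined objective: match the ground truth and fool the discriminator."""
            restored = generator(blurry)
            content = mse(restored, sharp)
            logits = discriminator(restored)
            # Label the restored frames as "real" so the generator is pushed toward realism.
            adversarial = bce(logits, torch.ones_like(logits))
            return content + adv_weight * adversarial

    The content term anchors the output to the ground-truth frames, while the small adversarial term discourages the over-smoothed results that a pixel loss alone tends to produce.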

    An Information-theoretic analysis of generative adversarial networks for image restoration in physics-based vision

    Image restoration in physics-based vision (such as image denoising, dehazing, and deraining) comprises fundamental computer vision tasks of great significance both for processing visual data and for subsequent applications in different fields. Existing methods mainly explore the physical properties and mechanisms of the imaging process and tend to use a deconstructive approach, describing how visual degradations (like noise, haze, and rain) are combined with the background scene. This approach, however, relies heavily on manually engineered features and handcrafted composition models, which may hold only in ideal conditions, involve human bias, or fail to simulate real situations in practice. With the progress of representation learning, generative methods, especially generative adversarial networks (GANs), are considered a more promising solution for image restoration tasks. They learn restoration directly as an end-to-end generation process from large amounts of data, without modeling the physical mechanisms, and they can complete missing details and damaged information by incorporating external knowledge, generating plausible results with intelligent interpretation and semantic-level understanding of the input images. Nevertheless, existing attempts to apply GAN models to image restoration do not achieve satisfactory performance compared with traditional deconstructive methods, and there is scarcely any study or theory explaining how deep generative models behave in these tasks.

    In this study, we analyze the learning dynamics of different deep generative models through the information bottleneck principle and propose an information-theoretic framework to explain generative methods for image restoration. We study the information flow in image restoration models and identify three sources of information involved in generating the restored result: (i) high-level information extracted by the encoder network; (ii) low-level information from the source inputs that is retained or passed directly through skip connections; and (iii) external information introduced by the learned parameters of the decoder network during generation. Based on this theory, we argue that conventional GAN models may not be directly applicable to image restoration, and we identify three key issues behind their performance gap: (i) over-invested abstraction, (ii) inherent loss of details, and (iii) imbalanced optimization with vanishing gradients. We formulate these problems with corresponding theoretical analyses and provide empirical evidence verifying each hypothesis. To address them, we propose solutions and suggestions, including optimized network structures, network modules that enhance detail extraction and accumulation, and alternative training objectives, to improve the performance of GAN models on image restoration. Finally, we verify our solutions on benchmark datasets and achieve significant improvements over the baseline models.
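
    For reference, the information bottleneck principle that the analysis builds on trades off compressing the input against preserving task-relevant information. Quoted here in its standard textbook Lagrangian form, not as an equation from the paper:

        \min_{p(t \mid x)} \; I(X;T) \;-\; \beta \, I(T;Y)

    where X is the input, T the learned internal representation, Y the target, I(\cdot\,;\cdot) denotes mutual information, and \beta controls how much task-relevant information is retained relative to the pressure to compress. The three information sources identified in the abstract map naturally onto this trade-off: the encoder compresses I(X;T), skip connections bypass the bottleneck, and the decoder's parameters inject information not present in X.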

    LEARNING FROM INCOMPLETE AND HETEROGENEOUS DATA

    Deep convolutional neural networks (DCNNs) have delivered impressive performance improvements for object detection and recognition. However, the vast majority of DCNN-based recognition methods are designed around two key assumptions: 1) all categories are known a priori, and 2) training and test data are drawn from similar distributions. In many real-world applications these assumptions do not hold, which limits the generalization capability of a recognition model. Generally, only incomplete knowledge of the world is available at training time, and unknown classes can be submitted to an algorithm during testing. A visual system trained under the assumption that all categories are known a priori would fail to identify such unknown classes at test time. Ideally, a visual recognition system should reject samples from unknown classes while classifying samples from known classes. In this thesis, we adopt this constraint and evaluate visual recognition systems under two problem settings: one-class and multi-class novelty detection. In the one-class setting, the goal is to learn a visual recognition system from a single category and reject samples of any other category as unknown during testing. In the multi-class setting, the system learns from multiple categories and rejects as unknown any sample that does not belong to the training category set. Experiments on multiple benchmark datasets show that the proposed recognition systems outperform existing approaches. Furthermore, we recognize that in many real-world conditions the training and testing data distributions differ, which significantly degrades the performance of a visual recognition system. This is commonly referred to as dataset bias or domain shift and can be addressed using domain adaptation. In particular, we address unsupervised domain adaptation, in which an additional set of unlabeled data sampled from a particular domain is used to improve performance in that domain. Experiments on multiple domain adaptation benchmarks show that the proposed strategy generalizes better than existing methods in the literature.
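
    As a concrete illustration of the one-class protocol described above, a recognition system can be reduced to a scoring function plus a rejection threshold chosen on the known-class training data. The scoring function in this NumPy sketch (distance to the mean feature of the training class) is a deliberately simple stand-in for the thesis's learned models, and all names and values are illustrative.

        import numpy as np

        def fit_one_class(features):
            """Fit a trivial one-class model: the mean of known-class feature vectors."""
            return features.mean(axis=0)

        def novelty_scores(center, features):
            """Score samples by distance to the known-class center (higher = more novel)."""
            return np.linalg.norm(features - center, axis=1)

        # Toy known-class training features and mixed known/unknown test features.
        rng = np.random.default_rng(0)
        train = rng.normal(0.0, 1.0, size=(500, 64))
        test_known = rng.normal(0.0, 1.0, size=(100, 64))
        test_unknown = rng.normal(3.0, 1.0, size=(100, 64))

        center = fit_one_class(train)
        tau = np.percentile(novelty_scores(center, train), 95)  # accept 95% of known data
        scores = novelty_scores(center, np.vstack([test_known, test_unknown]))
        is_novel = scores > tau  # True = rejected as unknown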

    DEEP LEARNING-BASED APPROACHES FOR IMAGE RESTORATION

    Image restoration is the operation of taking a corrupted or degraded low-quality image and estimating a high-quality clean image free of degradations. The most common degradations affecting image quality are blur, atmospheric turbulence, adverse weather conditions (such as rain, haze, and snow), and noise. Images captured under these corruptions can significantly impair subsequent computer vision algorithms such as segmentation, recognition, object detection, and tracking. With such algorithms becoming vital components of applications like autonomous navigation and video surveillance, it is increasingly important to develop sophisticated algorithms that remove these degradations and recover high-quality clean images. These needs have motivated a plethora of research on single-image restoration. Recently, following the success of deep learning-based convolutional neural networks, many approaches have been proposed to remove degradations from corrupted images. We study the following single-image restoration problems: (i) atmospheric turbulence removal, (ii) deblurring, (iii) removal of distortions introduced by adverse weather such as rain, haze, and snow, and (iv) denoising. However, existing single-image restoration techniques suffer from three major limitations: (i) they construct global priors without accounting for the fact that degradations can affect different local regions of an image differently; (ii) they train on synthetic datasets, which often yields sub-optimal performance on real-world images because of the distribution shift between synthetic and real-world degraded images; and (iii) existing semi-supervised approaches do not account for the effect of unlabeled, real-world degraded images on semi-supervised performance. To address the first limitation, we propose supervised image restoration techniques that use uncertainty to improve restoration performance. To overcome the second, we propose a Gaussian process-based pseudo-labeling approach that leverages real-world rain information to train the deraining network in a semi-supervised fashion. To address the third, we theoretically study the effect of unlabeled images on semi-supervised performance and propose an adaptive rejection technique to boost it. Finally, recognizing that existing supervised and semi-supervised methods require some form of paired labeled data, and that training on synthetic clean-degraded pairs may not close the domain gap to real-world degraded image distributions, we propose a self-supervised transformer-based approach for image denoising: given a noisy image, we generate multiple down-sampled images and use a Gaussian process to learn the joint relation between them to denoise the image.
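
    The down-sampling step in the self-supervised denoiser can be illustrated with a simple pixel sub-sampling scheme: each non-overlapping 2x2 cell of the noisy image contributes one pixel to each of four half-resolution views, giving sibling images whose clean content largely agrees while the noise realizations differ. This NumPy sketch is a generic illustration of that idea, not the thesis's exact sampler.

        import numpy as np

        def subsample_views(noisy):
            """Split an HxW noisy image into four half-resolution views,
            one per pixel position inside each non-overlapping 2x2 cell."""
            h, w = noisy.shape[0] // 2 * 2, noisy.shape[1] // 2 * 2
            img = noisy[:h, :w]
            return [img[i::2, j::2] for i in range(2) for j in range(2)]

        rng = np.random.default_rng(1)
        clean = rng.uniform(size=(64, 64))
        noisy = clean + rng.normal(0.0, 0.1, size=clean.shape)
        views = subsample_views(noisy)  # four 32x32 sibling images of the same scene

    Because the underlying scene is shared across views while the noise is (approximately) independent, relations learned between the views carry signal rather than noise, which is what makes the paired-label-free training possible.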