Style Transfer with Generative Adversarial Networks
This dissertation applies concepts from style transfer and image-to-image translation to the problem of defogging. Defogging (or dehazing) is the task of removing fog from an image, restoring it as if the photograph had been taken in optimal weather conditions. Defogging is of particular interest in many fields, such as surveillance or self-driving cars.
In this thesis an unpaired approach to defogging is adopted: a foggy image is translated to the corresponding clear picture without pairs of foggy and ground-truth haze-free images being available during training. This setting is particularly significant because of the difficulty of gathering an image collection of exactly the same scenes with and without fog.
Many of the models and techniques used in this dissertation already existed in the literature, but they are extremely difficult to train, and it is often highly problematic to obtain the desired behavior. Our contribution is a systematic implementation and experimental activity, conducted with the aim of attaining a comprehensive understanding of how these models work, and of the role of datasets and training procedures in the final results. We also analyzed metrics and evaluation strategies, in order to assess the quality of the presented model in the most correct and appropriate manner.
First, the feasibility of an unpaired approach to defogging was analyzed using the CycleGAN model. Then, the base model was enhanced with a cycle perceptual loss, inspired by style transfer techniques. Next, the role of the training set was investigated, showing that improving the quality of the data is at least as important as using more powerful models. Finally, our approach was compared with state-of-the-art defogging methods, showing that the quality of our results is in line with preexisting approaches, even though our model was trained on unpaired data.
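The two losses mentioned above can be sketched numerically: the cycle-consistency term penalizes the L1 distance between an image and its round-trip reconstruction, while the cycle perceptual term compares the two in a feature space instead of pixel space. Below is a minimal toy illustration in numpy; the "generators" G and F and the frozen linear "feature extractor" are hypothetical stand-ins for the CycleGAN networks and a pretrained perceptual network, not the dissertation's actual models.

```python
import numpy as np

def cycle_consistency_loss(x, x_reconstructed):
    """L1 cycle loss: with G fog->clear and F clear->fog, F(G(x)) should recover x."""
    return np.abs(x - x_reconstructed).mean()

def perceptual_distance(a, b, feature_fn):
    """Cycle perceptual loss: compare images in a feature space, not pixel space."""
    return np.square(feature_fn(a) - feature_fn(b)).mean()

# Toy stand-ins (hypothetical): scalar "generators" and a fixed random projection
# playing the role of a frozen pretrained feature extractor.
rng = np.random.default_rng(0)
W = rng.standard_normal((16, 64))
feature_fn = lambda img: W @ img.ravel()

x = rng.random((8, 8))            # a "foggy" image
G = lambda img: img * 0.9         # fog -> clear (toy)
F = lambda img: img / 0.9         # clear -> fog (toy)

x_reconstructed = F(G(x))
l_cyc = cycle_consistency_loss(x, x_reconstructed)
l_per = perceptual_distance(x, x_reconstructed, feature_fn)
# Both terms are ~0 here because this toy cycle is exact up to float error.
```

In the real model both terms are summed (with weighting coefficients) into the generator objective alongside the adversarial losses.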
Quantum defogging: temporal photon number fluctuation correlation in time-variant fog scattering medium
The conventional McCartney model simplifies fog as a scattering medium with
space-time invariance, as the time-variant nature of fog is a pure noise for
classical optical imaging. In this letter, a finding opposite to this
traditional idea is reported. The time parameter is incorporated into the
McCartney model
to account for photon number fluctuation introduced by time-variant fog. We
demonstrated that the randomness of ambient photons in the time domain results
in the absence of a stable correlation, while the scattered photons retain
one. This difference can be measured by photon number fluctuation
correlation when two conditions are met. A defogging image is reconstructed
from the target's information carried by scattering light. Thus, the noise
introduced by time-variant fog is eliminated by itself. Distinguishable images
can be obtained even when the target is indistinguishable by conventional
cameras, providing a prerequisite for subsequent high-level computer vision
tasks.
Comment: 6 pages, 9 figures
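For reference, the conventional (time-invariant) McCartney model that this letter extends describes a foggy observation as a depth-dependent blend of scene radiance and airlight: I = J·t + A·(1 − t), with transmission t = exp(−β·d). A minimal numpy sketch of that standard formulation (toy values; this does not include the letter's time-variant extension):

```python
import numpy as np

def transmission(depth, beta):
    """t(x) = exp(-beta * d(x)): fraction of scene light reaching the camera."""
    return np.exp(-beta * depth)

def foggy_image(J, depth, A, beta):
    """McCartney scattering model: I = J * t + A * (1 - t)."""
    t = transmission(depth, beta)
    return J * t + A * (1.0 - t)

J = np.array([0.2, 0.5, 0.8])       # clear scene radiance per pixel
depth = np.array([1.0, 5.0, 20.0])  # per-pixel depth (toy values)
A, beta = 1.0, 0.1                  # airlight intensity, scattering coefficient

I = foggy_image(J, depth, A, beta)
# Distant pixels are washed out toward the airlight A; near pixels stay close to J.
```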
Non-aligned supervision for Real Image Dehazing
Removing haze from real-world images is challenging due to unpredictable
weather conditions, resulting in misaligned hazy and clear image pairs. In this
paper, we propose a non-aligned supervision framework that consists of three
networks - dehazing, airlight, and transmission. In particular, we explore a
non-alignment setting by utilizing a clear reference image that is not aligned
with the hazy input image to supervise the dehazing network through a
multi-scale reference loss that compares the features of the two images. Our
setting makes it easier to collect hazy/clear image pairs in real-world
environments, even under conditions of misalignment and shift views. To
demonstrate this, we have created a new hazy dataset called "Phone-Hazy", which
was captured using mobile phones in both rural and urban areas. Additionally,
we present a mean and variance self-attention network to model the infinite
airlight using dark channel prior as position guidance, and employ a channel
attention network to estimate the three-channel transmission. Experimental
results show that our framework outperforms current state-of-the-art methods in
real-world image dehazing. Phone-Hazy and the code will be available at
https://github.com/hello2377/NSDNet
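Once the airlight A and transmission t are estimated by the two auxiliary networks described above, the haze model I = J·t + A·(1 − t) can be inverted in closed form to recover the scene radiance. A minimal numpy sketch of that standard inversion (the lower clamp t0 is a common safeguard against noise amplification in dense haze, not a detail taken from this paper):

```python
import numpy as np

def dehaze(I, A, t, t0=0.1):
    """Invert I = J*t + A*(1-t) for the scene radiance J.

    t is clamped from below by t0 so that dividing by a near-zero
    transmission does not blow up noise in densely hazy regions.
    """
    t = np.maximum(t, t0)
    return (I - A) / t + A

# Round trip: synthesize a hazy signal, then recover the clear one.
J = np.array([0.2, 0.5, 0.8])
t = np.array([0.9, 0.5, 0.3])
A = 1.0
I = J * t + A * (1.0 - t)
J_hat = dehaze(I, A, t)   # matches J, since t stays above the clamp
```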
STEREOFOG -- Computational DeFogging via Image-to-Image Translation on a real-world Dataset
Image-to-Image translation (I2I) is a subtype of Machine Learning (ML) that
has tremendous potential in applications where two image domains exist along
with a need to translate between them, such as the removal of fog. For
example, this could be useful for autonomous vehicles, which currently struggle
with adverse weather conditions like fog. However, datasets for I2I tasks are
not abundant and typically hard to acquire. Here, we introduce STEREOFOG, a
dataset comprised of paired fogged and clear images, captured using a
custom-built device, with the purpose of exploring I2I's potential in this
domain. It is the only real-world dataset of this kind to the best of our
knowledge. Furthermore, we apply and optimize the pix2pix I2I ML framework to
this dataset. With the final model achieving an average Complex
Wavelet-Structural Similarity (CW-SSIM) score of , we prove the
technique's suitability for the problem.
Comment: 7 pages, 7 figures; for the associated dataset and Supplement file, see
https://github.com/apoll2000/stereofo
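CW-SSIM, the metric reported above, extends the Structural Similarity index to the complex-wavelet domain, which makes it robust to small shifts and rotations. The underlying SSIM comparison of luminance, contrast, and structure can be sketched in a few lines; the version below uses global image statistics rather than the local sliding window usually reported, so it illustrates the formula but is not CW-SSIM itself:

```python
import numpy as np

def ssim_global(x, y, c1=0.01**2, c2=0.03**2):
    """Global-statistics SSIM for images with values in [0, 1].

    Compares mean luminance, variance (contrast), and covariance (structure);
    c1 and c2 are the usual small stabilizing constants.
    """
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx**2 + my**2 + c1) * (vx + vy + c2))

rng = np.random.default_rng(0)
img = rng.random((32, 32))
noisy = np.clip(img + 0.1 * rng.standard_normal((32, 32)), 0.0, 1.0)
# An image compared with itself scores 1; noise pushes the score below 1.
```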
De-smokeGCN: Generative Cooperative Networks for Joint Surgical Smoke Detection and Removal
Surgical smoke removal algorithms can improve the quality of intra-operative imaging and reduce hazards in image-guided surgery, a highly desirable post-process for many clinical applications. These algorithms also enable effective computer vision tasks for future robotic surgery. In this paper, we present a new unsupervised learning framework for high-quality pixel-wise smoke detection and removal. One of the well-recognized grand challenges in using convolutional neural networks (CNNs) for medical image processing is obtaining intra-operative medical imaging datasets for network training and validation, but such datasets are scarce and of limited quality. Our novel training framework does not require ground-truth image pairs. Instead, it learns purely from computer-generated simulation images. This approach opens up new avenues and bridges a substantial gap between conventional non-learning-based methods and those requiring prior knowledge gained from extensive training datasets. Inspired by the Generative Adversarial Network (GAN), we have developed a novel generative-collaborative learning scheme that decomposes the de-smoke process into two separate tasks: smoke detection and smoke removal. The detection network is used as prior knowledge, and also as a loss function to maximize its support for training the smoke removal network. Quantitative and qualitative studies show that the proposed training framework outperforms state-of-the-art de-smoking approaches, including the latest GAN framework (such as PIX2PIX). Although trained on synthetic images, experimental results on clinical images demonstrate the effectiveness of the proposed network in detecting and removing surgical smoke on both simulated and real-world laparoscopic images.
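The detection-as-loss idea above can be illustrated with a mask-weighted reconstruction loss, in which the detection network's per-pixel smoke probability up-weights errors inside smoky regions so the removal network concentrates on them. This numpy sketch is an illustrative weighting scheme under that assumption, not the paper's exact formulation:

```python
import numpy as np

def masked_removal_loss(pred, target, smoke_mask, alpha=4.0):
    """L1 reconstruction loss, up-weighted where the detector sees smoke.

    smoke_mask holds per-pixel smoke probabilities in [0, 1];
    alpha controls how much extra weight smoky pixels receive.
    """
    weights = 1.0 + alpha * smoke_mask
    return (weights * np.abs(pred - target)).mean() / weights.mean()

pred = np.array([[0.5, 0.5], [0.5, 0.5]])
target = np.array([[0.5, 0.4], [0.5, 0.5]])
mask_on = np.array([[0.0, 1.0], [0.0, 0.0]])   # detector flags the erroneous pixel
mask_off = np.zeros((2, 2))                     # detector sees no smoke

# The same pixel error costs more when it falls inside detected smoke,
# which is how the detection network can steer training of the removal network.
```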