Image-to-Image Translation with Conditional Adversarial Networks
We investigate conditional adversarial networks as a general-purpose solution
to image-to-image translation problems. These networks not only learn the
mapping from input image to output image, but also learn a loss function to
train this mapping. This makes it possible to apply the same generic approach
to problems that traditionally would require very different loss formulations.
We demonstrate that this approach is effective at synthesizing photos from
label maps, reconstructing objects from edge maps, and colorizing images, among
other tasks. Indeed, since the release of the pix2pix software associated with
this paper, a large number of internet users (many of them artists) have posted
their own experiments with our system, further demonstrating its wide
applicability and ease of adoption without the need for parameter tweaking. As
a community, we no longer hand-engineer our mapping functions, and this work
suggests we can achieve reasonable results without hand-engineering our loss
functions either.

Comment: Website: https://phillipi.github.io/pix2pix/, CVPR 2017
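The pix2pix objective described above combines a conditional adversarial term with an L1 reconstruction term (the paper's default weighting is lambda = 100). A minimal NumPy sketch of the generator's combined loss; the function names and array stand-ins are illustrative, not the authors' code:

```python
import numpy as np

def bce(pred, target):
    # Binary cross-entropy on discriminator probabilities.
    eps = 1e-12
    return -np.mean(target * np.log(pred + eps)
                    + (1 - target) * np.log(1 - pred + eps))

def pix2pix_generator_loss(d_fake, fake_img, target_img, lam=100.0):
    """Adversarial term plus lam * L1, as in the pix2pix objective.

    d_fake:     discriminator probabilities for generated (input, output) pairs
    fake_img:   generator output
    target_img: ground-truth output image
    """
    adv = bce(d_fake, np.ones_like(d_fake))        # fool the discriminator
    l1 = np.mean(np.abs(fake_img - target_img))    # low-frequency correctness
    return adv + lam * l1
```

The L1 term keeps outputs close to the ground truth; the adversarial term supplies the "learned loss" that penalizes unrealistic structure.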
GAN with Skip Patch Discriminator for Biological Electron Microscopy Image Generation
GAN models have been used successfully for image generation across many domains of real-life objects, such as human faces, cars, animal faces, and landscapes. This work focuses on biological electron microscopy (EM) image generation. Unlike images of everyday objects, biological EM images are obtained through electron microscopy techniques to study biological specimens. Electron microscopy offers high resolution and magnification, making it a powerful tool for visualizing biological structures at the nanoscale. However, using GAN models for biological EM image generation is challenging: the complex, unique arrangements of biological structures and the sparse, asymmetrical patterns in EM images make it difficult for a model to generate realistic images accurately. A patch-based GAN discriminator cannot access the global and local structures of the generated image simultaneously. A patch discriminator with a small receptive field (16x16 patches) captures precise local structures but struggles to represent global structure accurately. Conversely, a patch discriminator with a large receptive field (70x70 patches) captures global structure effectively but often fails to reproduce detailed local textures. To address these challenges, I propose a new discriminator architecture, called a skip patch discriminator, for training GAN models in settings with limited data and with both global and local structure, to generate realistic biological EM images.
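The patch sizes quoted above follow from the standard receptive-field recurrence over stacked convolutions. A small sketch: the 70x70 stack below matches the widely documented pix2pix PatchGAN (three stride-2 and two stride-1 4x4 convolutions); the smaller stack is one assumed configuration that reaches a 16x16 field, not necessarily the one used in this work:

```python
def receptive_field(layers):
    """Effective receptive field of a stack of conv layers.

    layers: list of (kernel_size, stride) tuples, input to output.
    Uses the recurrence r_out = r_in + (k - 1) * j, with jump j *= stride.
    """
    r, j = 1, 1
    for k, s in layers:
        r += (k - 1) * j
        j *= s
    return r

# 70x70 PatchGAN: three stride-2 then two stride-1 4x4 convs.
print(receptive_field([(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]))  # 70
# A shallower stack reaching a 16x16 patch (assumed configuration).
print(receptive_field([(4, 2), (4, 1), (4, 1)]))  # 16
```

Each output unit of the discriminator classifies one such patch as real or fake, which is why the field size governs the local-versus-global trade-off described in the abstract.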
Deep learning based synthesis of MRI, CT and PET: Review and analysis
Medical image synthesis represents a critical area of research in clinical decision-making, aiming to overcome the challenges associated with acquiring multiple image modalities for an accurate clinical workflow. This approach proves beneficial in estimating an image of a desired modality from a given source modality among the most common medical imaging contrasts, such as Computed Tomography (CT), Magnetic Resonance Imaging (MRI), and Positron Emission Tomography (PET). However, translating between two image modalities presents difficulties due to the complex and non-linear domain mappings. Deep learning-based generative modelling has exhibited superior performance in synthetic image contrast applications compared to conventional image synthesis methods. This survey comprehensively reviews deep learning-based medical imaging translation from 2018 to 2023 on pseudo-CT, synthetic MR, and synthetic PET. We provide an overview of synthetic contrasts in medical imaging and the most frequently employed deep learning networks for medical image synthesis. Additionally, we conduct a detailed analysis of each synthesis method, focusing on their diverse model designs based on input domains and network architectures. We also analyse novel network architectures, ranging from conventional CNNs to the recent Transformer and Diffusion models. This analysis includes comparing loss functions, available datasets and anatomical regions, and image quality assessments and performance in other downstream tasks. Finally, we discuss the challenges and identify solutions within the literature, suggesting possible future directions. We hope that the insights offered in this survey paper will serve as a valuable roadmap for researchers in the field of medical image synthesis.
An Information-theoretic analysis of generative adversarial networks for image restoration in physics-based vision
Image restoration in physics-based vision (such as image denoising, dehazing, and deraining) comprises fundamental computer vision tasks that are critical both for processing visual data and for subsequent applications in many fields. Existing methods mainly explore the physical properties and mechanisms of the imaging process, and tend to take a deconstructive view of how visual degradations (such as noise, haze, and rain) combine with the background scene. This view, however, relies heavily on manually engineered features and handcrafted composition models, which may hold only under ideal conditions, or rest on hypothetical assumptions that involve human bias or fail to simulate real situations in practice. With the progress of representation learning, generative methods, especially generative adversarial networks (GANs), are considered a more promising solution for image restoration. They learn restoration directly as an end-to-end generation process from large amounts of data, without modelling the physical mechanisms, and they can complete missing details and recover damaged information by drawing on external knowledge, generating plausible results with an intelligent-level interpretation and semantics-level understanding of the input images. Nevertheless, existing attempts to apply GAN models to image restoration do not achieve satisfactory performance compared with traditional deconstructive methods, and there is scarcely any study or theory explaining how deep generative models work in these tasks.
In this study, we analyse the learning dynamics of different deep generative models through the information bottleneck principle and propose an information-theoretic framework to explain generative methods for image restoration. Within this framework, we study the information flow in image restoration models and identify three sources of information involved in generating the restored result: (i) high-level information extracted by the encoder network, (ii) low-level information from the source inputs that is retained or passed directly through the skip connections, and (iii) external information introduced by the learned parameters of the decoder network during the generation process.
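The three information sources can be made concrete with a toy encoder-decoder carrying one skip connection. Everything below (shapes, random weights, activation choices) is an illustrative assumption, not the model analysed in the study:

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(x, w_enc):
    # (i) high-level information: a compressed (bottlenecked)
    # representation of the degraded input.
    return np.tanh(w_enc @ x)

def decoder(z, skip, w_dec, b_dec):
    # (iii) external information: the learned parameters w_dec, b_dec
    # inject knowledge acquired from training data.
    up = np.tanh(w_dec @ z) + b_dec
    # (ii) low-level information: the skip connection passes source
    # details directly to the output, bypassing the bottleneck.
    return up + skip

x = rng.normal(size=8)                  # degraded input (flattened)
w_enc = rng.normal(size=(2, 8)) * 0.1   # bottleneck: 8 -> 2
w_dec = rng.normal(size=(8, 2)) * 0.1
b_dec = rng.normal(size=8) * 0.1

z = encoder(x, w_enc)
restored = decoder(z, skip=x, w_dec=w_dec, b_dec=b_dec)
```

Dropping the `skip` argument removes source (ii) entirely, which is one way to see why bottleneck-only generators lose fine detail in restoration tasks.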
Based on this theory, we argue that conventional GAN models may not be directly applicable to image restoration, and we identify three key issues behind their performance gap on these tasks: (i) over-invested abstraction processes, (ii) inherent loss of details, and (iii) imbalanced optimization with vanishing gradients. We formulate each problem with a corresponding theoretical analysis and provide empirical evidence to verify our hypotheses and demonstrate that these problems exist.
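Issue (iii) can be illustrated numerically with the standard GAN analysis (this is textbook material, not code from the study): under the saturating generator loss log(1 - D(G(z))), the gradient with respect to the discriminator's logit is -sigmoid(logit), which vanishes exactly when the discriminator confidently rejects generated samples, while the non-saturating variant -log D(G(z)) keeps a strong signal:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Discriminator logit for a generated sample it confidently rejects:
# D(G(z)) = sigmoid(-8) is about 3e-4.
logit = -8.0

# Saturating loss log(1 - D(G(z))): d/dlogit = -sigmoid(logit).
grad_saturating = -sigmoid(logit)
# Non-saturating loss -log D(G(z)): d/dlogit = sigmoid(logit) - 1.
grad_non_saturating = sigmoid(logit) - 1.0

print(grad_saturating)      # near zero: almost no learning signal
print(grad_non_saturating)  # near -1: strong signal even when D dominates
```

When the restoration generator starts far from the target distribution, the discriminator wins early, and this vanishing gradient stalls training, one motivation for replacing the training-objective measures, as discussed below.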
To address these problems, we propose solutions and suggestions, including optimizing the network structure, enhancing detail extraction and accumulation with dedicated network modules, and replacing the measures used in the training objectives, to improve the performance of GAN models on image restoration tasks. Finally, we validate our solutions on benchmark datasets and achieve significant improvements over the baseline models.
