UGC: Unified GAN Compression for Efficient Image-to-Image Translation
Recent years have witnessed remarkable progress of Generative Adversarial
Networks (GANs) in image-to-image translation. However, the success of these
GAN models hinges on heavy computational costs and labor-intensive training
data. Current efficient GAN learning techniques typically fall into two
orthogonal aspects: i) model slimming via reduced calculation costs;
ii) data/label-efficient learning with fewer training data/labels. To combine
the best of both worlds, we propose a new learning paradigm, Unified GAN
Compression (UGC), with a unified optimization objective that seamlessly
prompts the synergy of model-efficient and label-efficient learning. UGC
sequentially sets up a semi-supervised-driven network architecture search
stage and an adaptive online semi-supervised distillation stage, which
together formulate a heterogeneous mutual learning scheme to obtain an
architecture-flexible, label-efficient, and high-performing model.
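As a rough illustration of the distillation stage, a semi-supervised objective can combine a supervised loss on the few labeled samples with a teacher-matching loss on unlabeled ones. The sketch below operates on classifier logits; the function names, the cross-entropy stand-in for UGC's actual GAN losses, and the weight `lam` are all illustrative assumptions, not the paper's formulation:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def kl(p, q, eps=1e-8):
    # KL divergence between categorical distributions, row-wise.
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)

def ugc_style_loss(student_labeled, labels, student_unlabeled,
                   teacher_unlabeled, lam=1.0):
    # Supervised term on the few labeled samples (cross-entropy stand-in).
    p = softmax(student_labeled)
    sup = -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-8))
    # Distillation term on unlabeled samples: match the teacher's soft
    # predictions, so unlabeled data still provides a training signal.
    distill = np.mean(kl(softmax(teacher_unlabeled),
                         softmax(student_unlabeled)))
    return sup + lam * distill
```

When teacher and student agree on the unlabeled pool, the distillation term vanishes and only the supervised term remains.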
Information-Theoretic GAN Compression with Variational Energy-based Model
We propose an information-theoretic knowledge distillation approach for the
compression of generative adversarial networks, which aims to maximize the
mutual information between teacher and student networks via variational
optimization based on an energy-based model. Because the direct computation of
mutual information in continuous domains is intractable, our approach instead
optimizes the student network by maximizing a variational lower bound of the
mutual information. To achieve a tight lower bound, we introduce an
energy-based model, built on a deep neural network, that represents a flexible
variational distribution which handles high-dimensional images and effectively
captures spatial dependencies between pixels. Since the proposed method is a
generic optimization algorithm, it can be conveniently incorporated into
arbitrary generative adversarial networks and even dense prediction networks,
e.g., image enhancement models. We demonstrate that the proposed algorithm
consistently achieves outstanding performance in model compression of
generative adversarial networks when combined with several existing models.
Comment: Accepted at NeurIPS 202
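The variational trick here can be illustrated with the classic Barber-Agakov bound, I(T;S) >= H(T) + E[log q(t|s)]: since H(T) does not depend on the student, maximizing the bound reduces to maximizing the expected log-likelihood of teacher features under a variational distribution q conditioned on student features. The sketch below substitutes a unit-variance Gaussian q on toy feature vectors for the paper's energy-based model; all names, shapes, and the least-squares fit are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "features": student vectors s and correlated teacher vectors t
# (in the paper these would be intermediate GAN feature maps).
s = rng.normal(size=(256, 4))
t = s @ rng.normal(size=(4, 4)) + 0.1 * rng.normal(size=(256, 4))

def gaussian_log_q(t, mean):
    # log q(t | s) for a unit-variance Gaussian variational distribution.
    return -0.5 * np.sum((t - mean) ** 2 + np.log(2 * np.pi), axis=-1)

def mi_lower_bound(s, t, W):
    # Barber-Agakov: I(T; S) >= H(T) + E[log q(t | s)].  H(T) is constant
    # w.r.t. the student, so only the expected log-likelihood is estimated.
    return np.mean(gaussian_log_q(t, s @ W))

# Closed-form maximizer of the Gaussian bound: least-squares fit of t from s.
W_fit, *_ = np.linalg.lstsq(s, t, rcond=None)
```

A tighter family of q (such as the paper's energy-based model) makes the bound tighter; the Gaussian here is the simplest choice that keeps the idea visible.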
CycleGANAS: Differentiable Neural Architecture Search for CycleGAN
We develop a Neural Architecture Search (NAS) framework for CycleGAN, which
carries out the unpaired image-to-image translation task. Extending previous
NAS techniques for Generative Adversarial Networks (GANs) to CycleGAN is not
straightforward due to the task difference and the greater search space. We
design architectures that consist of a stack of simple ResNet-based cells and
develop a search method that effectively explores the large search space. We
show that our framework, called CycleGANAS, not only discovers
high-performance architectures that match or surpass the performance of the
original CycleGAN, but also successfully addresses data imbalance by searching
an individual architecture for each translation direction. To the best of our
knowledge, this is the first NAS result for CycleGAN, and it sheds light on
NAS for more complex structures.
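The "differentiable" part of such a search typically follows the DARTS-style continuous relaxation: each edge of a cell computes a softmax-weighted mixture of candidate operations, so architecture weights can be trained by gradient descent alongside network weights. A minimal sketch (the candidate op list and variable names are illustrative, not CycleGANAS's actual search space):

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

def mixed_op(x, alphas, ops):
    # Continuous relaxation: the edge output is a softmax-weighted sum over
    # candidate ops; `alphas` are the learnable architecture parameters.
    w = softmax(alphas)
    return sum(wi * op(x) for wi, op in zip(w, ops))

# Stand-in candidate set (real cells would use conv variants, skip, none, ...).
ops = [
    lambda x: x,                   # identity / skip connection
    lambda x: np.maximum(x, 0.0),  # ReLU as a stand-in for a conv op
    lambda x: np.zeros_like(x),    # "none" (prunes the edge)
]
```

After the search, the op with the largest architecture weight on each edge is kept, yielding a discrete architecture.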
StarGAN-v2 compression using knowledge distillation
Image-to-image translation is used in a broad variety of machine vision and computer graphics applications. These involve mapping grey-scale images to RGB images, deblurring of images, style transfer, and transfiguring objects, to name a few. In addressing complex image-to-image translation issues, Generative Adversarial Networks (GANs) are at the forefront. StarGAN-v2 is a state-of-the-art method for multi-modal multi-domain image-to-image translation that produces different images from a single input image over multiple domains. However, at a parameter count of more than 50M, StarGAN-v2 has a computation bottleneck and consumes more than 60G MACs (Multiply-Accumulate Operations, used to calculate computation expense; 1 MAC = 2 FLOPs) to create one 256×256 image, preventing its widespread adoption. This thesis focuses on the task of compressing StarGAN-v2 using knowledge distillation. Using depthwise separable convolutional layers and reduced channels for intermediate layers, we develop efficient architectures for different StarGAN-v2 modules. In a GAN mini-max optimization environment, the efficient networks are trained with a combination of different distillation losses along with the original objective of StarGAN-v2. Without losing image quality, we reduce the size of the original framework by more than 20× and the computation requirement by more than 5×. The feasibility of the proposed approach is demonstrated by experiments on CelebA-HQ and AFHQ datasets.
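The parameter savings from depthwise separable convolutions are easy to quantify: a standard k×k convolution needs k·k·C_in·C_out weights, while the depthwise-plus-pointwise factorization needs only k·k·C_in + C_in·C_out. A quick sketch of the arithmetic (the layer sizes below are illustrative, not StarGAN-v2's actual configuration):

```python
def conv_params(k, c_in, c_out):
    # Standard convolution: one k x k kernel per (input, output) channel pair.
    return k * k * c_in * c_out

def separable_params(k, c_in, c_out):
    # Depthwise step: one k x k kernel per input channel.
    # Pointwise step: a 1 x 1 convolution mixing channels.
    return k * k * c_in + c_in * c_out

# Example: a 3x3 layer with 256 input and 256 output channels.
standard = conv_params(3, 256, 256)        # 589,824 weights
separable = separable_params(3, 256, 256)  # 67,840 weights
```

For this example layer the factorization cuts parameters by roughly 8.7×, which is why it is a common first move when compressing GAN generators.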