    UGC: Unified GAN Compression for Efficient Image-to-Image Translation

    Recent years have witnessed substantial progress by Generative Adversarial Networks (GANs) in image-to-image translation. However, the success of these GAN models hinges on heavy computational costs and labor-expensive training data. Current efficient GAN learning techniques typically fall into two orthogonal categories: i) model slimming via reduced calculation costs; ii) data/label-efficient learning with fewer training data/labels. To combine the best of both worlds, we propose a new learning paradigm, Unified GAN Compression (UGC), with a unified optimization objective to seamlessly promote the synergy of model-efficient and label-efficient learning. UGC sets up semi-supervised-driven network architecture search and adaptive online semi-supervised distillation stages sequentially, which formulates a heterogeneous mutual learning scheme to obtain an architecture-flexible, label-efficient, and high-performing model.

    Information-Theoretic GAN Compression with Variational Energy-based Model

    We propose an information-theoretic knowledge distillation approach for the compression of generative adversarial networks, which aims to maximize the mutual information between teacher and student networks via a variational optimization based on an energy-based model. Because the direct computation of mutual information in continuous domains is intractable, our approach instead optimizes the student network by maximizing a variational lower bound on the mutual information. To achieve a tight lower bound, we introduce an energy-based model relying on a deep neural network to represent a flexible variational distribution that handles high-dimensional images and effectively captures spatial dependencies between pixels. Since the proposed method is a generic optimization algorithm, it can be conveniently incorporated into arbitrary generative adversarial networks and even dense prediction networks, e.g., image enhancement models. We demonstrate that, when combined with several existing models, the proposed algorithm consistently achieves outstanding performance in compressing generative adversarial networks.
    Comment: Accepted at NeurIPS 202
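The variational lower bound mentioned in this abstract typically takes the standard Barber–Agakov form, in which an auxiliary distribution $q$ (here, the one represented by the energy-based model) stands in for the intractable posterior. A sketch under that assumption, with $\mathbf{t}$ and $\mathbf{s}$ denoting teacher and student representations:

```latex
I(\mathbf{t};\mathbf{s})
  = H(\mathbf{t}) - H(\mathbf{t}\mid\mathbf{s})
  \;\geq\; H(\mathbf{t})
    + \mathbb{E}_{p(\mathbf{t},\mathbf{s})}\!\left[\log q(\mathbf{t}\mid\mathbf{s})\right]
```

The gap in the inequality is $\mathbb{E}_{p(\mathbf{s})}\!\left[\mathrm{KL}\big(p(\mathbf{t}\mid\mathbf{s})\,\|\,q(\mathbf{t}\mid\mathbf{s})\big)\right] \geq 0$, so a more flexible $q$ tightens the bound, which is the stated motivation for parameterizing $q$ with an energy-based model rather than a simple fixed family.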

    CycleGANAS: Differentiable Neural Architecture Search for CycleGAN

    We develop a Neural Architecture Search (NAS) framework for CycleGAN that carries out the unpaired image-to-image translation task. Extending previous NAS techniques for Generative Adversarial Networks (GANs) to CycleGAN is not straightforward due to the task difference and the larger search space. We design architectures consisting of a stack of simple ResNet-based cells and develop a search method that effectively explores the large search space. We show that our framework, called CycleGANAS, not only discovers high-performance architectures that match or surpass the performance of the original CycleGAN, but also successfully addresses data imbalance by searching an individual architecture for each translation direction. To the best of our knowledge, this is the first NAS result for CycleGAN, and it sheds light on NAS for more complex structures.

    StarGAN-v2 compression using knowledge distillation

    Image-to-image translation is used in a broad variety of machine vision and computer graphics applications. These include mapping grey-scale images to RGB images, deblurring images, style transfer, and object transfiguration, to name a few. In addressing complex image-to-image translation problems, Generative Adversarial Networks (GANs) are at the forefront. StarGAN-v2 is a state-of-the-art method for multi-modal, multi-domain image-to-image translation that produces different images from a single input image over multiple domains. However, at a parameter count of more than 50M, StarGAN-v2 has a computation bottleneck and consumes more than 60G MACs (multiply-accumulate operations, used to measure computational expense; 1 MAC = 2 FLOPs) to create one 256×256 image, preventing its widespread adoption. This thesis focuses on compressing StarGAN-v2 using knowledge distillation. Using depthwise separable convolutional layers and reduced channel counts for intermediate layers, we develop efficient architectures for the different StarGAN-v2 modules. In a GAN mini-max optimization environment, the efficient networks are trained with a combination of different distillation losses along with the original StarGAN-v2 objective. Without losing image quality, we reduce the size of the original framework by more than 20× and the computation requirement by more than 5×. The feasibility of the proposed approach is demonstrated by experiments on the CelebA-HQ and AFHQ datasets.
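The savings from depthwise separable convolutions can be made concrete with simple parameter and MAC counting. A minimal sketch below; the layer shapes are illustrative assumptions, not taken from the actual StarGAN-v2 architecture:

```python
# Parameter and MAC counts: standard conv vs. a depthwise separable
# replacement (depthwise k x k conv followed by a 1 x 1 pointwise conv).
# Biases are omitted for simplicity.

def conv_params(k, c_in, c_out):
    """Weight count of a standard k x k convolution."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Weight count of a k x k depthwise conv plus a 1x1 pointwise conv."""
    return k * k * c_in + c_in * c_out

def conv_macs(params, h, w):
    """MACs to produce one h x w output feature map (1 MAC = 2 FLOPs)."""
    return params * h * w

# Illustrative layer: 3x3 conv, 256 -> 256 channels, on a 64x64 map.
std = conv_params(3, 256, 256)                  # 589,824 weights
sep = depthwise_separable_params(3, 256, 256)   # 67,840 weights
print(f"standard: {std}, separable: {sep}, reduction: {std / sep:.1f}x")

macs = conv_macs(std, 64, 64)
print(f"MACs: {macs / 1e9:.2f}G  ->  {2 * macs / 1e9:.2f} GFLOPs")
```

For this example layer the separable variant uses roughly 8.7× fewer weights, which illustrates why combining this substitution with narrower intermediate channels can yield the >20× size and >5× compute reductions the thesis reports.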