    AdaWCT: Adaptive Whitening and Coloring Style Injection

    Adaptive instance normalization (AdaIN) has become the standard method for style injection: by re-normalizing features through scale-and-shift operations, it has found widespread use in style transfer, image generation, and image-to-image translation. In this work, we present AdaWCT, a generalization of AdaIN based on the whitening and coloring transformation (WCT), which we apply to style injection in large GANs. We show, through experiments on the StarGANv2 architecture, that this generalization, albeit conceptually simple, results in significant improvements in the quality of the generated images.
    Comment: 4 pages + ref
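
As a rough illustration of the two operations named in this abstract, the NumPy sketch below contrasts per-channel scale-and-shift (AdaIN) with a whitening-and-coloring step that acts on the full channel covariance. It is not the authors' implementation: the function names, the `(C, N)` feature layout, the `eps` regularizer, and the idea of passing the coloring matrix in directly are all assumptions made for the example. The abstract does not say how the coloring matrix is obtained.

```python
import numpy as np

def adain(x, scale, shift, eps=1e-5):
    """AdaIN-style injection: normalize each channel, then scale and shift.

    x: (C, N) features (C channels, N spatial positions).
    scale, shift: (C,) style parameters (assumed to be given).
    """
    mu = x.mean(axis=1, keepdims=True)
    sigma = x.std(axis=1, keepdims=True)
    return scale[:, None] * (x - mu) / (sigma + eps) + shift[:, None]

def whiten(x, eps=1e-5):
    """Remove the mean and decorrelate channels so the covariance is ~identity."""
    xc = x - x.mean(axis=1, keepdims=True)
    cov = xc @ xc.T / x.shape[1]
    evals, evecs = np.linalg.eigh(cov)  # cov is symmetric
    evals = np.maximum(evals, 0.0)      # guard against tiny negative values
    inv_sqrt = evecs @ np.diag((evals + eps) ** -0.5) @ evecs.T
    return inv_sqrt @ xc

def wct_inject(x, coloring, shift, eps=1e-5):
    """Whiten the features, then 'color' them with a full (C, C) matrix.

    coloring and shift stand in for style parameters (hypothetical inputs).
    """
    return coloring @ whiten(x, eps) + shift[:, None]
```

AdaIN only controls per-channel means and variances, whereas the whitening-and-coloring step also controls correlations between channels: the output covariance of `wct_inject` is approximately `coloring @ coloring.T`.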

    Deepfake Style Transfer Mixture: a First Forensic Ballistics Study on Synthetic Images

    Most recent style-transfer techniques based on generative architectures can produce synthetic multimedia content, commonly called deepfakes, with almost no artifacts. Researchers have already demonstrated that synthetic images contain patterns that reveal not only whether an image is a deepfake but also the generative architecture used to create it. These traces can be exploited to study problems that have never been addressed in the context of deepfakes. To this end, this paper proposes a first approach to investigating image ballistics on deepfake images subjected to style-transfer manipulations. Specifically, it describes a study on detecting how many times a digital image has been processed by a generative architecture for style transfer. Moreover, to study forensic ballistics on deepfake images accurately, some mathematical properties of style-transfer operations were investigated.

    Data Augmentation Vision Transformer for Fine-grained Image Classification

    Recently, the vision transformer (ViT) has made breakthroughs in image recognition. Its multi-head self-attention (MSA) mechanism can extract discriminative token information from different image patches to improve classification accuracy. However, the classification tokens in its deep layers tend to ignore local features between layers. In addition, the embedding layer takes fixed-size image patches as input, which inevitably introduces additional image noise. To this end, we study a data augmentation vision transformer (DAVT) and propose an attention-cropping data augmentation method, which uses attention weights as a guide to crop images and improves the network's ability to learn critical features. Secondly, we propose a hierarchical attention selection (HAS) method, which improves the ability to learn discriminative tokens between levels by filtering and fusing tokens between levels. Experimental results show that the accuracy of this method on two general datasets, CUB-200-2011 and Stanford Dogs, is better than that of existing mainstream methods, and is 1.4% and 1.6% higher than that of the original ViT, respectively.
    Comment: IEEE Signal Processing Letter
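
The attention-guided cropping idea in this abstract can be sketched in a few lines of NumPy. This is only one plausible reading, not the paper's method: the function name `attention_crop`, the relative threshold `ratio`, and the bounding-box rule are assumptions introduced for illustration, since the abstract does not specify how the crop region is derived from the attention weights.

```python
import numpy as np

def attention_crop(image, attn, patch, ratio=0.5):
    """Crop an image to the bounding box of its most-attended patches.

    image: (H, W) or (H, W, channels) array.
    attn:  (H // patch, W // patch) non-negative attention weight per patch.
    patch: patch side length in pixels.
    ratio: hypothetical threshold, as a fraction of the maximum weight.
    """
    # Keep patches whose weight is at least `ratio` of the strongest one.
    mask = attn >= ratio * attn.max()
    rows = np.where(mask.any(axis=1))[0]
    cols = np.where(mask.any(axis=0))[0]
    # Convert the patch-index bounding box to pixel coordinates.
    top, bottom = rows[0] * patch, (rows[-1] + 1) * patch
    left, right = cols[0] * patch, (cols[-1] + 1) * patch
    return image[top:bottom, left:right]
```

In an augmentation pipeline, the cropped region would typically be resized back to the network's input size and fed in as an extra training sample. That step is omitted here.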