18,722 research outputs found

    High-resolution Deep Convolutional Generative Adversarial Networks

    Convergence of Generative Adversarial Networks (GANs) [Goodfellow et al. 2014] in a high-resolution setting, under the computational constraint of GPU memory capacity, has been beset with difficulty due to the known lack of convergence rate stability. To boost network convergence of DCGAN (Deep Convolutional Generative Adversarial Networks) [Radford et al. 2016] and achieve good-looking high-resolution results, we propose a new layered network, HDCGAN, that incorporates current state-of-the-art techniques to this effect. Glasses, a mechanism to arbitrarily improve the final GAN-generated results by enlarging the input size by a telescope ζ, is also presented. A novel bias-free dataset, Curtó & Zarza, containing human faces from different ethnic groups in a wide variety of illumination conditions and image resolutions, is introduced. Curtó is enhanced with HDCGAN synthetic images, thus being the first GAN-augmented dataset of faces. We conduct extensive experiments on CelebA [Liu et al. 2015], CelebA-hq [Karras et al. 2018] and Curtó. HDCGAN is the current state of the art in synthetic image generation on CelebA, achieving an MS-SSIM of 0.1978 and a Fréchet Inception Distance of 8.44.
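
For readers unfamiliar with the Fréchet Inception Distance reported in this abstract, the standard definition is reproduced below. This is the commonly used formula, not something taken from the HDCGAN paper itself; the symbols $\mu_r, \Sigma_r$ (mean and covariance of Inception features of real images) and $\mu_g, \Sigma_g$ (the same statistics for generated images) are notation assumed here for illustration.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Lower values indicate that the generated feature distribution is closer to the real one.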

    TGAN: Deep Tensor Generative Adversarial Nets for Large Image Generation

    Deep generative models have been successfully applied to many applications. However, existing works experience limitations when generating large images (the literature usually generates small images, e.g. 32 × 32 or 128 × 128). In this paper, we propose a novel scheme, called deep tensor adversarial generative nets (TGAN), that generates large high-quality images by exploring tensor structures. Essentially, the adversarial process of TGAN takes place in a tensor space. First, we impose tensor structures for concise image representation, which is superior to the vectorization preprocessing in existing works at capturing the pixel proximity information and the spatial patterns of elementary objects in images. Second, we propose TGAN, which integrates deep convolutional generative adversarial networks and tensor super-resolution in a cascading manner to generate high-quality images from random distributions. More specifically, we design a tensor super-resolution process that consists of tensor dictionary learning and tensor coefficients learning. Finally, on three datasets, the proposed TGAN generates images with more realistic textures than state-of-the-art adversarial autoencoders. The size of the generated images is increased by over 8.5 times, namely 374 × 374 in PASCAL2.
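
The "over 8.5 times" figure can be checked with a little arithmetic. The sketch below assumes the comparison is in total pixel count against the 128 × 128 size that the abstract cites as typical; the abstract does not state the baseline explicitly, so that choice is an assumption.

```python
# Assumption: the baseline is the 128 x 128 size the abstract cites as typical.
baseline_side = 128
tgan_side = 374

baseline_pixels = baseline_side * baseline_side
tgan_pixels = tgan_side * tgan_side
pixel_ratio = tgan_pixels / baseline_pixels

print(f"{tgan_pixels} / {baseline_pixels} = {pixel_ratio:.2f}x")
# prints "139876 / 16384 = 8.54x"
```

Under that assumption the ratio is consistent with the figure quoted in the abstract.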

    Megapixel Size Image Creation using Generative Adversarial Networks

    Since their appearance, Generative Adversarial Networks (GANs) have received a lot of interest in the AI community. In image generation, several projects have shown that GANs are able to generate photorealistic images, but the results so far have not met the quality standard of the visual media production industry. We present an optimized image generation process based on Deep Convolutional Generative Adversarial Networks (DCGANs) in order to create photorealistic high-resolution images (up to 1024 × 1024 pixels). Furthermore, the system was fed with a limited dataset of fewer than two thousand images. These results give further clues about the future exploitation of GANs in Computer Graphics and Visual Effects. Comment: 3 pages, 4 figures

    High-Quality Face Image SR Using Conditional Generative Adversarial Networks

    We propose a novel single face image super-resolution method, named Face Conditional Generative Adversarial Network (FCGAN), based on boundary equilibrium generative adversarial networks. Without using any facial prior information, our method can generate a high-resolution face image from a low-resolution one. Compared with existing studies, both our training and testing phases are end-to-end pipelines with little pre- or post-processing. To enhance the convergence speed and strengthen feature propagation, skip-layer connections are further employed in the generative and discriminative networks. Extensive experiments demonstrate that our model achieves competitive performance compared with state-of-the-art models. Comment: 9 pages, 4 figures

    Shape Inpainting using 3D Generative Adversarial Network and Recurrent Convolutional Networks

    Recent advances in convolutional neural networks have shown promising results in 3D shape completion. However, due to GPU memory limitations, these methods can only produce low-resolution outputs. To inpaint 3D models with semantic plausibility and contextual details, we introduce a hybrid framework that combines a 3D Encoder-Decoder Generative Adversarial Network (3D-ED-GAN) and a Long-term Recurrent Convolutional Network (LRCN). The 3D-ED-GAN is a 3D convolutional neural network trained with a generative adversarial paradigm to fill in missing 3D data at low resolution. LRCN adopts a recurrent neural network architecture to minimize GPU memory usage and incorporates an Encoder-Decoder pair into a Long Short-term Memory Network. By handling the 3D model as a sequence of 2D slices, LRCN transforms a coarse 3D shape into a more complete and higher-resolution volume. While 3D-ED-GAN captures the global contextual structure of the 3D shape, LRCN localizes the fine-grained details. Experimental results on both real-world and synthetic data show that reconstructions from corrupted models result in complete and high-resolution 3D objects.

    Adversarial Feature Sampling Learning for Efficient Visual Tracking

    The tracking-by-detection framework usually consists of two stages: drawing samples around the target object in the first stage and classifying each sample as the target object or background in the second stage. Currently popular trackers based on the tracking-by-detection framework typically draw samples in the raw image as the inputs of deep convolutional networks in the first stage, which usually results in a high computational burden and low running speed. In this paper, we propose a new visual tracking method that samples deep convolutional features to address this problem. Only one cropped image around the target object is input into the designed deep convolutional network, and the samples are drawn on the feature maps of the network by spatial bilinear resampling. In addition, a generative adversarial network is integrated into our network framework to augment positive samples and improve the tracking performance. Extensive experiments on benchmark datasets demonstrate that the proposed method achieves performance comparable to state-of-the-art trackers and effectively accelerates tracking-by-detection trackers based on raw-image samples.
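
The idea of drawing samples on a feature map by spatial bilinear resampling can be sketched in a few lines of plain Python. This is a minimal illustration of bilinear interpolation in general, not the paper's implementation; the function names `bilinear_sample` and `sample_box`, the list-of-rows feature map, and the box parameters are all hypothetical.

```python
import math

def bilinear_sample(feature_map, y, x):
    """Interpolate a 2D feature map (list of rows) at a fractional (y, x)."""
    height, width = len(feature_map), len(feature_map[0])
    # Clamp the query point so its four neighbours stay inside the map.
    y = min(max(y, 0.0), height - 1.0)
    x = min(max(x, 0.0), width - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, height - 1), min(x0 + 1, width - 1)
    wy, wx = y - y0, x - x0
    top = (1 - wx) * feature_map[y0][x0] + wx * feature_map[y0][x1]
    bottom = (1 - wx) * feature_map[y1][x0] + wx * feature_map[y1][x1]
    return (1 - wy) * top + wy * bottom

def sample_box(feature_map, top, left, bottom, right, out_size):
    """Resample a (possibly fractional) box onto an out_size x out_size grid."""
    steps = max(out_size - 1, 1)
    return [
        [
            bilinear_sample(
                feature_map,
                top + (bottom - top) * i / steps,
                left + (right - left) * j / steps,
            )
            for j in range(out_size)
        ]
        for i in range(out_size)
    ]

# Hypothetical 3 x 3 single-channel feature map.
feature_map = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]
# One candidate sample: a box with fractional corners, resampled to 2 x 2.
print(sample_box(feature_map, 0.5, 0.5, 1.5, 1.5, 2))
# prints [[2.0, 3.0], [5.0, 6.0]]
```

Because every candidate box is read from the same feature map, the network only has to be run once per frame, which is the source of the speed-up the abstract describes.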

    Conditional Generative Adversarial Networks for Emoji Synthesis with Word Embedding Manipulation

    Emojis have become a very popular part of daily digital communication. Their appeal comes largely from their ability to capture and elicit emotions in a more subtle and nuanced way than plain text can. In line with recent advances in the field of deep learning, generative adversarial networks (GANs) have far-reaching implications and applications for image generation. In this paper, we present a novel application of deep convolutional GANs (DC-GANs) with an optimized training procedure. We show that by incorporating word embeddings conditioned on Google's word2vec model into the network, the generator is able to synthesize highly realistic emojis that are virtually identical to the real ones. Comment: 5 pages, 3 figures, 2 graphs

    Multi Resolution LSTM For Long Term Prediction In Neural Activity Video

    Epileptic seizures are caused by abnormal, overly synchronized electrical activity in the brain. The abnormal electrical activity manifests as waves propagating across the brain. Accurate prediction of the propagation velocity and direction of these waves could enable real-time responsive brain stimulation to suppress or prevent the seizures entirely. However, this problem is very challenging because the algorithm must be able to predict the neural signals over a sufficiently long time horizon to allow enough time for medical intervention. We consider how to accomplish long-term prediction using an LSTM network. To alleviate the vanishing gradient problem, we propose two encoder-decoder-predictor structures, both using a multi-resolution representation. The novel LSTM structure with multi-resolution layers could significantly outperform the single-resolution benchmark with a similar number of parameters. To overcome the blurring effect associated with video prediction in the pixel domain using the standard mean square error (MSE) loss, we use energy-based adversarial training to improve the long-term prediction. We demonstrate and analyze how a discriminative model with an encoder-decoder structure using a 3D CNN model improves long-term prediction.
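
The blurring effect of an MSE loss mentioned in this abstract can be seen in a toy example. The sketch below is not from the paper: it assumes two equally likely, hypothetical four-pixel "future frames" and compares the expected MSE of a sharp prediction with that of their pixelwise average.

```python
def mse(prediction, target):
    """Mean square error between two equally sized pixel lists."""
    return sum((p - t) ** 2 for p, t in zip(prediction, target)) / len(target)

# Hypothetical futures: the bright pixel ends up on the left or on the right,
# each with probability 0.5.
future_left = [1.0, 0.0, 0.0, 0.0]
future_right = [0.0, 0.0, 0.0, 1.0]

def expected_mse(prediction):
    """Expected MSE when both futures are equally likely."""
    return 0.5 * mse(prediction, future_left) + 0.5 * mse(prediction, future_right)

sharp = future_left  # commits to one plausible outcome
blurry = [(a + b) / 2 for a, b in zip(future_left, future_right)]  # pixelwise mean

print(expected_mse(sharp))   # prints 0.25
print(expected_mse(blurry))  # prints 0.125
```

The averaged frame matches neither possible future yet scores a lower expected MSE, which is why pixel-domain MSE training tends toward blurry predictions and why an adversarial term is a common remedy.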

    Context-Aware Semantic Inpainting

    Recently, image inpainting has witnessed rapid progress due to generative adversarial networks (GANs) that are able to synthesize realistic content. However, most existing GAN-based methods for semantic inpainting apply an auto-encoder architecture with a fully connected layer, which cannot accurately maintain spatial information. In addition, the discriminator in existing GANs struggles to understand high-level semantics within the image context and to yield semantically consistent content. Existing evaluation criteria are biased towards blurry results and cannot adequately characterize edge preservation and visual authenticity in the inpainting results. In this paper, we propose an improved generative adversarial network to overcome the aforementioned limitations. Our proposed GAN-based framework consists of a fully convolutional design for the generator, which helps to better preserve spatial structures, and a joint loss function with a revised perceptual loss to capture high-level semantics in the context. Furthermore, we introduce two novel measures to better assess the quality of image inpainting results. Experimental results demonstrate that our method outperforms the state of the art under a wide range of criteria.

    Deep learning for determining a near-optimal topological design without any iteration

    In this study, we propose a novel deep learning-based method to predict an optimized structure for a given boundary condition and optimization setting without using any iterative scheme. For this purpose, first, using open-source topology optimization code, datasets of the optimized structures paired with the corresponding information on boundary conditions and optimization settings are generated at low (32 × 32) and high (128 × 128) resolutions. To construct the artificial neural network for the proposed method, a convolutional neural network (CNN)-based encoder and decoder network is trained using the training dataset generated at low resolution. Then, as a two-stage refinement, the conditional generative adversarial network (cGAN) is trained with the optimized structures paired at both low and high resolutions, and is connected to the trained CNN-based encoder and decoder network. The performance evaluation results of the integrated network demonstrate that the proposed method can determine a near-optimal structure in terms of pixel values and compliance with negligible computational time. Comment: 27 pages, 11 figures. The paper is accepted in the Structural and Multidisciplinary Optimization journal, Springer