18,722 research outputs found
High-resolution Deep Convolutional Generative Adversarial Networks
Convergence of Generative Adversarial Networks (GANs) [Goodfellow et al. 2014] in a high-resolution setting, under the computational constraint of GPU memory capacity, has been beset with difficulty due to the known instability of the convergence rate. To boost the convergence of DCGAN (Deep Convolutional Generative Adversarial Networks) [Radford et al. 2016] and achieve visually appealing high-resolution results, we propose a new layered network, HDCGAN, that incorporates current state-of-the-art techniques to this end. Glasses, a mechanism to arbitrarily improve the final GAN-generated results by enlarging the input size by a telescope ζ, is also presented. A novel bias-free dataset, Curtó & Zarza, containing human faces from different ethnic groups under a wide variety of illumination conditions and image resolutions, is introduced. Curtó is enhanced with HDCGAN synthetic images, making it the first GAN-augmented dataset of faces. We conduct extensive experiments on CelebA [Liu et al. 2015], CelebA-HQ [Karras et al. 2018] and Curtó. HDCGAN is the current state of the art in synthetic image generation on CelebA, achieving an MS-SSIM of 0.1978 and a Fréchet Inception Distance of 8.44.
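A minimal sketch of the Glasses idea as stated in the abstract: the network input is enlarged by a telescope factor ζ before generation. The interpolation mode, the value of ζ, and where the enlargement is applied are assumptions; the paper's exact mechanism may differ.

```python
# Hypothetical sketch of the "Glasses" input-enlargement step (PyTorch).
import torch
import torch.nn.functional as F

zeta = 2  # telescope factor (illustrative value, not from the paper)

def glasses(x, zeta):
    """Enlarge an input tensor (N, C, H, W) by the telescope factor."""
    return F.interpolate(x, scale_factor=zeta, mode="bilinear",
                         align_corners=False)

x = torch.randn(1, 3, 128, 128)
x_big = glasses(x, zeta)  # -> (1, 3, 256, 256), fed to the GAN in place of x
```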
TGAN: Deep Tensor Generative Adversarial Nets for Large Image Generation
Deep generative models have been successfully applied to many applications. However, existing works face limitations when generating large images (the literature usually generates small images, e.g. 32 × 32 or 128 × 128). In this paper, we propose a novel scheme, called deep tensor adversarial generative nets (TGAN), that generates large high-quality images by exploiting tensor structures. Essentially, the adversarial process of TGAN takes place in a tensor space. First, we impose tensor structures for concise image representation, which is superior to the vectorization preprocessing in existing works at capturing pixel proximity information and the spatial patterns of elementary objects in images. Second, we propose TGAN, which integrates deep convolutional generative adversarial networks and tensor super-resolution in a cascading manner to generate high-quality images from random distributions. More specifically, we design a tensor super-resolution process that consists of tensor dictionary learning and tensor coefficient learning. Finally, on three datasets, the proposed TGAN generates images with more realistic textures than state-of-the-art adversarial autoencoders. The size of the generated images is increased by over 8.5 times, namely to 374 × 374 on PASCAL2.
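A sketch of the cascade described above: a low-resolution generator output refined by a dictionary-based super-resolution step. The tensor dictionary learning of the paper is approximated here by plain coupled matrix dictionaries for brevity, and the dictionary contents are random stand-ins rather than learned values.

```python
# Coupled-dictionary super-resolution sketch (NumPy), standing in for the
# tensor dictionary learning / coefficient learning stage of TGAN.
import numpy as np

rng = np.random.default_rng(0)
k = 128                                  # number of dictionary atoms (assumed)
D_lo = rng.normal(size=(32 * 32, k))     # low-res dictionary (learned in the paper)
D_hi = rng.normal(size=(128 * 128, k))   # high-res dictionary sharing coefficients

def super_resolve(x_lo):
    """Fit coefficients on the low-res dictionary, reuse them with the high-res one."""
    coeffs, *_ = np.linalg.lstsq(D_lo, x_lo.ravel(), rcond=None)
    return (D_hi @ coeffs).reshape(128, 128)

x_lo = rng.normal(size=(32, 32))         # stand-in for a generator output
x_hi = super_resolve(x_lo)               # (128, 128) refined image
```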
Megapixel Size Image Creation using Generative Adversarial Networks
Since their appearance, Generative Adversarial Networks (GANs) have received a lot of interest in the AI community. In image generation, several projects have shown that GANs can generate photorealistic images, but the results so far have not met the quality standards of the visual media production industry. We present an optimized image generation process based on Deep Convolutional Generative Adversarial Networks (DCGANs) to create photorealistic high-resolution images (up to 1024 × 1024 pixels). Furthermore, the system was fed with a limited dataset of fewer than two thousand images. These results offer further clues about the future exploitation of GANs in Computer Graphics and Visual Effects.
Comment: 3 pages, 4 figures
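The paper's exact DCGAN modifications are not given in the abstract; the sketch below shows the generic pattern of stacking stride-2 transposed convolutions until a 1024 × 1024 output is reached. Channel widths are illustrative assumptions.

```python
# Generic DCGAN-style generator grown to 1024x1024 output (PyTorch).
import torch
import torch.nn as nn

def up_block(cin, cout):
    return nn.Sequential(
        nn.ConvTranspose2d(cin, cout, 4, 2, 1, bias=False),
        nn.BatchNorm2d(cout),
        nn.ReLU(True),
    )

layers, ch, size = [nn.ConvTranspose2d(100, 512, 4, 1, 0)], 512, 4
while size < 1024:                        # 4 -> 8 -> ... -> 1024
    nxt = max(ch // 2, 16)
    layers.append(up_block(ch, nxt))
    ch, size = nxt, size * 2
layers.append(nn.Conv2d(ch, 3, 3, padding=1))
layers.append(nn.Tanh())
generator = nn.Sequential(*layers)

img = generator(torch.randn(1, 100, 1, 1))  # -> (1, 3, 1024, 1024)
```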
High-Quality Face Image SR Using Conditional Generative Adversarial Networks
We propose a novel single face image super-resolution method, which we name Face Conditional Generative Adversarial Network (FCGAN), based on boundary equilibrium generative adversarial networks. Without using any facial prior information, our method can generate a high-resolution face image from a low-resolution one. Compared with existing studies, both our training and testing phases form an end-to-end pipeline with little pre- or post-processing. To enhance convergence speed and strengthen feature propagation, skip-layer connections are further employed in the generative and discriminative networks. Extensive experiments demonstrate that our model achieves competitive performance compared with state-of-the-art models.
Comment: 9 pages, 4 figures
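A minimal sketch of the skip-layer connection idea named in the abstract: an encoder feature map is concatenated with the matching decoder feature map before upsampling. Depths, widths, and the 2× scale factor are assumptions, not FCGAN's actual configuration.

```python
# Skip-layer connection in a toy super-resolution generator (PyTorch).
import torch
import torch.nn as nn

class SkipSRGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(3, 64, 3, 1, 1), nn.ReLU(True))
        self.enc2 = nn.Sequential(nn.Conv2d(64, 128, 3, 2, 1), nn.ReLU(True))
        self.dec2 = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(True))
        self.dec1 = nn.ConvTranspose2d(64 + 64, 3, 4, 2, 1)  # takes the skip

    def forward(self, lr):
        e1 = self.enc1(lr)                      # (N, 64, H, W)
        e2 = self.enc2(e1)                      # (N, 128, H/2, W/2)
        d2 = self.dec2(e2)                      # (N, 64, H, W)
        d1 = self.dec1(torch.cat([d2, e1], 1))  # skip connection, upsamples 2x
        return torch.tanh(d1)                   # (N, 3, 2H, 2W)

sr = SkipSRGenerator()(torch.randn(1, 3, 32, 32))  # -> (1, 3, 64, 64)
```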
Shape Inpainting using 3D Generative Adversarial Network and Recurrent Convolutional Networks
Recent advances in convolutional neural networks have shown promising results in 3D shape completion. However, due to GPU memory limitations, these methods can only produce low-resolution outputs. To inpaint 3D models with semantic plausibility and contextual details, we introduce a hybrid framework that combines a 3D Encoder-Decoder Generative Adversarial Network (3D-ED-GAN) and a Long-term Recurrent Convolutional Network (LRCN). The 3D-ED-GAN is a 3D convolutional neural network trained with a generative adversarial paradigm to fill in missing 3D data at low resolution. The LRCN adopts a recurrent neural network architecture to minimize GPU memory usage and incorporates an encoder-decoder pair into a Long Short-Term Memory network. By handling the 3D model as a sequence of 2D slices, the LRCN transforms a coarse 3D shape into a more complete and higher-resolution volume. While the 3D-ED-GAN captures the global contextual structure of the 3D shape, the LRCN localizes the fine-grained details. Experimental results on both real-world and synthetic data show that reconstructions from corrupted models result in complete, high-resolution 3D objects.
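A sketch of the slice-sequence idea from the abstract: treat a 3D volume as a sequence of 2D slices, encode each slice, run an LSTM across the sequence for context, and decode each step back to a higher-resolution slice. All sizes are illustrative assumptions.

```python
# Slice-wise LRCN sketch: 3D volume as a sequence of 2D slices (PyTorch).
import torch
import torch.nn as nn

class SliceLRCN(nn.Module):
    def __init__(self, feat=256):
        super().__init__()
        self.encoder = nn.Sequential(                  # 32x32 slice -> feat vec
            nn.Conv2d(1, 32, 3, 2, 1), nn.ReLU(True),   # 16x16
            nn.Conv2d(32, 64, 3, 2, 1), nn.ReLU(True),  # 8x8
            nn.Flatten(), nn.Linear(64 * 8 * 8, feat))
        self.lstm = nn.LSTM(feat, feat, batch_first=True)
        self.decoder = nn.Linear(feat, 64 * 64)        # feat vec -> 64x64 slice

    def forward(self, vol):                            # vol: (N, D, 32, 32)
        n, d = vol.shape[:2]
        feats = self.encoder(vol.reshape(n * d, 1, 32, 32)).view(n, d, -1)
        out, _ = self.lstm(feats)                      # context across slices
        return self.decoder(out).view(n, d, 64, 64)    # upsampled volume

vol_hi = SliceLRCN()(torch.randn(2, 32, 32, 32))       # -> (2, 32, 64, 64)
```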
Adversarial Feature Sampling Learning for Efficient Visual Tracking
The tracking-by-detection framework usually consists of two stages: drawing samples around the target object in the first stage and classifying each sample as the target object or background in the second stage. Current popular trackers based on the tracking-by-detection framework typically draw samples in the raw image as inputs to deep convolutional networks in the first stage, which usually results in a high computational burden and low running speed. In this paper, we propose a new visual tracking method that samples deep convolutional features to address this problem. Only one cropped image around the target object is fed into the designed deep convolutional network, and samples are drawn on the feature maps of the network by spatial bilinear resampling. In addition, a generative adversarial network is integrated into our framework to augment positive samples and improve tracking performance. Extensive experiments on benchmark datasets demonstrate that the proposed method achieves performance comparable to state-of-the-art trackers and effectively accelerates tracking-by-detection trackers based on raw-image samples.
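The core speed-up is running the backbone once on a single crop and sampling candidate features from the shared map by bilinear resampling. The sketch below uses torchvision's RoIAlign as one concrete instance of that resampling; the backbone and box layout are stand-ins, not the paper's network.

```python
# One forward pass on a crop, then per-candidate feature sampling (PyTorch).
import torch
import torch.nn as nn
from torchvision.ops import roi_align

backbone = nn.Sequential(nn.Conv2d(3, 64, 3, 2, 1), nn.ReLU(True))

crop = torch.randn(1, 3, 256, 256)       # one crop around the target
fmap = backbone(crop)                    # (1, 64, 128, 128) shared features

# Candidate boxes in image coordinates: (batch_index, x1, y1, x2, y2).
boxes = torch.tensor([[0, 100., 100., 160., 160.],
                      [0,  90., 110., 150., 170.]])
feats = roi_align(fmap, boxes, output_size=(7, 7), spatial_scale=0.5)
print(feats.shape)                       # (2, 64, 7, 7): one per sample box
```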
Conditional Generative Adversarial Networks for Emoji Synthesis with Word Embedding Manipulation
Emojis have become a very popular part of daily digital communication. Their appeal comes largely from their ability to capture and elicit emotions in a more subtle and nuanced way than plain text alone. In line with recent advances in the field of deep learning, generative adversarial networks (GANs) have far-reaching implications and applications for image generation. In this paper, we present a novel application of deep convolutional GANs (DC-GANs) with an optimized training procedure. We show that by incorporating word embeddings from Google's word2vec model into the network as conditioning, the generator is able to synthesize highly realistic emojis that are virtually identical to the real ones.
Comment: 5 pages, 3 figures, 2 graphs
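A minimal sketch of conditioning a DC-GAN generator on a word embedding: the word vector is concatenated with the noise before generation. A real pipeline would look the vector up in a word2vec model; here it is a random stand-in, and the layer sizes are assumptions.

```python
# Word-embedding-conditioned generator sketch (PyTorch).
import torch
import torch.nn as nn

z_dim, emb_dim = 100, 300                     # 300-d, as in Google's word2vec

class CondGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim + emb_dim, 256, 4, 1, 0), nn.ReLU(True),
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.ReLU(True),   # 8x8
            nn.ConvTranspose2d(128, 3, 4, 2, 1), nn.Tanh())          # 16x16

    def forward(self, z, emb):
        zc = torch.cat([z, emb], dim=1)       # condition by concatenation
        return self.net(zc.unsqueeze(-1).unsqueeze(-1))

z = torch.randn(4, z_dim)
emb = torch.randn(4, emb_dim)                 # stand-in for word2vec("joy")
emoji = CondGenerator()(z, emb)               # -> (4, 3, 16, 16)
```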
Multi Resolution LSTM For Long Term Prediction In Neural Activity Video
Epileptic seizures are caused by abnormal, overly synchronized electrical activity in the brain. The abnormal electrical activity manifests as waves propagating across the brain. Accurate prediction of the propagation velocity and direction of these waves could enable real-time responsive brain stimulation to suppress or prevent the seizures entirely. However, this problem is very challenging because the algorithm must be able to predict the neural signals over a sufficiently long time horizon to allow enough time for medical intervention. We consider how to accomplish long-term prediction using an LSTM network. To alleviate the vanishing gradient problem, we propose two encoder-decoder-predictor structures, both using multi-resolution representations. The novel LSTM structure with multi-resolution layers significantly outperforms the single-resolution benchmark with a similar number of parameters. To overcome the blurring effect associated with video prediction in the pixel domain using the standard mean square error (MSE) loss, we use energy-based adversarial training to improve the long-term prediction. We demonstrate and analyze how a discriminative model with an encoder-decoder structure using a 3D CNN improves long-term prediction.
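A sketch of the multi-resolution idea: the frame sequence is modeled with one LSTM per resolution and the hidden states are fused for prediction. The fusion rule and layer sizes are assumptions; the paper's encoder-decoder-predictor variants are more elaborate.

```python
# Two-resolution LSTM predictor sketch (PyTorch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiResLSTM(nn.Module):
    def __init__(self, size=32, hidden=256):
        super().__init__()
        self.size = size
        self.fine = nn.LSTM(size * size, hidden, batch_first=True)
        self.coarse = nn.LSTM((size // 2) ** 2, hidden, batch_first=True)
        self.out = nn.Linear(2 * hidden, size * size)

    def forward(self, frames):                    # frames: (N, T, size, size)
        n, t = frames.shape[:2]
        x_f = frames.reshape(n, t, -1)
        x_c = F.avg_pool2d(frames.reshape(n * t, 1, self.size, self.size),
                           2).reshape(n, t, -1)   # coarse stream
        h_f, _ = self.fine(x_f)
        h_c, _ = self.coarse(x_c)
        pred = self.out(torch.cat([h_f, h_c], -1))  # fuse both resolutions
        return pred.view(n, t, self.size, self.size)

next_frames = MultiResLSTM()(torch.randn(2, 10, 32, 32))
```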
Context-Aware Semantic Inpainting
Recently, image inpainting has witnessed rapid progress due to generative adversarial networks (GANs), which are able to synthesize realistic content. However, most existing GAN-based methods for semantic inpainting apply an auto-encoder architecture with a fully connected layer, which cannot accurately maintain spatial information. In addition, the discriminator in existing GANs struggles to understand high-level semantics within the image context and to yield semantically consistent content. Existing evaluation criteria are biased towards blurry results and cannot well characterize edge preservation and visual authenticity in the inpainting results. In this paper, we propose an improved generative adversarial network to overcome the aforementioned limitations. Our proposed GAN-based framework consists of a fully convolutional design for the generator, which helps to better preserve spatial structures, and a joint loss function with a revised perceptual loss to capture high-level semantics in the context. Furthermore, we introduce two novel measures to better assess the quality of image inpainting results. Experimental results demonstrate that our method outperforms the state of the art under a wide range of criteria.
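A sketch of a joint inpainting loss of the kind named in the abstract: pixel reconstruction plus a perceptual term on VGG features plus an adversarial term. The weights and the VGG layer cut are assumptions; the paper's revised perceptual loss is not fully specified in the abstract.

```python
# Joint loss sketch: pixel + perceptual + adversarial terms (PyTorch).
import torch
import torch.nn.functional as F
from torchvision.models import vgg16

# weights=None keeps the sketch self-contained; in practice pretrained
# ImageNet weights would be loaded for a meaningful perceptual term.
vgg_feat = vgg16(weights=None).features[:16].eval()
for p in vgg_feat.parameters():
    p.requires_grad_(False)

def joint_loss(fake, real, d_score_fake, w_pix=1.0, w_per=0.1, w_adv=0.01):
    pix = F.l1_loss(fake, real)
    per = F.mse_loss(vgg_feat(fake), vgg_feat(real))     # perceptual term
    adv = F.binary_cross_entropy_with_logits(            # fool discriminator
        d_score_fake, torch.ones_like(d_score_fake))
    return w_pix * pix + w_per * per + w_adv * adv

loss = joint_loss(torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64),
                  torch.randn(1, 1))
```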
Deep learning for determining a near-optimal topological design without any iteration
In this study, we propose a novel deep learning-based method to predict an optimized structure for a given boundary condition and optimization setting without using any iterative scheme. For this purpose, first, using open-source topology optimization code, datasets of optimized structures paired with the corresponding information on boundary conditions and optimization settings are generated at low (32 × 32) and high (128 × 128) resolutions. To construct the artificial neural network for the proposed method, a convolutional neural network (CNN)-based encoder and decoder network is trained using the training dataset generated at low resolution. Then, as a two-stage refinement, a conditional generative adversarial network (cGAN) is trained with the optimized structures paired at both low and high resolutions and is connected to the trained CNN-based encoder and decoder network. The performance evaluation results of the integrated network demonstrate that the proposed method can determine a near-optimal structure, in terms of pixel values and compliance, with negligible computational time.
Comment: 27 pages, 11 figures. The paper is accepted in the Structural and Multidisciplinary Optimization journal, Springer
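A sketch of the two-stage pipeline described in the abstract: a CNN encoder-decoder predicts a coarse 32 × 32 structure from condition channels, then a cGAN generator refines it to 128 × 128. The number of condition channels and all widths are illustrative assumptions.

```python
# Two-stage topology prediction sketch: coarse CNN + cGAN refiner (PyTorch).
import torch
import torch.nn as nn

stage1 = nn.Sequential(                     # conditions -> coarse 32x32 design
    nn.Conv2d(4, 32, 3, 1, 1), nn.ReLU(True),   # 4 condition channels assumed
    nn.Conv2d(32, 1, 3, 1, 1), nn.Sigmoid())    # material density in [0, 1]

refiner = nn.Sequential(                    # cGAN generator: 32x32 -> 128x128
    nn.ConvTranspose2d(1, 32, 4, 2, 1), nn.ReLU(True),   # 64x64
    nn.ConvTranspose2d(32, 16, 4, 2, 1), nn.ReLU(True),  # 128x128
    nn.Conv2d(16, 1, 3, 1, 1), nn.Sigmoid())

cond = torch.randn(1, 4, 32, 32)            # boundary/loading condition maps
coarse = stage1(cond)                       # (1, 1, 32, 32)
fine = refiner(coarse)                      # (1, 1, 128, 128) refined design
```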