On the regularization of Wasserstein GANs
Since their invention, generative adversarial networks (GANs) have become a
popular approach for learning to model a distribution of real (unlabeled) data.
Convergence problems during training are overcome by Wasserstein GANs which
minimize the distance between the model and the empirical distribution in terms
of a different metric, but thereby introduce a Lipschitz constraint into the
optimization problem. A simple way to enforce the Lipschitz constraint on the
class of functions that can be modeled by the neural network is weight
clipping. It was proposed that training can be improved by instead augmenting
the loss by a regularization term that penalizes the deviation of the gradient
of the critic (as a function of the network's input) from one. We present
theoretical arguments why using a weaker regularization term enforcing the
Lipschitz constraint is preferable. These arguments are supported by
experimental results on toy data sets. Comment: Published as a conference paper at ICLR 2018. * Henning Petzka and
Asja Fischer contributed equally to this work (11 pages + 13 pages appendix)
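
The three ways of handling the Lipschitz constraint mentioned in this abstract can be contrasted in a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names, the penalty weight `lam`, and the clipping bound `c` are hypothetical, and the "weaker" term is assumed here to be a one-sided penalty that only charges gradient norms above one. Gradients are passed in as plain lists rather than computed from a real critic.

```python
import math

def grad_norm(grad):
    """Euclidean norm of one critic gradient (a list of floats)."""
    return math.sqrt(sum(g * g for g in grad))

def two_sided_penalty(grads, lam=10.0):
    """Penalize any deviation of the gradient norm from one.

    Mean over samples of lam * (||g|| - 1)^2.
    """
    return lam * sum((grad_norm(g) - 1.0) ** 2 for g in grads) / len(grads)

def one_sided_penalty(grads, lam=10.0):
    """Assumed weaker variant: penalize only gradient norms above one.

    Mean over samples of lam * max(0, ||g|| - 1)^2.
    """
    return lam * sum(max(0.0, grad_norm(g) - 1.0) ** 2 for g in grads) / len(grads)

def clip_weights(weights, c=0.01):
    """Weight clipping: force every weight into [-c, c] (c is an assumed bound)."""
    return [max(-c, min(c, w)) for w in weights]
```

A gradient with norm 0.5 contributes to the two-sided term but not to the one-sided term, while a gradient with norm 5 is penalized identically by both. In this sketch that is the only difference between the two regularizers.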
Complexity Matters: Rethinking the Latent Space for Generative Modeling
In generative modeling, numerous successful approaches leverage a
low-dimensional latent space, e.g., Stable Diffusion models the latent space
induced by an encoder and generates images through a paired decoder. Although
the selection of the latent space is empirically pivotal, determining the
optimal choice and the process of identifying it remain unclear. In this study,
we aim to shed light on this under-explored topic by rethinking the latent
space from the perspective of model complexity. Our investigation starts with
the classic generative adversarial networks (GANs). Inspired by the GAN
training objective, we propose a novel "distance" between the latent and data
distributions, whose minimization coincides with that of the generator
complexity. The minimizer of this distance is characterized as the optimal
data-dependent latent that most effectively capitalizes on the generator's
capacity. Then, we consider parameterizing such a latent distribution by an
encoder network and propose a two-stage training strategy called Decoupled
Autoencoder (DAE), where the encoder is only updated in the first stage with an
auxiliary decoder and then frozen in the second stage while the actual decoder
is being trained. DAE can improve the latent distribution and as a result,
improve the generative performance. Our theoretical analyses are corroborated
by comprehensive experiments on various models such as VQGAN and Diffusion
Transformer, where our modifications yield significant improvements in sample
quality with decreased model complexity. Comment: Accepted to NeurIPS 2023 (Spotlight)
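
The two-stage schedule described in this abstract can be sketched with a toy Python example. Only the update schedule is taken from the abstract: the encoder is trained together with an auxiliary decoder in stage one, then frozen while the actual decoder is trained in stage two. Everything else is an assumption made for illustration: the scalar linear "networks", the squared reconstruction loss, the function name `train_two_stage`, and the hyperparameters `steps` and `lr`.

```python
def train_two_stage(xs, steps=200, lr=0.05):
    """Toy two-stage training with scalar linear maps (illustrative only).

    encoder:            z = a * x
    auxiliary decoder:  x_hat = b * z   (stage 1 only)
    actual decoder:     x_hat = c * z   (stage 2 only)
    """
    n = len(xs)
    a, b = 0.5, 0.5  # encoder and auxiliary decoder weights

    # Stage 1: encoder and auxiliary decoder are updated jointly.
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x in xs:
            z = a * x
            err = b * z - x
            grad_a += 2.0 * err * b * x
            grad_b += 2.0 * err * z
        a -= lr * grad_a / n
        b -= lr * grad_b / n

    # Stage 2: the encoder weight a is frozen; only the actual decoder moves.
    c = 0.0
    for _ in range(steps):
        grad_c = 0.0
        for x in xs:
            z = a * x  # frozen encoder output
            err = c * z - x
            grad_c += 2.0 * err * z
        c -= lr * grad_c / n

    return a, c

# Example call on a few hypothetical scalar data points.
encoder_w, decoder_w = train_two_stage([-1.0, -0.5, 0.5, 1.0])
```

In this toy setting the actual decoder learns to invert the frozen encoder, so the product `encoder_w * decoder_w` approaches one. The auxiliary decoder is discarded after stage one.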
Generative adversarial networks review in earthquake-related engineering fields
Within seismology, geology, and civil and structural engineering, deep learning (DL), especially via generative adversarial networks (GANs), offers an innovative and advantageous way to generate reliable synthetic data that reproduce the characteristics of actual samples, providing a handy data augmentation tool. Indeed, in many practical applications, obtaining a significant amount of high-quality data is demanding. Data augmentation is generally based on artificial intelligence (AI) and machine learning data-driven models. The DL GAN-based data augmentation approach for generating synthetic seismic signals has revolutionized the current data augmentation paradigm. This study delivers a critical state-of-the-art review, explaining recent research into AI-based GAN synthetic generation of ground motion signals or seismic events, together with comprehensive insight into seismic-related geophysical studies. This study may be especially relevant for earth and planetary science, geology and seismology, and oil and gas exploration, as well as for assessing the seismic response of buildings and infrastructure, seismic detection tasks, and general structural and civil engineering applications. Furthermore, highlighting the strengths and limitations of current studies on adversarial learning applied to seismology may help to guide research efforts in the near future toward the most promising directions.
- …