
    Bidirectional Conditional Generative Adversarial Networks

    Conditional Generative Adversarial Networks (cGANs) are generative models that can produce data samples (x) conditioned on both latent variables (z) and known auxiliary information (c). We propose the Bidirectional cGAN (BiCoGAN), which effectively disentangles z and c in the generation process and provides an encoder that learns inverse mappings from x to both z and c, trained jointly with the generator and the discriminator. We present crucial techniques for training BiCoGANs, which involve an extrinsic factor loss along with an associated dynamically-tuned importance weight. Compared to other encoder-based cGANs, BiCoGANs encode c more accurately, and utilize z and c more effectively and in a more disentangled way to generate samples. Comment: To appear in Proceedings of ACCV 2018
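    The joint training the abstract describes follows the BiGAN/ALI pattern with an added supervised term on the encoded c. A minimal PyTorch sketch under that reading; the network sizes, the one-hot c, and the fixed efl_weight (standing in for the paper's dynamically tuned importance weight) are illustrative assumptions, not the paper's settings:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sizes, not from the paper.
Z_DIM, C_DIM, X_DIM, H = 64, 10, 784, 256

# Generator maps (z, c) -> x; encoder maps x -> (z_hat, c_hat);
# the discriminator judges joint tuples (x, z, c), as in BiGAN/ALI.
G = nn.Sequential(nn.Linear(Z_DIM + C_DIM, H), nn.ReLU(), nn.Linear(H, X_DIM), nn.Tanh())
E = nn.Sequential(nn.Linear(X_DIM, H), nn.ReLU(), nn.Linear(H, Z_DIM + C_DIM))
D = nn.Sequential(nn.Linear(X_DIM + Z_DIM + C_DIM, H), nn.LeakyReLU(0.2), nn.Linear(H, 1))

bce = nn.BCEWithLogitsLoss()

def bicogan_losses(x_real, c_real, efl_weight=1.0):
    """One step's losses. c_real holds one-hot labels for the known
    auxiliary information; efl_weight is fixed here for brevity."""
    n = x_real.size(0)
    z = torch.randn(n, Z_DIM)
    c = F.one_hot(torch.randint(0, C_DIM, (n,)), C_DIM).float()

    x_fake = G(torch.cat([z, c], dim=1))   # generator tuple (x_fake, z, c)
    zc_hat = E(x_real)                     # encoder tuple (x_real, z_hat, c_hat)

    d_fake = D(torch.cat([x_fake, z, c], dim=1))
    d_real = D(torch.cat([x_real, zc_hat], dim=1))

    # Discriminator: encoder tuples labeled "real", generator tuples "fake".
    d_loss = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))

    # Extrinsic factor loss: supervise the c-part of the encoding with labels.
    efl = F.cross_entropy(zc_hat[:, Z_DIM:], c_real.argmax(dim=1))
    ge_loss = (bce(d_fake, torch.ones_like(d_fake))
               + bce(d_real, torch.zeros_like(d_real))
               + efl_weight * efl)
    return d_loss, ge_loss
```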

    A survey on generative adversarial networks for imbalance problems in computer vision tasks

    Development of any computer vision application starts with acquiring images and data, followed by preprocessing and pattern recognition steps to perform a task. When the acquired images are highly imbalanced or inadequate, the desired task may not be achievable. Unfortunately, imbalance problems in acquired image datasets are inevitable in certain complex real-world problems such as anomaly detection, emotion recognition, medical image analysis, fraud detection, metallic surface defect detection, and disaster prediction. The performance of computer vision algorithms can deteriorate significantly when the training dataset is imbalanced. In recent years, Generative Adversarial Networks (GANs) have gained immense attention from researchers across a variety of application domains due to their capability to model complex real-world image data. Notably, GANs can not only generate synthetic images; their adversarial learning idea has also shown good potential for restoring balance in imbalanced datasets. In this paper, we examine the most recent developments in GAN-based techniques for addressing imbalance problems in image data. The real-world challenges and implementations of GAN-based synthetic image generation are covered extensively. Our survey first introduces the various imbalance problems in computer vision tasks and their existing solutions, then examines key concepts such as deep generative image models and GANs. After that, we propose a taxonomy that organizes GAN-based techniques for addressing imbalance problems in computer vision tasks into three major categories: (1) image-level imbalances in classification, (2) object-level imbalances in object detection, and (3) pixel-level imbalances in segmentation tasks. We elaborate on the imbalance problems of each group and provide GAN-based solutions for each. Readers will understand how GAN-based techniques can handle imbalance problems and boost the performance of computer vision algorithms.
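    Of the techniques such surveys cover, the simplest image-level remedy is oversampling a minority class with generator output. A minimal PyTorch sketch of that idea; the helper name rebalance_with_gan, the binary setup, and the assumption of a generator G already trained to produce minority-class images of the same shape as x are illustrative, not from the survey:

```python
import torch

def rebalance_with_gan(x, y, G, z_dim=64, minority=1):
    """Hypothetical helper: pad the minority class with generator samples
    until both classes have equal counts (binary case for brevity)."""
    deficit = (y != minority).sum().item() - (y == minority).sum().item()
    if deficit <= 0:
        return x, y                       # already balanced
    z = torch.randn(deficit, z_dim)       # latent codes for the new samples
    with torch.no_grad():
        x_syn = G(z)                      # synthetic minority images
    y_syn = torch.full((deficit,), minority, dtype=y.dtype)
    return torch.cat([x, x_syn]), torch.cat([y, y_syn])
```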

    Disentangling Factors of Variation by Mixing Them

    We propose an approach to learn image representations that consist of disentangled factors of variation without exploiting any manual labeling or data domain knowledge. A factor of variation corresponds to an image attribute that can be discerned consistently across a set of images, such as the pose or color of objects. Our disentangled representation consists of a concatenation of feature chunks, each chunk representing a factor of variation. It supports applications such as transferring attributes from one image to another, by simply mixing and unmixing feature chunks, and classification or retrieval based on one or several attributes, by considering a user-specified subset of feature chunks. We learn our representation without any labeling or knowledge of the data domain, using an autoencoder architecture with two novel training objectives: first, an invariance objective encourages that the encoding of each attribute and the decoding of each chunk are invariant to changes in the other attributes and chunks, respectively; second, a classification objective ensures that each chunk corresponds to a consistently discernible attribute in the represented image, hence avoiding degenerate feature mappings where some chunks are completely ignored. We demonstrate the effectiveness of our approach on the MNIST, Sprites, and CelebA datasets. Comment: CVPR 2018
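    The attribute transfer described above amounts to swapping one feature chunk between two encodings and decoding the result. A minimal PyTorch sketch, assuming trained encoder/decoder networks that produce flat feature vectors; the chunk count and function name are illustrative:

```python
import torch

def transfer_attribute(enc, dec, x1, x2, chunk, n_chunks=8):
    """Move one factor of variation from x2 into x1 by swapping the
    corresponding feature chunk. enc/dec are assumed trained with the
    paper's invariance and classification objectives."""
    f1, f2 = enc(x1), enc(x2)              # (batch, feat) representations
    size = f1.size(1) // n_chunks          # width of one chunk
    lo, hi = chunk * size, (chunk + 1) * size
    mixed = f1.clone()
    mixed[:, lo:hi] = f2[:, lo:hi]         # take the chosen attribute from x2
    return dec(mixed)                      # x1 with x2's attribute
```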