Semi-supervised FusedGAN for Conditional Image Generation
We present FusedGAN, a deep network for conditional image synthesis with
controllable sampling of diverse images. Fidelity, diversity and controllable
sampling are the main quality measures of a good image generation model. Most
existing models are insufficient in all three aspects. The FusedGAN can perform
controllable sampling of diverse images with very high fidelity. We argue that
controllability can be achieved by disentangling the generation process into
various stages. In contrast to stacked GANs, where multiple stages of GANs are
trained separately with full supervision of labeled intermediate images, the
FusedGAN has a single-stage pipeline with a built-in stacking of GANs. Unlike
existing methods, which require full supervision with paired conditions and
images, the FusedGAN can effectively leverage more abundant images without
corresponding conditions in training, to produce more diverse samples with high
fidelity. We achieve this by fusing two generators: one for unconditional image
generation, and the other for conditional image generation, where the two
partly share a common latent space, thereby disentangling the generation. We
demonstrate the efficacy of the FusedGAN in fine-grained image generation tasks
such as text-to-image and attribute-to-face generation.
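
The fusion of two generators described above can be illustrated with a toy sketch. It is not the FusedGAN architecture: the random linear maps, the dimensions, and every name (`shared_stage`, `generate_unconditional`, `generate_conditional`) are assumptions made for illustration. It only shows how an unconditional and a conditional generator might consume the same intermediate representation, so that unpaired images could in principle train the shared part.

```python
import random

# Hypothetical toy dimensions, chosen only for illustration.
Z_DIM, H_DIM, C_DIM, OUT_DIM = 4, 6, 3, 8

def rand_matrix(rows, cols, rng):
    """Random weight matrix standing in for a learned layer."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def linear(weights, x):
    """Plain matrix-vector product."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

rng = random.Random(0)
W_shared = rand_matrix(H_DIM, Z_DIM, rng)          # shared first stage
W_uncond = rand_matrix(OUT_DIM, H_DIM, rng)        # unconditional head
W_cond = rand_matrix(OUT_DIM, H_DIM + C_DIM, rng)  # conditional head

def shared_stage(z):
    """Map the latent code to an intermediate representation used by both heads."""
    return linear(W_shared, z)

def generate_unconditional(z):
    """Unconditional branch: needs no condition, so unpaired images could supervise it."""
    return linear(W_uncond, shared_stage(z))

def generate_conditional(z, condition):
    """Conditional branch: same intermediate representation, plus the condition vector."""
    return linear(W_cond, shared_stage(z) + list(condition))

z = [rng.gauss(0.0, 1.0) for _ in range(Z_DIM)]
sample_a = generate_conditional(z, [1.0, 0.0, 0.0])
sample_b = generate_conditional(z, [0.0, 1.0, 0.0])
```

Holding `z` fixed while changing the condition keeps the shared representation identical and varies only the conditional part of the output, which is the kind of controllable sampling the abstract refers to.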
Learning with Unavailable Data: Generalized and Open Zero-Shot Learning
The field of visual object recognition has seen significant progress in recent years thanks to the availability of large-scale annotated datasets. However, labelling a large amount of data is difficult and costly, and can be simply infeasible for some classes because instances follow a long-tailed distribution.
Zero-Shot Learning (ZSL) is a framework that considers the case in which no labeled training examples are available for some of the classes. To solve the problem, a multi-modal source of information, the class (semantic) embeddings, is exploited to extract knowledge from the available classes (the seen classes) and to recognize novel categories for which the class embeddings are the only information available (the unseen classes).
To directly target the extreme imbalance in the data, in this thesis we first propose a methodology to improve synthetic data generation for the unseen classes through their class embeddings. Second, we propose to generalize the Zero-Shot Learning framework towards a more competitive, real-world-oriented scenario. Thus, we formalize the problem of Open Zero-Shot Learning as the problem of recognizing seen and unseen classes, as in ZSL, while also rejecting instances from unknown categories, for which neither visual data nor class embeddings are provided. Finally, we propose methodologies to generate not only the unseen categories but also the unknown ones.
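
The Open Zero-Shot Learning decision rule described above (assign a seen or unseen class, or reject as unknown) can be sketched as follows. This is a minimal illustration, not the thesis's method: it assumes visual features have already been mapped into the same space as the class embeddings, and the cosine-similarity scoring, the `threshold` value, and the toy embeddings are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def open_zsl_predict(feature, class_embeddings, threshold=0.8):
    """Return the best-matching class label, or "unknown" if no class is similar enough.

    class_embeddings holds both seen and unseen classes; unknown categories
    have no embedding at all, so they can only be handled by rejection.
    The threshold is an assumed hyperparameter.
    """
    best_label, best_score = None, -1.0
    for label, embedding in class_embeddings.items():
        score = cosine(feature, embedding)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= threshold else "unknown"

# Toy embeddings (hypothetical): one seen class, one unseen class.
class_embeddings = {
    "horse": [1.0, 0.0, 0.0],  # seen: labeled images were available in training
    "zebra": [0.0, 1.0, 0.0],  # unseen: only the class embedding is available
}

seen_pred = open_zsl_predict([0.9, 0.1, 0.0], class_embeddings)
unseen_pred = open_zsl_predict([0.1, 0.95, 0.0], class_embeddings)
unknown_pred = open_zsl_predict([0.0, 0.0, 1.0], class_embeddings)
```

A fixed similarity threshold is the simplest possible rejection mechanism; the abstract's proposal to generate samples for unknown categories suggests a learned rejection model instead, which this sketch does not attempt.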