Multi-View Data Generation Without View Supervision
The development of high-dimensional generative models has recently attracted a
surge of interest with the introduction of variational auto-encoders and
generative adversarial networks. Several variants have been proposed in which
the underlying latent space is structured, for example according to attributes
describing the data to be generated. We focus on the particular problem of
generating samples that depict a number of objects under various views. We
assume that the distribution of the data is driven by two
independent latent factors: the content, which represents the intrinsic
features of an object, and the view, which stands for the settings of a
particular observation of that object. We therefore propose a generative model,
and a conditional variant, built on such a disentangled latent space. This
approach lets us generate realistic samples of various objects under a wide
variety of views. Unlike many multi-view approaches, our model requires no
supervision on the views, only on the content.
Compared to other conditional generation approaches, which are mostly based on
binary or categorical attributes, we make no such assumption about the factors
of variation. Our model can be used on problems with a huge, potentially
infinite, number of categories. We evaluate it on four image datasets, on which
we demonstrate the effectiveness of the model and its ability to generalize.
Comment: Published as a conference paper at ICLR 2018
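The model described in this abstract conditions generation on two independent latent codes. As a rough, hypothetical illustration of that idea (a sketch, not the authors' actual architecture), a minimal PyTorch-style generator might look as follows; all module names, dimensions, and layer choices here are assumptions:

```python
import torch
import torch.nn as nn

class DisentangledGenerator(nn.Module):
    """Maps (content, view) latent codes to a 32x32 image (illustrative sizes)."""
    def __init__(self, content_dim=128, view_dim=32, img_channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(content_dim + view_dim, 256 * 4 * 4),
            nn.Unflatten(1, (256, 4, 4)),
            nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1), nn.ReLU(),  # 8x8
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),   # 16x16
            nn.ConvTranspose2d(64, img_channels, 4, stride=2, padding=1),     # 32x32
            nn.Tanh(),
        )

    def forward(self, content, view):
        # The two factors stay independent by construction: fixing `content`
        # and resampling `view` renders the same object under different views.
        return self.net(torch.cat([content, view], dim=1))

# One object (a single fixed content code) under eight sampled views:
g = DisentangledGenerator()
content = torch.randn(1, 128).expand(8, -1)  # repeat one content code
views = torch.randn(8, 32)                   # eight independent view codes
images = g(content, views)                   # -> (8, 3, 32, 32)
```

Because the view code is just a sampled vector rather than a labeled attribute, a construction like this needs no view supervision and places no cap on the number of object categories, which matches the claim in the abstract.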
Adversarial Disentanglement with Grouped Observations
We consider the disentanglement of the representations of the relevant
attributes of the data (content) from all other factors of variation (style)
using Variational Autoencoders. Some recent works address this problem using
grouped observations, where the content attributes are assumed to be common
within each group while no supervised information is available on the style
factors. In many cases, however, these methods fail to prevent the models
from using the style variables to encode content-related features as well. This
work supplements these algorithms with a method that eliminates the content
information from the style representations. To this end, the training objective
is augmented to minimize an appropriately defined mutual information term in an
adversarial way. Experimental results and comparisons on image
datasets show that the resulting method can efficiently separate the content-
and style-related attributes and generalizes to unseen data.
Comment: Accepted at the 34th AAAI Conference on Artificial Intelligence
(AAAI-20)
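To make the adversarial mutual-information term concrete: one common way to realize such a term (a sketch under that assumption, not necessarily the paper's exact formulation) is to train an auxiliary classifier to predict the group label from the style code while training the encoder to defeat it. A minimal PyTorch-style illustration, with all names and sizes hypothetical:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

style_dim, n_groups = 16, 10  # illustrative sizes

# Style branch of the encoder and the adversarial group classifier.
style_encoder = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, style_dim))
adversary = nn.Sequential(nn.Linear(style_dim, 64), nn.ReLU(), nn.Linear(64, n_groups))

opt_enc = torch.optim.Adam(style_encoder.parameters(), lr=1e-4)
opt_adv = torch.optim.Adam(adversary.parameters(), lr=1e-4)

def adversarial_step(x, group_labels, lam=1.0):
    # 1) Train the adversary to recover the group (content) label
    #    from the style code alone.
    s = style_encoder(x).detach()  # stop gradients into the encoder
    adv_loss = F.cross_entropy(adversary(s), group_labels)
    opt_adv.zero_grad()
    adv_loss.backward()
    opt_adv.step()

    # 2) Train the encoder to make the style code uninformative about
    #    the group, i.e. maximize the adversary's loss. In the paper's
    #    setting this term would be added to the usual VAE objective.
    s = style_encoder(x)
    enc_loss = -lam * F.cross_entropy(adversary(s), group_labels)
    opt_enc.zero_grad()
    enc_loss.backward()
    opt_enc.step()
    return adv_loss.item()

# Example usage with random tensors standing in for grouped observations:
x = torch.randn(64, 784)
y = torch.randint(0, n_groups, (64,))
adversarial_step(x, y)
```

If the adversary cannot beat chance at predicting the group from the style code, the style representation carries (approximately) no content information, which is the separation effect the abstract reports.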