Generative Adversarial Text to Image Synthesis
Automatic synthesis of realistic images from text would be interesting and
useful, but current AI systems are still far from this goal. However, in recent
years generic and powerful recurrent neural network architectures have been
developed to learn discriminative text feature representations. Meanwhile, deep
convolutional generative adversarial networks (GANs) have begun to generate
highly compelling images of specific categories, such as faces, album covers,
and room interiors. In this work, we develop a novel deep architecture and GAN
formulation to effectively bridge these advances in text and image modeling,
translating visual concepts from characters to pixels. We demonstrate the
capability of our model to generate plausible images of birds and flowers from
detailed text descriptions.
Comment: ICML 201
Probabilistic Label Relation Graphs with Ising Models
We consider classification problems in which the label space has structure. A
common example is hierarchical label spaces, corresponding to the case where
one label subsumes another (e.g., animal subsumes dog). But labels can also be
mutually exclusive (e.g., dog vs cat) or unrelated (e.g., furry, carnivore). To
jointly model hierarchy and exclusion relations, the notion of a HEX (hierarchy
and exclusion) graph was introduced in [7]. This combined a conditional random
field (CRF) with a deep neural network (DNN), achieving state-of-the-art
results on visual object classification problems where the
training labels were drawn from different levels of the ImageNet hierarchy
(e.g., an image might be labeled with the basic level category "dog", rather
than the more specific label "husky"). In this paper, we extend the HEX model
to allow for soft or probabilistic relations between labels, which is useful
when there is uncertainty about the relationship between two labels (e.g., an
antelope is "sort of" furry, but not to the same degree as a grizzly bear). We
call our new model pHEX, for probabilistic HEX. We show that the pHEX graph can
be converted to an Ising model, which allows us to use existing off-the-shelf
inference methods (in contrast to the HEX method, which needed specialized
inference algorithms). Experimental results show significant improvements in a
number of large-scale visual object classification tasks, outperforming the
previous HEX model.
Comment: International Conference on Computer Vision (2015
- …
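The hierarchy and exclusion relations described in the pHEX abstract above can be illustrated with a small sketch. It is not the paper's formulation: the abstract does not give the Ising conversion, so the label set, the `penalty` parameter, and the functions `violations`, `legal_assignments`, and `soft_distribution` are all hypothetical names chosen for illustration. Hard constraints keep only assignments with zero violations; the soft variant instead charges a finite penalty per violated relation, which is one simple way to express "probabilistic" relations as pairwise terms over binary labels.

```python
from itertools import product
import math

# Toy label set and relations (assumptions for illustration only).
labels = ["animal", "dog", "cat", "furry"]
hierarchy = [("animal", "dog"), ("animal", "cat")]  # parent subsumes child
exclusion = [("dog", "cat")]                        # mutually exclusive pair


def violations(assign):
    """Count how many hierarchy or exclusion relations an assignment breaks."""
    count = 0
    for parent, child in hierarchy:
        # A child label may only be on if its parent is on.
        if assign[child] and not assign[parent]:
            count += 1
    for a, b in exclusion:
        # Exclusive labels may not both be on.
        if assign[a] and assign[b]:
            count += 1
    return count


def all_assignments():
    """Enumerate every binary assignment over the label set."""
    for bits in product([0, 1], repeat=len(labels)):
        yield dict(zip(labels, bits))


def legal_assignments():
    """Hard constraints: keep only assignments with no violated relation."""
    return [a for a in all_assignments() if violations(a) == 0]


def soft_distribution(scores, penalty):
    """Soft constraints: weight = exp(sum of active scores - penalty * violations).

    `scores` maps each label to a real-valued score (e.g. from a classifier);
    `penalty` is a hypothetical strength shared by all relations.
    """
    states = list(all_assignments())
    weights = []
    for a in states:
        unary = sum(scores[name] for name in labels if a[name])
        weights.append(math.exp(unary - penalty * violations(a)))
    z = sum(weights)
    return [(a, w / z) for a, w in zip(states, weights)]


if __name__ == "__main__":
    print(len(legal_assignments()), "legal assignments under hard constraints")
    uniform = {name: 0.0 for name in labels}
    for state, prob in soft_distribution(uniform, penalty=2.0):
        print(state, round(prob, 4))
```

With a very large `penalty` the soft distribution concentrates on the legal assignments, recovering the hard-constraint behaviour; smaller values let a relation be violated at a cost, in the spirit of a label being "sort of" related to another. Brute-force enumeration is only feasible for a handful of labels.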