Enhancing Perceptual Attributes with Bayesian Style Generation

Authors: A Sartori, J Liao, P Rossky, Radford Neal, WK Hastings
Publication date: 03/12/2018

Deep learning has brought unprecedented progress in computer vision, and significant advances have been made in predicting subjective properties inherent to visual data (e.g., memorability, aesthetic quality, evoked emotions). Recently, some research works have even proposed deep learning approaches that modify images so as to alter these properties appropriately. Following this research line, this paper introduces a novel deep learning framework for synthesizing images in order to enhance a predefined perceptual attribute. Our approach takes a natural image as input and exploits recent models for deep style transfer and generative adversarial networks to change its style in order to modify a specific high-level attribute. Unlike previous works, which focus on enhancing one specific property of visual content, we propose a general framework and demonstrate its effectiveness in two use cases, i.e. increasing image memorability and generating scary pictures. We evaluate the proposed approach on publicly available benchmarks, demonstrating its advantages over state-of-the-art methods.
Comment: ACCV-201

arXiv.org e-Print Archive

Crossref

Archivio della ricerca - Fondazione Bruno Kessler

f-VAEGAN-D2: A Feature Generating Framework for Any-Shot Learning

Authors: Akata Zeynep, Schiele Bernt, Sharma Saurabh, Xian Yongqin
Publication date: 01/01/2019

When labeled training data is scarce, a promising data augmentation approach is to generate visual features of unknown classes using their attributes. To learn the class conditional distribution of CNN features, these models rely on pairs of image features and class attributes. Hence, they cannot make use of the abundance of unlabeled data samples. In this paper, we tackle any-shot learning problems, i.e. zero-shot and few-shot, in a unified feature generating framework that operates in both inductive and transductive learning settings. We develop a conditional generative model that combines the strengths of VAEs and GANs and, in addition, learns the marginal feature distribution of unlabeled images via an unconditional discriminator. We empirically show that our model learns highly discriminative CNN features for five datasets, i.e. CUB, SUN, AWA and ImageNet, and establish a new state of the art in any-shot learning, i.e. inductive and transductive (generalized) zero- and few-shot learning settings. We also demonstrate that our learned features are interpretable: we visualize them by inverting them back to the pixel space, and we explain them by generating textual arguments for why they are associated with a certain label.
Comment: Accepted at CVPR 201

arXiv.org e-Print Archive

Crossref

MPG.PuRe

International Migration, Integration and Social Cohesion online publications

Generative Adversarial Text to Image Synthesis

Authors: Akata Zeynep, Lee Honglak, Logeswaran Lajanugen, Reed Scott, Schiele Bernt, Yan Xinchen
Publication date: 01/01/2016

Automatic synthesis of realistic images from text would be interesting and useful, but current AI systems are still far from this goal. However, in recent years generic and powerful recurrent neural network architectures have been developed to learn discriminative text feature representations. Meanwhile, deep convolutional generative adversarial networks (GANs) have begun to generate highly compelling images of specific categories, such as faces, album covers, and room interiors. In this work, we develop a novel deep architecture and GAN formulation to effectively bridge these advances in text and image modeling, translating visual concepts from characters to pixels. We demonstrate the capability of our model to generate plausible images of birds and flowers from detailed text descriptions.
Comment: ICML 201

arXiv.org e-Print Archive

CISPA – Helmholtz-Zentrum für Informationssicherheit

MPG.PuRe

Manipulating Attributes of Natural Scenes via Hallucination

Authors: Akata Zeynep, Erdem Aykut, Erdem Erkut, Karacan Levent
Publication date: 01/01/2018

In this study, we explore building a two-stage framework that enables users to directly manipulate high-level attributes of a natural scene. The key to our approach is a deep generative network which can hallucinate images of a scene as if they were taken in a different season (e.g. during winter), weather condition (e.g. on a cloudy day) or time of day (e.g. at sunset). Once the scene is hallucinated with the given attributes, the corresponding look is transferred to the input image while keeping the semantic details intact, giving a photo-realistic manipulation result. As the proposed framework hallucinates what the scene will look like, it does not require a reference style image, as most appearance or style transfer approaches do. Moreover, it allows a given scene to be manipulated simultaneously according to a diverse set of transient attributes within a single model, eliminating the need to train a separate network for each translation task. Our comprehensive set of qualitative and quantitative results demonstrates the effectiveness of our approach against the competing methods.
Comment: Accepted for publication in ACM Transactions on Graphic

arXiv.org e-Print Archive

MPG.PuRe

Fader Networks: Manipulating Images by Sliding Attributes

Authors: Bordes Antoine, Denoyer Ludovic, Lample Guillaume, Ranzato Marc'Aurelio, Usunier Nicolas, Zeghidour Neil
Publication date: 04/12/2017

This paper introduces a new encoder-decoder architecture that is trained to reconstruct images by disentangling the salient information of the image and the values of attributes directly in the latent space. As a result, after training, our model can generate different realistic versions of an input image by varying the attribute values. By using continuous attribute values, we can choose how much a specific attribute is perceivable in the generated image. This property could allow for applications where users can modify an image using sliding knobs, like faders on a mixing console, to change the facial expression of a portrait, or to update the color of some objects. Compared to the state of the art, which mostly relies on training adversarial networks in pixel space by altering attribute values at train time, our approach results in much simpler training schemes and scales nicely to multiple attributes. We present evidence that our model can significantly change the perceived value of the attributes while preserving the naturalness of images.
Comment: NIPS 201

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server