Learning Dilation Factors for Semantic Segmentation of Street Scenes

Authors: DE Rumelhart, GJ Brostow, M Everingham, SR Richter, T-Y Lin
Publication date: 01/01/2017

Contextual information is crucial for semantic segmentation. However, finding the optimal trade-off between preserving fine details and providing sufficiently large receptive fields is nontrivial, all the more so when the objects or classes present in an image vary significantly in size. Dilated convolutions have proven valuable for semantic segmentation because they enlarge the receptive field without sacrificing image resolution. However, in current state-of-the-art methods, dilation parameters are hand-tuned and fixed. In this paper, we present an approach for learning dilation parameters adaptively per channel, consistently improving semantic segmentation results on street-scene datasets such as Cityscapes and CamVid.
Comment: GCPR201
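The abstract's point that dilation enlarges the receptive field without reducing resolution can be checked with a little arithmetic. The sketch below is only an illustration under stated assumptions: stride-1 layers, integer dilation factors fixed by hand, and hypothetical helper names (`effective_kernel_size`, `receptive_field`) that do not come from the paper. It does not implement the learned per-channel dilation the paper proposes.

```python
from typing import Iterable, Tuple


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span, in input pixels, covered by one dilated kernel along one axis."""
    return dilation * (kernel_size - 1) + 1


def receptive_field(layers: Iterable[Tuple[int, int]]) -> int:
    """Receptive field of a stack of stride-1 convolutions along one axis.

    Each layer is a (kernel_size, dilation) pair. With stride 1, every layer
    widens the receptive field by (kernel_size - 1) * dilation pixels.
    """
    rf = 1
    for kernel_size, dilation in layers:
        rf += (kernel_size - 1) * dilation
    return rf


# Three 3x3 layers, first without dilation, then with dilations 1, 2, 4.
plain = [(3, 1), (3, 1), (3, 1)]
dilated = [(3, 1), (3, 2), (3, 4)]

print(receptive_field(plain))    # 7
print(receptive_field(dilated))  # 15
```

Both stacks have the same number of weights and, given suitable padding, the same output resolution; only the dilated stack sees the wider context. Which dilation suits which object scale is the quantity the paper proposes to learn rather than hand-tune.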

arXiv.org e-Print Archive

Crossref

CISPA – Helmholtz-Zentrum für Informationssicherheit

MAnnheim DOCument Server

MPG.PuRe

Manipulating Attributes of Natural Scenes via Hallucination

Authors: Akata Zeynep, Erdem Aykut, Erdem Erkut, Karacan Levent
Publication date: 01/01/2018

In this study, we explore building a two-stage framework that enables users to directly manipulate high-level attributes of a natural scene. The key to our approach is a deep generative network that can hallucinate images of a scene as if they were taken in a different season (e.g. during winter), weather condition (e.g. on a cloudy day) or time of day (e.g. at sunset). Once the scene is hallucinated with the given attributes, the corresponding look is transferred to the input image while keeping the semantic details intact, giving a photo-realistic manipulation result. Because the proposed framework hallucinates what the scene will look like, it does not require a reference style image, as most appearance or style transfer approaches do. Moreover, it allows a given scene to be manipulated simultaneously according to a diverse set of transient attributes within a single model, eliminating the need to train a separate network for each translation task. Our comprehensive set of qualitative and quantitative results demonstrates the effectiveness of our approach against competing methods.
Comment: Accepted for publication in ACM Transactions on Graphic

arXiv.org e-Print Archive

MPG.PuRe

Dual-Domain Image Synthesis using Segmentation-Guided GAN

Authors: Bazazian Dena, Calway Andrew, Damen Dima
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 19/04/2022

We introduce a segmentation-guided approach to synthesise images that integrate features from two distinct domains. Images synthesised by our dual-domain model belong to one domain within the semantic mask and to another in the rest of the image, with the two smoothly integrated. We build on the successes of few-shot StyleGAN and single-shot semantic segmentation to minimise the amount of training required to use two domains. The method combines a few-shot cross-domain StyleGAN with a latent optimiser to produce images containing features of two distinct domains. We use a segmentation-guided perceptual loss, which compares both pixel-level values and activations between domain-specific and dual-domain synthetic images. Results demonstrate qualitatively and quantitatively that our model is capable of synthesising dual-domain images on a variety of objects (faces, horses, cats, cars), domains (natural, caricature, sketches) and part-based masks (eyes, nose, mouth, hair, car bonnet). The code is publicly available at: https://github.com/denabazazian/Dual-Domain-Synthesis.
Comment: CVPR2022 Workshops. 14 pages, 19 figure

arXiv.org e-Print Archive

Explore Bristol Research

Learning Object Categories From Internet Image Searches

Authors: Fergus Rob, Li Fei-Fei, Perona Pietro, Zisserman Andrew
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2010

In this paper, we describe a simple approach to learning models of visual object categories from images gathered from Internet image search engines. The images for a given keyword are typically highly variable, with a large fraction being unrelated to the query term, and thus pose a challenging environment from which to learn. By training our models directly from Internet images, we remove the need to laboriously compile training data sets, as required by most other recognition approaches; this opens up the possibility of learning object category models “on-the-fly.” We describe two simple approaches, derived from the probabilistic latent semantic analysis (pLSA) technique for text document analysis, that can be used to automatically learn object models from these data. We show two applications of the learned model: first, to rerank the images returned by the search engine, thus improving the quality of the search engine; and second, to recognize objects in other image data sets.

Caltech Authors

Oxford University Research Archive

Dual-Domain Image Synthesis using Segmentation-Guided GAN

Authors: Bazazian Dena, Calway Andrew, Damen Dima
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 23/08/2022

Explore Bristol Research