Seed, Expand and Constrain: Three Principles for Weakly-Supervised Image Segmentation
We introduce a new loss function for the weakly-supervised training of
semantic image segmentation models based on three guiding principles: to seed
with weak localization cues, to expand objects based on the information about
which classes can occur in an image, and to constrain the segmentations to
coincide with object boundaries. We show experimentally that training a deep
convolutional neural network using the proposed loss function leads to
substantially better segmentations than previous state-of-the-art methods on
the challenging PASCAL VOC 2012 dataset. We furthermore give insight into the
working mechanism of our method by a detailed experimental study that
illustrates how the segmentation quality is affected by each term of the
proposed loss function as well as their combinations.
Comment: ECCV 201
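As a rough illustration of the first principle, the sketch below computes a seeding-style loss: the mean negative log-probability that the network assigns to the cued class at each seeded pixel. The abstract does not state the loss in closed form, so the function name `seeding_loss`, the dictionary-based seed format, and the toy values are assumptions made for illustration, not the authors' implementation.

```python
import math


def seeding_loss(pixel_probs, seeds, eps=1e-12):
    """Mean negative log-probability of the cued class over all seeded pixels.

    pixel_probs: list of per-pixel class-probability lists (hypothetical format).
    seeds: dict mapping pixel index -> class index given by a weak localization cue.
    """
    if not seeds:
        # No localization cues for this image: the term contributes nothing.
        return 0.0
    total = sum(-math.log(pixel_probs[u][c] + eps) for u, c in seeds.items())
    return total / len(seeds)


# Toy image with 3 pixels and 3 classes (made-up values).
pixel_probs = [
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.7, 0.1],
]
# Pixel 0 is cued as class 0 and pixel 2 as class 1; pixel 1 has no cue.
seeds = {0: 0, 2: 1}

print(seeding_loss(pixel_probs, seeds))
```

Only the seeded pixels enter the loss, so unlabeled pixels are left to the other two terms (expansion and boundary constraint), which this sketch does not cover.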
Image to Image Translation for Domain Adaptation
We propose a general framework for unsupervised domain adaptation, which
allows deep neural networks trained on a source domain to be tested on a
different target domain without requiring any training annotations in the
target domain. This is achieved by adding extra networks and losses that help
regularize the features extracted by the backbone encoder network. To this end
we propose the novel use of the recently proposed unpaired image-to-image
translation framework to constrain the features extracted by the encoder
network. Specifically, we require that the features extracted are able to
reconstruct the images in both domains. In addition, we require that the
distributions of features extracted from images in the two domains are
indistinguishable. Many recent works can be seen as specific cases of our
general framework. We apply our method to domain adaptation between the MNIST,
USPS, and SVHN datasets, and between the Amazon, Webcam, and DSLR Office
datasets, in classification tasks, and also between the GTA5 and Cityscapes
datasets for a segmentation task. We demonstrate state-of-the-art performance
on each of these datasets.
ROAD: Reality Oriented Adaptation for Semantic Segmentation of Urban Scenes
Exploiting synthetic data to learn deep models has attracted increasing
attention in recent years. However, the intrinsic domain difference between
synthetic and real images usually causes a significant performance drop when
applying the learned model to real world scenarios. This is mainly due to two
reasons: 1) the model overfits to synthetic images, leaving the convolutional
filters unable to extract informative representations from real images; 2)
there is a distribution difference between synthetic and real data, which is
also known as the domain adaptation problem. To this end, we propose a new
reality oriented adaptation approach for urban scene semantic segmentation by
learning from synthetic data. First, we propose a target guided distillation
approach to learn the real image style, which is achieved by training the
segmentation model to imitate a pretrained real style model using real images.
Second, we further take advantage of the intrinsic spatial structure present
in urban scene images, and propose a spatial-aware adaptation scheme to
effectively align the distributions of the two domains. These two modules can be
readily integrated with existing state-of-the-art semantic segmentation
networks to improve their generalizability when adapting from synthetic to real
urban scenes. We evaluate the proposed method on the Cityscapes dataset by
adapting from the GTAV and SYNTHIA datasets, where the results demonstrate the
effectiveness of our method.
Comment: Add experiments on SYNTHIA, CVPR 2018 camera-ready versio
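The target guided distillation idea in the abstract above (training the segmentation model to imitate a pretrained real-style model on real images) could be sketched as a feature-matching loss. The abstract does not say which distance is used; the mean squared distance below, the function name `distillation_loss`, and the flat feature vectors are assumptions chosen for illustration only.

```python
def distillation_loss(student_features, teacher_features):
    """Mean squared distance between student and teacher features.

    Both arguments are flat lists of floats computed on the same real image
    (hypothetical format). The teacher is assumed to be frozen.
    """
    if len(student_features) != len(teacher_features):
        raise ValueError("feature vectors must have the same length")
    squared = [(s - t) ** 2 for s, t in zip(student_features, teacher_features)]
    return sum(squared) / len(squared)


# Made-up feature vectors for one real image.
teacher = [1.0, 4.0]
student = [1.0, 2.0]

print(distillation_loss(student, teacher))
```

Minimizing this quantity on real images pulls the segmentation model's features toward those of the real-style model, which is the imitation behaviour the abstract describes; the spatial-aware adaptation module is not covered here.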
Curriculum Domain Adaptation for Semantic Segmentation of Urban Scenes
During the last half decade, convolutional neural networks (CNNs) have
triumphed over semantic segmentation, which is one of the core tasks in many
applications such as autonomous driving. However, training CNNs requires a
considerable amount of data, which is difficult to collect and laborious to
annotate. Recent advances in computer graphics make it possible to train CNNs
on photo-realistic synthetic imagery with computer-generated annotations.
Despite this, the domain mismatch between the real images and the synthetic
data cripples the models' performance. Hence, we propose a curriculum-style
learning approach to minimize the domain gap in urban scenery semantic
segmentation. The curriculum domain adaptation solves easy tasks first to infer
necessary properties about the target domain; in particular, the first task is
to learn global label distributions over images and local distributions over
landmark superpixels. These are easy to estimate because images of urban scenes
have strong idiosyncrasies (e.g., the size and spatial relations of buildings,
streets, cars, etc.). We then train a segmentation network while regularizing
its predictions in the target domain to follow those inferred properties. In
experiments, our method outperforms the baselines on two datasets and two
backbone networks. We also report extensive ablation studies about our
approach.
Comment: This is the extended version of the ICCV 2017 paper "Curriculum
Domain Adaptation for Semantic Segmentation of Urban Scenes" with additional
GTA experimen
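The regularization step described in the abstract above (making target-domain predictions follow an inferred global label distribution) could be sketched as a cross-entropy between the inferred distribution and the average of the per-pixel predictions. This is one plausible reading rather than the authors' code: the function names, the list-based data format, and the toy values are all assumptions.

```python
import math


def predicted_label_distribution(pixel_probs):
    """Average per-pixel class probabilities into one image-level distribution."""
    num_pixels = len(pixel_probs)
    num_classes = len(pixel_probs[0])
    return [sum(p[c] for p in pixel_probs) / num_pixels for c in range(num_classes)]


def label_distribution_loss(target_dist, pixel_probs, eps=1e-12):
    """Cross-entropy between the inferred distribution and the predicted one."""
    predicted = predicted_label_distribution(pixel_probs)
    return -sum(t * math.log(q + eps) for t, q in zip(target_dist, predicted))


# Toy target-domain image with 4 pixels and 3 classes (made-up values).
pixel_probs = [
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.4, 0.4, 0.2],
]
# Hypothetical label distribution inferred by the "easy" task for this image.
inferred = [0.5, 0.375, 0.125]

print(label_distribution_loss(inferred, pixel_probs))
```

For a fixed inferred distribution, this loss is smallest when the averaged prediction matches it, so it needs no pixel-level labels in the target domain. The local distributions over landmark superpixels mentioned in the abstract could be handled the same way by averaging over the pixels of each superpixel instead of the whole image.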