Incremental Adversarial Domain Adaptation for Continually Changing Environments
Continuous appearance shifts such as changes in weather and lighting
conditions can impact the performance of deployed machine learning models.
While unsupervised domain adaptation aims to address this challenge, current
approaches do not utilise the continuity of the occurring shifts. In
particular, many robotics applications exhibit these conditions and thus
offer the opportunity to incrementally adapt a learnt model over minor
shifts which accumulate into large differences over time. Our work presents an
adversarial approach for lifelong, incremental domain adaptation which benefits
from unsupervised alignment to a series of intermediate domains which
successively diverge from the labelled source domain. We empirically
demonstrate that our incremental approach improves handling of large appearance
changes, e.g. day to night, on a traversable-path segmentation task compared
with a direct, single-step alignment approach. Furthermore, by approximating
the source-domain feature distribution with a generative adversarial network,
the deployment module no longer needs to retain potentially large amounts of
source training data, at the cost of only a minor reduction in performance.
Comment: International Conference on Robotics and Automation 2018
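As a rough illustration of the alignment step described above, the following
PyTorch-style sketch adapts a feature encoder to one new unlabelled domain
against a domain discriminator, with a sampler (e.g. the generative model
mentioned above) standing in for retained source features; all interfaces and
the schedule are assumptions, not the authors' code.

    import torch
    import torch.nn.functional as F

    def adapt_to_new_domain(encoder, discriminator, source_sampler,
                            target_loader, opt_enc, opt_dis):
        # Align features of one new, unlabelled domain with the source
        # distribution; source_sampler() approximates source features,
        # so no source training data needs to be retained.
        for images in target_loader:
            # Discriminator step: source-like features -> 1, target -> 0.
            src = source_sampler().detach()
            tgt = encoder(images).detach()
            d_src, d_tgt = discriminator(src), discriminator(tgt)
            d_loss = (F.binary_cross_entropy_with_logits(
                          d_src, torch.ones_like(d_src))
                      + F.binary_cross_entropy_with_logits(
                          d_tgt, torch.zeros_like(d_tgt)))
            opt_dis.zero_grad(); d_loss.backward(); opt_dis.step()

            # Encoder step: make target features look source-like.
            logits = discriminator(encoder(images))
            g_loss = F.binary_cross_entropy_with_logits(
                logits, torch.ones_like(logits))
            opt_enc.zero_grad(); g_loss.backward(); opt_enc.step()

Run once per intermediate domain, each small alignment compounds into
adaptation across a large overall shift such as day to night.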
Boosting Deep Open World Recognition by Clustering
While convolutional neural networks have brought significant advances in
robot vision, their ability is often limited to closed world scenarios, where
the number of semantic concepts to be recognized is determined by the available
training set. Since it is practically impossible to capture all possible
semantic concepts present in the real world in a single training set, we need
to break the closed world assumption, equipping our robot with the capability
to act in an open world. To provide such ability, a robot vision system should
be able to (i) identify whether an instance does not belong to the set of known
categories (i.e. open set recognition), and (ii) extend its knowledge to learn
new classes over time (i.e. incremental learning). In this work, we show how we
can boost the performance of deep open world recognition algorithms by means of
a new loss formulation enforcing a global to local clustering of class-specific
features. In particular, the first loss term, global clustering, forces the
network to map samples closer to the centroid of the class they belong to,
while the second, local clustering, shapes the representation space so that
samples of the same class move closer together while neighbours belonging to
other classes are pushed away. Moreover, we propose a strategy to learn
class-specific rejection thresholds, instead of heuristically estimating a
single global threshold as in previous works. Experiments on the RGB-D Object
and CORe50 datasets show the effectiveness of our approach.
Comment: IROS/RAL 2020
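To make the two loss terms concrete, here is a hedged PyTorch sketch of a
global pull towards class centroids and a neighbourhood-based local term; the
paper's exact formulations may differ, and all names are illustrative.

    import torch
    import torch.nn.functional as F

    def global_clustering_loss(feats, labels, centroids):
        # Pull each sample towards the centroid of its own class.
        return ((feats - centroids[labels]) ** 2).sum(dim=1).mean()

    def local_clustering_loss(feats, labels, k=5):
        # Among each sample's k nearest neighbours in the batch, put
        # probability mass on same-class entries and push away the rest.
        dists = torch.cdist(feats, feats)           # pairwise distances
        dists.fill_diagonal_(float('inf'))          # ignore self-matches
        knn = dists.topk(k, largest=False).indices  # (N, k) neighbour ids
        same = (labels[knn] == labels.unsqueeze(1)).float()
        log_p = F.log_softmax(-dists.gather(1, knn), dim=1)
        return -(same * log_p).sum(1).div(same.sum(1).clamp(min=1)).mean()

The first term tightens each class around its centroid globally, while the
second acts only on local neighbourhoods, matching the global-to-local
structure described above.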
Semantic Understanding of Foggy Scenes with Purely Synthetic Data
This work addresses the problem of semantic scene understanding under foggy
road conditions. Although marked progress has been made in semantic scene
understanding in recent years, it has mainly concentrated on clear-weather
outdoor scenes. Extending semantic segmentation methods to adverse weather
conditions like fog is crucially important for outdoor applications such as
self-driving cars. In this paper, we propose a novel method, which uses purely
synthetic data to improve the performance on unseen real-world foggy scenes
captured in the streets of Zurich and its surroundings. Our results highlight
the potential and power of photo-realistic synthetic images for training and
especially fine-tuning deep neural nets. Our contributions are threefold:
1) we create a purely synthetic, high-quality foggy dataset of 25,000 unique
outdoor scenes, which we call Foggy Synscapes and plan to release publicly;
2) we show that with this data we outperform previous approaches on real-world
foggy test data; 3) we show that combining our data with previously used data
can further improve performance on real-world foggy data.
Comment: independent class IoU scores corrected for the BiSeNet architecture
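The abstract leaves the fog synthesis itself implicit; work in this line
typically composites fog onto clean synthetic frames using their rendered
depth maps and the standard optical model. A minimal NumPy sketch, with
illustrative parameter values:

    import numpy as np

    def add_synthetic_fog(image, depth, beta=0.06, airlight=0.9):
        # Standard optical model for homogeneous fog:
        #   I(x) = R(x) * t(x) + L * (1 - t(x)),  t(x) = exp(-beta * d(x))
        # image: float RGB array in [0, 1]; depth: distance map in metres.
        # beta (density) and airlight (atmospheric light L) are assumptions.
        t = np.exp(-beta * depth)[..., np.newaxis]  # per-pixel transmittance
        return image * t + airlight * (1.0 - t)

Because fog only alters appearance, the clear scene's semantic labels carry
over unchanged to its foggy counterpart, which is what makes purely synthetic
foggy training data cheap to produce at this scale.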
DAugNet: Unsupervised, Multi-source, Multi-target, and Life-long Domain Adaptation for Semantic Segmentation of Satellite Images
Domain adaptation of satellite images has recently gained increasing
attention as a way to overcome the limited generalization abilities of machine
learning models when segmenting large-scale satellite images. Most existing
approaches seek to adapt the model from one domain to another. However, such a
single-source, single-target setting prevents these methods from scaling,
since multiple source and target domains with different data distributions are
usually available nowadays. Besides, the continuous proliferation of satellite
images requires classifiers to adapt to continuously growing data. We propose
a novel approach, coined DAugNet, for
unsupervised, multi-source, multi-target, and life-long domain adaptation of
satellite images. It consists of a classifier and a data augmentor. The data
augmentor, which is a shallow network, is able to perform style transfer
between multiple satellite images in an unsupervised manner, even when new
data are added over time. In each training iteration, it provides the
classifier with diversified data, which makes the classifier robust to large
data distribution differences between the domains. Our extensive experiments
show that DAugNet generalizes to new geographic locations significantly better
than existing approaches.
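A hedged sketch of one such training iteration, with the augmentor's
interface assumed rather than taken from the paper:

    import random
    import torch

    def daugnet_iteration(augmentor, classifier, batch, domain_ids,
                          criterion, optimizer):
        # The shallow augmentor restyles the batch to a randomly chosen
        # domain before the classifier sees it, so every iteration trains
        # the classifier on a differently styled version of the data.
        target_domain = random.choice(domain_ids)
        with torch.no_grad():  # the augmentor is trained separately
            styled = augmentor(batch['image'], target_domain)
        loss = criterion(classifier(styled), batch['mask'])
        optimizer.zero_grad(); loss.backward(); optimizer.step()
        return loss.item()

Adding a new source or target domain then amounts to extending domain_ids and
letting the augmentor pick up the new style, which is one way to read the
life-long claim above.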
On the Challenges of Open World Recognition under Shifting Visual Domains
Robotic visual systems operating in the wild must act in unconstrained
scenarios, under different environmental conditions while facing a variety of
semantic concepts, including unknown ones. To this end, recent works have
tried to empower visual object recognition methods with the capability to
i) detect unseen concepts and ii) extend their knowledge over time, as images
of new semantic classes arrive. This setting, called Open World Recognition
(OWR), aims to produce systems capable of breaking the semantic limits present
in the initial training set. However, the training set imposes on the system
not only its own semantic limits, but also environmental ones, due to its bias
toward certain acquisition conditions that do not necessarily reflect the high
variability of the real world. This discrepancy between training and test
distribution is called domain shift. This work investigates whether OWR
algorithms are effective under domain shift, presenting the first benchmark
setup for fairly assessing the performance of OWR algorithms with and without
domain shift. We then use this benchmark to conduct analyses in various
scenarios, showing that existing OWR algorithms indeed suffer severe
performance degradation when training and test distributions differ. Our
analysis shows that this degradation is only slightly mitigated by coupling
OWR with domain generalization techniques, indicating that merely plugging in
existing algorithms is not enough to recognize new and unknown categories in
unseen domains. Our results clearly point toward open issues and future
research directions that need to be investigated to build robot visual
systems able to function reliably under these challenging yet very real
conditions. Code available at
https://github.com/DarioFontanel/OWR-VisualDomains
Comment: RAL/ICRA 2021
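For concreteness, a common nearest-class-mean decision rule in OWR, as in the
clustering work above, assigns the closest class centroid and rejects as
unknown beyond a per-class threshold; a minimal PyTorch sketch with all
tensors assumed:

    import torch

    UNKNOWN = -1

    def owr_predict(feats, centroids, thresholds):
        # feats: (N, D) test features; centroids: (C, D) class means;
        # thresholds: (C,) learned per-class rejection distances.
        dists = torch.cdist(feats, centroids)   # (N, C) distances
        min_dist, cls = dists.min(dim=1)        # nearest class per sample
        # Reject as unknown when even the nearest class is too far away.
        return torch.where(min_dist <= thresholds[cls], cls,
                           torch.full_like(cls, UNKNOWN))

Under domain shift, both the features and the thresholds calibrated on the
training domain degrade, which is the failure mode the benchmark above
quantifies.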
Adversarial Discriminative Sim-to-real Transfer of Visuo-motor Policies
Various approaches have been proposed to learn visuo-motor policies for
real-world robotic applications. One solution is to first learn in simulation
and then transfer to the real world. During the transfer, most existing approaches
need real-world images with labels. However, the labelling process is often
expensive or even impractical in many robotic applications. In this paper, we
propose an adversarial discriminative sim-to-real transfer approach to reduce
the cost of labelling real data. The effectiveness of the approach is
demonstrated with modular networks in a table-top object reaching task where a
7 DoF arm is controlled in velocity mode to reach a blue cuboid in clutter
through visual observations. The adversarial transfer approach reduced the
labelled real data requirement by 50%. Policies can be transferred to real
environments with only 93 labelled and 186 unlabelled real images. The
transferred visuo-motor policies are robust to novel (not seen in training)
objects in clutter and even a moving target, achieving a 97.8% success rate and
1.8 cm control accuracy.
Comment: Under review for the International Journal of Robotics Research
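As an illustration of the adversarial discriminative transfer, the sketch
below adapts a real-image encoder towards a frozen simulation encoder while
also fitting the small labelled real set; the interfaces, loss choices and
weighting are assumptions, not the paper's exact modular-network setup.

    import torch
    import torch.nn.functional as F

    def sim_to_real_step(enc_real, enc_sim, disc, labelled, unlabelled,
                         opt_enc, opt_disc, sup_weight=1.0):
        # Discriminator step: simulation features -> 1, real features -> 0.
        with torch.no_grad():
            f_sim = enc_sim(unlabelled['sim'])
            f_real = enc_real(unlabelled['real'])
        d_sim, d_real = disc(f_sim), disc(f_real)
        d_loss = (F.binary_cross_entropy_with_logits(
                      d_sim, torch.ones_like(d_sim))
                  + F.binary_cross_entropy_with_logits(
                      d_real, torch.zeros_like(d_real)))
        opt_disc.zero_grad(); d_loss.backward(); opt_disc.step()

        # Encoder step: fool the discriminator on unlabelled real images
        # and regress perception targets on the few labelled real ones.
        logits = disc(enc_real(unlabelled['real']))
        adv = F.binary_cross_entropy_with_logits(
            logits, torch.ones_like(logits))
        sup = F.mse_loss(enc_real(labelled['image']), labelled['target'])
        loss = adv + sup_weight * sup
        opt_enc.zero_grad(); loss.backward(); opt_enc.step()

Here the unlabelled real images carry the adversarial signal, which is why
the labelled-data requirement can drop as reported above.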
Map-Guided Curriculum Domain Adaptation and Uncertainty-Aware Evaluation for Semantic Nighttime Image Segmentation
We address the problem of semantic nighttime image segmentation and improve
the state of the art by adapting daytime models to nighttime without using
nighttime annotations. Moreover, we design a new evaluation framework to
address the substantial uncertainty of semantics in nighttime images. Our
central contributions are: 1) a curriculum framework to gradually adapt
semantic segmentation models from day to night through progressively darker
times of day, exploiting cross-time-of-day correspondences between daytime
images from a reference map and dark images to guide the label inference in the
dark domains; 2) a novel uncertainty-aware annotation and evaluation framework
and metric for semantic segmentation, including image regions beyond human
recognition capability in the evaluation in a principled fashion; 3) the Dark
Zurich dataset, comprising 2416 unlabeled nighttime and 2920 unlabeled twilight
images with correspondences to their daytime counterparts plus a set of 201
nighttime images with fine pixel-level annotations created with our protocol,
which serves as a first benchmark for our novel evaluation. Experiments show
that our map-guided curriculum adaptation significantly outperforms
state-of-the-art methods on nighttime sets both for standard metrics and our
uncertainty-aware metric. Furthermore, our uncertainty-aware evaluation
reveals that selective invalidation of predictions can improve results on data
with ambiguous content such as our benchmark, and can benefit safety-oriented
applications involving invalid inputs.
Comment: IEEE T-PAMI 2020
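Schematically, the curriculum can be read as the loop below over
progressively darker domains; get_pseudo_labels and finetune are hypothetical
helpers standing in for the paper's guided label inference and adaptation
steps.

    def curriculum_adapt(model, stages, get_pseudo_labels, finetune):
        # stages: domains ordered light to dark, e.g. [twilight, night];
        # each stage is a list of (dark image, corresponding daytime image)
        # pairs obtained from the reference map.
        for stage in stages:
            pseudo = [get_pseudo_labels(model, dark, day)  # guided inference
                      for dark, day in stage]
            model = finetune(model, [dark for dark, _ in stage], pseudo)
        return model

The daytime partner constrains the inferred labels precisely where the dark
image alone is ambiguous, which is what lets the adaptation proceed without
any nighttime annotations.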