ROAD: Reality Oriented Adaptation for Semantic Segmentation of Urban Scenes
Exploiting synthetic data to learn deep models has attracted increasing
attention in recent years. However, the intrinsic domain difference between
synthetic and real images usually causes a significant performance drop when
applying the learned model to real world scenarios. This is mainly due to two
reasons: 1) the model overfits to synthetic images, leaving the convolutional
filters unable to extract informative representations from real images; 2)
there is a distribution difference between synthetic and real data, which is
also known as the domain adaptation problem. To this end, we propose a new
reality oriented adaptation approach for urban scene semantic segmentation by
learning from synthetic data. First, we propose a target guided distillation
approach to learn the real image style, which is achieved by training the
segmentation model to imitate a pretrained real style model using real images.
Second, we further take advantage of the intrinsic spatial structure present
in urban scene images, and propose a spatial-aware adaptation scheme to
effectively align the distribution of two domains. These two modules can be
readily integrated with existing state-of-the-art semantic segmentation
networks to improve their generalizability when adapting from synthetic to real
urban scenes. We evaluate the proposed method on the Cityscapes dataset by
adapting from the GTAV and SYNTHIA datasets, where the results demonstrate the
effectiveness of our method.
Comment: Add experiments on SYNTHIA, CVPR 2018 camera-ready version
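The target guided distillation step described in the abstract above can be pictured with a minimal sketch. It assumes the imitation is measured as a mean squared distance between features that the two models produce for the same real image; the abstract does not state the actual loss, and the function name and the feature values are hypothetical.

```python
def distillation_loss(student_features, teacher_features):
    """Mean squared distance between two equally sized feature vectors.

    Assumption: a squared-error imitation loss. The abstract only says the
    segmentation model is trained to imitate a pretrained real style model.
    """
    if len(student_features) != len(teacher_features):
        raise ValueError("feature vectors must have the same length")
    squared = [(s - t) ** 2 for s, t in zip(student_features, teacher_features)]
    return sum(squared) / len(squared)


# Hypothetical features computed on one real image.
teacher = [0.2, 0.8, 0.5]  # from the frozen, pretrained real style model
student = [0.1, 0.9, 0.5]  # from the segmentation model being trained

loss = distillation_loss(student, teacher)
print(loss)
```

In a setup like this, the loss would be minimized with respect to the segmentation model only, while the pretrained model stays fixed.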
CANU-ReID: A Conditional Adversarial Network for Unsupervised person Re-IDentification
Unsupervised person re-ID is the task of identifying people on a target data
set for which the ID labels are unavailable during training. In this paper, we
propose to unify two trends in unsupervised person re-ID: clustering &
fine-tuning and adversarial learning. On one side, clustering groups training
images into pseudo-ID labels, and uses them to fine-tune the feature extractor.
On the other side, adversarial learning is used, inspired by domain adaptation,
to match distributions from different domains. Since target data is distributed
across different camera viewpoints, we propose to model each camera as an
independent domain, and aim to learn domain-independent features.
Because straightforward adversarial learning yields negative transfer, we
introduce a conditioning vector to mitigate this undesirable effect. In our
framework, the centroid of the cluster to which the visual sample belongs is
used as the conditioning vector of our conditional adversarial network; the
vector is permutation invariant (cluster ordering does not matter) and its
size is independent of the number of clusters. To our knowledge, we are the
first to propose the use of conditional adversarial networks for unsupervised
person re-ID. We evaluate the proposed architecture on top of two
state-of-the-art clustering-based unsupervised person re-ID methods in four
different experimental settings with three different data sets, and set a new
state of the art in all four of them. Our code and
model will be made publicly available at
https://team.inria.fr/perception/canu-reid/
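The two properties claimed for the conditioning vector in the abstract above (permutation invariance and a size independent of the number of clusters) can be checked with a small sketch. It assumes plain list-based features and integer pseudo-ID labels; the function names and data are hypothetical and this is not the authors' released code.

```python
def cluster_centroids(features, labels):
    """Return a dict mapping each cluster label to the mean feature vector."""
    sums, counts = {}, {}
    for vec, lab in zip(features, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
            counts[lab] = 0
        sums[lab] = [a + b for a, b in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [v / counts[lab] for v in sums[lab]] for lab in sums}


def conditioning_vectors(features, labels):
    """For each sample, return the centroid of the cluster it belongs to."""
    centroids = cluster_centroids(features, labels)
    return [centroids[lab] for lab in labels]


# Hypothetical 2-D features with pseudo-ID labels from a clustering step.
features = [[0.0, 2.0], [2.0, 0.0], [4.0, 4.0]]
labels = [0, 0, 1]
cond = conditioning_vectors(features, labels)

# Renaming the clusters (swapping 0 and 1) leaves every vector unchanged.
cond_relabeled = conditioning_vectors(features, [1, 1, 0])
print(cond == cond_relabeled)
```

Each conditioning vector has the dimensionality of the feature space, so adding or removing clusters does not change its size, unlike a one-hot cluster indicator.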
Socially Compliant Navigation through Raw Depth Inputs with Generative Adversarial Imitation Learning
We present an approach for mobile robots to learn to navigate in dynamic
environments with pedestrians via raw depth inputs, in a socially compliant
manner. To achieve this, we adopt a generative adversarial imitation learning
(GAIL) strategy, which improves upon a pre-trained behavior cloning policy. Our
approach overcomes a disadvantage of previous methods, which depend heavily
on full knowledge of the locations and velocities of nearby pedestrians. That
dependence requires specific sensors, and extracting such state information
from raw sensory input can consume considerable computation time. In this
paper, our proposed GAIL-based model operates directly on raw depth inputs and
plans in real time. Experiments show that our GAIL-based approach greatly
improves the safety and efficiency of mobile robot behavior over pure behavior
cloning. The real-world deployment also shows that
our method is capable of guiding autonomous vehicles to navigate in a socially
compliant manner directly through raw depth inputs. In addition, we release a
simulation plugin for modeling pedestrian behaviors based on the social force
model.
Comment: ICRA 2018 camera-ready version. 7 pages, video link:
https://www.youtube.com/watch?v=0hw0GD3lkA
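As a rough illustration of the adversarial imitation idea named in the abstract above, the sketch below uses one common GAIL convention: a discriminator outputs the probability that a state-action pair came from the expert, and the policy is rewarded for pairs the discriminator takes for expert behavior. The abstract does not specify the loss or reward form, so the convention, the function names, and the probabilities are all assumptions.

```python
import math

EPS = 1e-8  # keeps the logarithms finite at probabilities of 0 or 1


def discriminator_loss(expert_probs, policy_probs):
    """Binary cross-entropy: expert pairs are labeled 1, policy pairs 0."""
    expert_term = -sum(math.log(p + EPS) for p in expert_probs) / len(expert_probs)
    policy_term = -sum(math.log(1.0 - p + EPS) for p in policy_probs) / len(policy_probs)
    return expert_term + policy_term


def imitation_reward(prob_expert):
    """Surrogate reward for the policy; larger when it looks expert-like."""
    return -math.log(1.0 - prob_expert + EPS)


# Hypothetical discriminator outputs.
sharp = discriminator_loss([0.9, 0.9], [0.1, 0.1])    # separates the two well
confused = discriminator_loss([0.5, 0.5], [0.5, 0.5])  # cannot tell them apart
print(sharp < confused)
print(imitation_reward(0.9) > imitation_reward(0.1))
```

In a full training loop the discriminator and the policy would be updated in alternation, with the policy optimized by a reinforcement learning method on this surrogate reward.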
EEG-Based Emotion Recognition Using Regularized Graph Neural Networks
Electroencephalography (EEG) measures the neuronal activities in different
brain regions via electrodes. Many existing studies on EEG-based emotion
recognition do not fully exploit the topology of EEG channels. In this paper,
we propose a regularized graph neural network (RGNN) for EEG-based emotion
recognition. RGNN considers the biological topology among different brain
regions to capture both local and global relations among different EEG
channels. Specifically, we model the inter-channel relations in EEG signals via
an adjacency matrix in a graph neural network where the connection and
sparseness of the adjacency matrix are inspired by neuroscience theories of
human brain organization. In addition, we propose two regularizers, namely
node-wise domain adversarial training (NodeDAT) and emotion-aware distribution
learning (EmotionDL), to better handle cross-subject EEG variations and noisy
labels, respectively. Extensive experiments on two public datasets, SEED and
SEED-IV, demonstrate that our model outperforms state-of-the-art models in
most experimental settings. Moreover, ablation studies show that the proposed
adjacency matrix and two regularizers contribute consistent and significant
gains to the performance of our RGNN model. Finally, investigations on the
neuronal activities reveal important brain regions and inter-channel relations
for EEG-based emotion recognition.
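To make concrete how an adjacency matrix can encode inter-channel relations, here is a minimal sketch of one generic graph propagation step with symmetric normalization. It is not the RGNN layer itself: the three-channel graph, the feature values, and the function names are hypothetical, and the learned weights and regularizers described in the abstract above are omitted.

```python
import math


def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2} of a square matrix."""
    n = len(adj)
    degrees = [sum(abs(w) for w in row) for row in adj]
    return [
        [
            adj[i][j] / math.sqrt(degrees[i] * degrees[j])
            if degrees[i] > 0 and degrees[j] > 0
            else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]


def propagate(adj_norm, features):
    """Mix each channel's features with those of its connected channels."""
    n, d = len(adj_norm), len(features[0])
    return [
        [sum(adj_norm[i][k] * features[k][c] for k in range(n)) for c in range(d)]
        for i in range(n)
    ]


# Hypothetical chain of three EEG channels with self-loops: 0 - 1 - 2.
adjacency = [
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 1.0],
    [0.0, 1.0, 1.0],
]
features = [[1.0], [0.0], [0.0]]  # one feature per channel

out = propagate(normalize_adjacency(adjacency), features)
print(out)
```

Channel 2 has no direct connection to channel 0, so after one step it receives none of channel 0's signal; stacking several steps would let information travel across the graph.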