A Domain Agnostic Normalization Layer for Unsupervised Adversarial Domain Adaptation
We propose a normalization layer for unsupervised domain adaptation in semantic
scene segmentation. Normalization layers are known to improve convergence and
generalization and are part of many state-of-the-art fully-convolutional neural
networks. We show that conventional normalization layers worsen the performance
of current Unsupervised Adversarial Domain Adaptation (UADA), which is a method
to improve network performance on unlabeled datasets and the focus of our
research. Therefore, we propose a novel Domain Agnostic Normalization layer and
thereby unlock the benefits of normalization layers for unsupervised
adversarial domain adaptation. In our evaluation, we adapt from the synthetic
GTA5 data set to the real Cityscapes data set, a common benchmark experiment,
and surpass the state of the art. As our normalization layer is domain agnostic
at test time, we furthermore demonstrate that UADA using Domain Agnostic
Normalization improves performance on unseen domains, specifically on
Apolloscape and Mapillary.
Unsupervised Feature Learning Based on Deep Models for Environmental Audio Tagging
Environmental audio tagging aims to predict only the presence or absence of
certain acoustic events in the acoustic scene of interest. In this paper we
contribute to audio tagging in two areas: acoustic modeling and feature
learning. We propose to use a shrinking deep neural network (DNN)
framework incorporating unsupervised feature learning to handle the multi-label
classification task. For the acoustic modeling, a large set of contextual
frames of the chunk is fed into the DNN to perform multi-label
classification for the expected tags, since only chunk-level (or
utterance-level) labels, rather than frame-level labels, are available. Dropout and
background noise aware training are also adopted to improve the generalization
capability of the DNNs. For the unsupervised feature learning, we propose to
use a symmetric or asymmetric deep de-noising auto-encoder (sDAE or aDAE) to
generate new data-driven features from the Mel-filter bank (MFB) features.
The new features, which are smoothed against background noise and more compact
with contextual information, can further improve the performance of the DNN
baseline. Compared with the standard Gaussian Mixture Model (GMM) baseline of
the DCASE 2016 audio tagging challenge, our proposed method obtains a
significant equal error rate (EER) reduction from 0.21 to 0.13 on the
development set. The proposed aDAE system achieves a relative EER reduction of
6.7% compared with the strong DNN baseline on the development set. Finally, the
results also show that our approach obtains state-of-the-art performance
with 0.15 EER on the evaluation set of the DCASE 2016 audio tagging task, while
the EER of the first-prize system of this challenge is 0.17.
Comment: 10 pages, dcase 2016 challeng
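The EER figures quoted in the abstract above can be made concrete with a short sketch. The following is an illustrative approximation only, not the DCASE 2016 scoring procedure: it assumes a single tag with binary labels and real-valued scores, sweeps a threshold over the observed scores, and reports the point where the false acceptance rate and false rejection rate are closest. The function names `equal_error_rate` and `relative_reduction` are hypothetical helpers introduced here, and the toy labels and scores are made up for illustration.

```python
def equal_error_rate(labels, scores):
    """Approximate the equal error rate (EER) for one tag.

    labels: iterable of 0/1 ground-truth values (1 = event present).
    scores: iterable of real-valued detector outputs (higher = more confident).
    Assumption: a plain threshold sweep; the challenge's exact protocol
    (e.g. per-tag averaging) is not described in the abstract.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    # Candidate thresholds: every distinct score, plus one that rejects everything.
    thresholds = sorted(set(scores)) + [float("inf")]
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        far = sum(s >= t for s in negatives) / len(negatives)  # false acceptance rate
        frr = sum(s < t for s in positives) / len(positives)   # false rejection rate
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


def relative_reduction(baseline, proposed):
    """Relative error reduction of `proposed` with respect to `baseline`."""
    return (baseline - proposed) / baseline


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    print(equal_error_rate([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
    # EER values taken from the abstract: GMM baseline 0.21, proposed method 0.13.
    print(round(relative_reduction(0.21, 0.13), 3))
```

Under this definition, the reported drop from 0.21 to 0.13 on the development set corresponds to a relative EER reduction of roughly 38%.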
