203 research outputs found
Transductive Multi-label Zero-shot Learning
Zero-shot learning has received increasing interest as a means to alleviate
the often prohibitive expense of annotating training data for large scale
recognition problems. These methods have achieved great success via learning
intermediate semantic representations in the form of attributes and more
recently, semantic word vectors. However, they have thus far been constrained
to the single-label case, in contrast to the growing popularity and importance
of more realistic multi-label data. In this paper, for the first time, we
investigate and formalise a general framework for multi-label zero-shot
learning, addressing the unique challenge therein: how to exploit multi-label
correlation at test time with no training data for those classes? In
particular, we propose (1) a multi-output deep regression model to project an
image into a semantic word space, which explicitly exploits the correlations in
the intermediate semantic layer of word vectors; (2) a novel zero-shot learning
algorithm for multi-label data that exploits the unique compositionality
property of semantic word vector representations; and (3) a transductive
learning strategy to enable the regression model learned from seen classes to
generalise well to unseen classes. Our zero-shot learning experiments on a
number of standard multi-label datasets demonstrate that our method outperforms
a variety of baselines.Comment: 12 pages, 6 figures, Accepted to BMVC 2014 (oral
Sketch-a-Net that Beats Humans
We propose a multi-scale multi-channel deep neural network framework that,
for the first time, yields sketch recognition performance surpassing that of
humans. Our superior performance is a result of explicitly embedding the unique
characteristics of sketches in our model: (i) a network architecture designed
for sketch rather than natural photo statistics, (ii) a multi-channel
generalisation that encodes sequential ordering in the sketching process, and
(iii) a multi-scale network ensemble with joint Bayesian fusion that accounts
for the different levels of abstraction exhibited in free-hand sketches. We
show that state-of-the-art deep networks specifically engineered for photos of
natural objects fail to perform well on sketch recognition, regardless whether
they are trained using photo or sketch. Our network on the other hand not only
delivers the best performance on the largest human sketch dataset to date, but
also is small in size making efficient training possible using just CPUs.Comment: Accepted to BMVC 2015 (oral
Deep Domain-Adversarial Image Generation for Domain Generalisation
Machine learning models typically suffer from the domain shift problem when
trained on a source dataset and evaluated on a target dataset of different
distribution. To overcome this problem, domain generalisation (DG) methods aim
to leverage data from multiple source domains so that a trained model can
generalise to unseen domains. In this paper, we propose a novel DG approach
based on \emph{Deep Domain-Adversarial Image Generation} (DDAIG). Specifically,
DDAIG consists of three components, namely a label classifier, a domain
classifier and a domain transformation network (DoTNet). The goal for DoTNet is
to map the source training data to unseen domains. This is achieved by having a
learning objective formulated to ensure that the generated data can be
correctly classified by the label classifier while fooling the domain
classifier. By augmenting the source training data with the generated unseen
domain data, we can make the label classifier more robust to unknown domain
changes. Extensive experiments on four DG datasets demonstrate the
effectiveness of our approach.Comment: 8 page
Simple and Effective Stochastic Neural Networks
Stochastic neural networks (SNNs) are currently topical, with several paradigms being actively investigated including dropout, Bayesian neural networks, variational information bottleneck (VIB) and noise regularized learning. These neural network variants impact several major considerations, including generalization, network compression, robustness against adversarial attack and label noise, and model calibration. However, many existing networks are complicated and expensive to train, and/or only address one or two of these practical considerations. In this paper we propose a simple and effective stochastic neural network (SE-SNN) architecture for discriminative learning by directly modeling activation uncertainty and encouraging high activation variability. Compared to existing SNNs, our SE-SNN is simpler to implement and faster to train, and produces state of the art results on network compression by pruning, adversarial defense, learning with label noise, and model calibration
Robust Person Re-identification by Modelling Feature Uncertainty
We aim to learn deep person re-identification (ReID) models that are robust against noisy training data. Two types of noise are prevalent in practice: (1) label noise caused by human annotator errors and (2) data outliers caused by person detector errors or occlusion. Both types of noise pose serious problems for training ReID models, yet have been largely ignored so far. In this paper, we propose a novel deep network termed DistributionNet for robust ReID. Instead of representing each person image as a feature vector, DistributionNet models it as a Gaussian distribution with its variance representing the uncertainty of the extracted features. A carefully designed loss is formulated in DistributionNet to unevenly allocate uncertainty across training samples. Consequently, noisy samples are assigned large variance/uncertainty, which effectively alleviates their negative impacts on model fitting. Extensive experiments demonstrate that our model is more effective than alternative noise-robust deep models. The source code is available at: https://github.com/TianyuanYu/DistributionNet
- …