9,998 research outputs found
SEVEN: Deep Semi-supervised Verification Networks
Verification determines whether two samples belong to the same class or not,
and has important applications such as face and fingerprint verification, where
thousands or millions of categories are present but each category has scarce
labeled examples, presenting two major challenges for existing deep learning
models. We propose a deep semi-supervised model named SEmi-supervised
VErification Network (SEVEN) to address these challenges. The model consists of
two complementary components. The generative component addresses the lack of
supervision within each category by learning general salient structures from a
large amount of data across categories. The discriminative component exploits
the learned general features to mitigate the lack of supervision within
categories, and also directs the generative component to find more informative
structures of the whole data manifold. The two components are tied together in
SEVEN to allow an end-to-end training of the two components. Extensive
experiments on four verification tasks demonstrate that SEVEN significantly
outperforms other state-of-the-art deep semi-supervised techniques when labeled
data are in short supply. Furthermore, SEVEN is competitive with fully
supervised baselines trained with a larger amount of labeled data. It indicates
the importance of the generative component in SEVEN.Comment: 7 pages, 2 figures, accepted to the 2017 International Joint
Conference on Artificial Intelligence (IJCAI-17
f-VAEGAN-D2: A Feature Generating Framework for Any-Shot Learning
When labeled training data is scarce, a promising data augmentation approach
is to generate visual features of unknown classes using their attributes. To
learn the class conditional distribution of CNN features, these models rely on
pairs of image features and class attributes. Hence, they can not make use of
the abundance of unlabeled data samples. In this paper, we tackle any-shot
learning problems i.e. zero-shot and few-shot, in a unified feature generating
framework that operates in both inductive and transductive learning settings.
We develop a conditional generative model that combines the strength of VAE and
GANs and in addition, via an unconditional discriminator, learns the marginal
feature distribution of unlabeled images. We empirically show that our model
learns highly discriminative CNN features for five datasets, i.e. CUB, SUN, AWA
and ImageNet, and establish a new state-of-the-art in any-shot learning, i.e.
inductive and transductive (generalized) zero- and few-shot learning settings.
We also demonstrate that our learned features are interpretable: we visualize
them by inverting them back to the pixel space and we explain them by
generating textual arguments of why they are associated with a certain label.Comment: Accepted at CVPR 201
Perceptual Generative Adversarial Networks for Small Object Detection
Detecting small objects is notoriously challenging due to their low
resolution and noisy representation. Existing object detection pipelines
usually detect small objects through learning representations of all the
objects at multiple scales. However, the performance gain of such ad hoc
architectures is usually limited to pay off the computational cost. In this
work, we address the small object detection problem by developing a single
architecture that internally lifts representations of small objects to
"super-resolved" ones, achieving similar characteristics as large objects and
thus more discriminative for detection. For this purpose, we propose a new
Perceptual Generative Adversarial Network (Perceptual GAN) model that improves
small object detection through narrowing representation difference of small
objects from the large ones. Specifically, its generator learns to transfer
perceived poor representations of the small objects to super-resolved ones that
are similar enough to real large objects to fool a competing discriminator.
Meanwhile its discriminator competes with the generator to identify the
generated representation and imposes an additional perceptual requirement -
generated representations of small objects must be beneficial for detection
purpose - on the generator. Extensive evaluations on the challenging
Tsinghua-Tencent 100K and the Caltech benchmark well demonstrate the
superiority of Perceptual GAN in detecting small objects, including traffic
signs and pedestrians, over well-established state-of-the-arts
Generating Visual Representations for Zero-Shot Classification
This paper addresses the task of learning an image clas-sifier when some
categories are defined by semantic descriptions only (e.g. visual attributes)
while the others are defined by exemplar images as well. This task is often
referred to as the Zero-Shot classification task (ZSC). Most of the previous
methods rely on learning a common embedding space allowing to compare visual
features of unknown categories with semantic descriptions. This paper argues
that these approaches are limited as i) efficient discrimi-native classifiers
can't be used ii) classification tasks with seen and unseen categories
(Generalized Zero-Shot Classification or GZSC) can't be addressed efficiently.
In contrast , this paper suggests to address ZSC and GZSC by i) learning a
conditional generator using seen classes ii) generate artificial training
examples for the categories without exemplars. ZSC is then turned into a
standard supervised learning problem. Experiments with 4 generative models and
5 datasets experimentally validate the approach, giving state-of-the-art
results on both ZSC and GZSC
- …