Knowledge Distillation Using Hierarchical Self-Supervision Augmented Distribution
Knowledge distillation (KD) is an effective framework that aims to transfer
meaningful information from a large teacher to a smaller student. Generally, KD
involves defining and transferring knowledge. Previous KD methods often
focus on mining various forms of knowledge, for example, feature maps and
refined information. However, the knowledge is derived from the primary
supervised task and thus is highly task-specific. Motivated by the recent
success of self-supervised representation learning, we propose an auxiliary
self-supervision augmented task to guide networks to learn more meaningful
features. Therefore, we can derive soft self-supervision augmented
distributions as richer dark knowledge from this task for KD. Unlike previous
knowledge, this distribution encodes joint knowledge from supervised and
self-supervised feature learning. Beyond knowledge exploration, we propose to
append several auxiliary branches at various hidden layers, to fully take
advantage of hierarchical feature maps. Each auxiliary branch is guided to
learn the self-supervision augmented task and to distill this distribution from
teacher to student. Overall, we call our KD method Hierarchical
Self-Supervision Augmented Knowledge Distillation (HSSAKD). Experiments on
standard image classification show that both offline and online HSSAKD achieve
state-of-the-art performance in the field of KD. Transfer experiments on
object detection further verify that HSSAKD guides the network to learn
better features. The code is available at https://github.com/winycg/HSAKD.
Comment: 15 pages, Accepted by IEEE Transactions on Neural Networks and
Learning Systems 202
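The distillation term the abstract describes can be illustrated with a minimal NumPy sketch: teacher and student logits over the joint (class, self-supervision transform) space are softened with a temperature and matched with a KL divergence. The sizes, temperature, and random logits below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-softened softmax (numerically stabilized)."""
    z = np.asarray(z, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, T=4.0):
    """KL(teacher || student) between softened distributions, scaled by
    T^2 as is conventional in knowledge distillation."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(T * T * np.sum(p * (np.log(p) - np.log(q))))

# Joint "self-supervision augmented" label space, e.g. 10 classes x 4
# rotations -> a 40-way distribution (sizes are illustrative).
num_classes, num_rotations = 10, 4
rng = np.random.default_rng(0)
teacher_logits = rng.normal(size=num_classes * num_rotations)
student_logits = rng.normal(size=num_classes * num_rotations)
loss = kd_loss(student_logits, teacher_logits)
```

In HSSAKD this loss would be applied at each auxiliary branch, so every hidden stage of the student is supervised by the corresponding stage of the teacher.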
Learning to see across domains and modalities
Deep learning has recently raised hopes and expectations as a general solution for many applications (computer vision, natural language processing, speech recognition, etc.); indeed, it has proven effective, but it has also shown a strong dependence on large quantities of data. Generally speaking, deep learning models are especially susceptible to overfitting due to their large number of internal parameters.
Luckily, it has also been shown that, even when data is scarce, a successful model can be trained by reusing prior knowledge. Thus, developing techniques for transfer learning (as this process is known), in its broadest definition, is a crucial element towards the deployment of effective and accurate intelligent systems into the real world.
This thesis will focus on a family of transfer learning methods applied to the task of visual object recognition, specifically image classification. The visual recognition problem is central to computer vision research: many desired applications, from robotics to information retrieval, demand the ability to correctly identify categories, places, and objects.
Transfer learning is a general term, and specific settings have been given specific names: when the learner has access to only unlabeled data from the target domain (where the model should perform) and labeled data from a different domain (the source), the problem is called unsupervised domain adaptation (DA). The first part of this thesis will focus on three methods for this setting.
The three presented techniques for domain adaptation are fully distinct: the first one proposes the use of Domain Alignment layers to structurally align the distributions of the source and target domains in feature space. While the general idea of aligning feature distributions is not novel,
we distinguish our method as one of the very few that does so without adding losses. The second method is based on GANs: we propose a bidirectional architecture that jointly learns how to map the source images into the target visual style and vice versa, thus alleviating the domain shift at the pixel level. The third method features an adversarial learning process that transforms both the images and the features of both domains in order to map them to a common, agnostic space.
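The first method's core idea, aligning feature distributions without adding losses, can be hinted at with a toy sketch that standardizes each domain's batch with its own statistics. This is a heavily simplified stand-in for Domain Alignment layers; the shapes and data below are made up.

```python
import numpy as np

def domain_align(features, eps=1e-5):
    """Standardize a batch with its own per-feature statistics so that
    source and target batches land on comparable distributions (a
    heavily simplified stand-in for Domain Alignment layers)."""
    mu = features.mean(axis=0, keepdims=True)
    var = features.var(axis=0, keepdims=True)
    return (features - mu) / np.sqrt(var + eps)

rng = np.random.default_rng(1)
source = rng.normal(loc=2.0, scale=3.0, size=(64, 8))   # shifted domain
target = rng.normal(loc=-1.0, scale=0.5, size=(64, 8))  # different stats
src_a, tgt_a = domain_align(source), domain_align(target)
# After alignment, both domains have ~zero mean and ~unit variance.
```

The structural point is that alignment happens inside the forward pass, per domain, rather than through an extra alignment loss term.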
While the first part of the thesis presented general-purpose DA methods, the second part will focus on the real-life issues of robotic perception, specifically RGB-D recognition.
Robotic platforms are usually not limited to color perception; very often they also carry a Depth camera.
Unfortunately, the depth modality is rarely used for visual recognition, due to the lack of pretrained models from which to transfer and the scarcity of data on which to train one from scratch.
We will first explore the use of synthetic data as a proxy for real images by training a Convolutional Neural Network (CNN) on virtual depth maps, rendered from 3D CAD models, and then testing it on real robotic datasets. The second approach leverages the existence of RGB pretrained models, by learning how to map the depth data into the most discriminative RGB representation and then using existing models for recognition. This second technique is in fact a fairly generic transfer learning method that can be applied to share knowledge across modalities.
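As a toy illustration of the second approach's premise, a single-channel depth map can be turned into a three-channel image so that an RGB-pretrained model can consume it. The hand-coded linear colorization below merely stands in for the learned depth-to-RGB mapping described above; all names and sizes are illustrative.

```python
import numpy as np

def colorize_depth(depth):
    """Map a single-channel depth image to a 3-channel pseudo-RGB image
    so an RGB-pretrained CNN can consume it. This hand-coded linear
    colorization only stands in for the learned mapping."""
    d = depth.astype(float)
    d = (d - d.min()) / max(d.max() - d.min(), 1e-8)  # normalize to [0, 1]
    r = d                            # far pixels -> brighter red
    g = 1.0 - np.abs(2.0 * d - 1.0)  # mid-range pixels -> brighter green
    b = 1.0 - d                      # near pixels -> brighter blue
    return np.stack([r, g, b], axis=-1)

depth = np.random.default_rng(2).uniform(0.5, 4.0, size=(48, 64))  # meters
rgbish = colorize_depth(depth)
```

The thesis's contribution is precisely to learn this mapping so the resulting representation is discriminative, rather than fixing it by hand as here.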
CleanNet: Transfer Learning for Scalable Image Classifier Training with Label Noise
In this paper, we study the problem of learning image classification models
with label noise. Existing approaches depending on human supervision are
generally not scalable as manually identifying correct or incorrect labels is
time-consuming, whereas approaches not relying on human supervision are
scalable but less effective. To reduce the amount of human supervision for
label noise cleaning, we introduce CleanNet, a joint neural embedding network,
which only requires a fraction of the classes being manually verified to
provide the knowledge of label noise that can be transferred to other classes.
We further integrate CleanNet and a conventional convolutional neural network
classifier into one framework for image classification learning. We demonstrate
the effectiveness of the proposed algorithm on both the label noise
detection task and the noisy-data image classification task on several
large-scale datasets. Experimental results show that CleanNet can reduce label
noise detection error rate on held-out classes, where no human supervision is
available, by 41.5% compared to current weakly supervised methods. It also
achieves 47% of the performance gain of verifying all images with only 3.2% of
images verified on an image classification task. Source code and dataset will
be available at kuanghuei.github.io/CleanNetProject.
Comment: Accepted to CVPR 201
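CleanNet's verification idea can be sketched as a similarity test between a query image embedding and its class embedding. The cosine rule, threshold, and vectors below are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def verify_label(query_emb, class_emb, threshold=0.5):
    """Simplified CleanNet-style decision: keep a label only when the
    query image embedding is close enough to its class embedding
    (embeddings and threshold here are illustrative)."""
    return cosine(query_emb, class_emb) >= threshold

class_emb = np.array([1.0, 0.0, 0.0])     # learned class-level prototype
clean_query = np.array([0.9, 0.1, 0.0])   # consistent with the class
noisy_query = np.array([-0.2, 0.9, 0.1])  # likely mislabeled
```

In the full model, both embeddings come from jointly trained networks, and manual verification on a fraction of classes supervises the similarity so that it transfers to the held-out classes.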
Feature Representation Analysis of Deep Convolutional Neural Network using Two-stage Feature Transfer -An Application for Diffuse Lung Disease Classification-
Transfer learning is a machine learning technique designed to improve
generalization performance by using pre-trained parameters obtained from other
learning tasks. For image recognition tasks, many previous studies have
reported that, when transfer learning is applied to deep neural networks,
performance improves, despite having limited training data. This paper proposes
a two-stage feature transfer learning method focusing on the recognition of
textural medical images. In the proposed method, a model is successively
textural medical images. During the proposed method, a model is successively
trained with massive amounts of natural images, some textural images, and the
target images. We applied this method to the classification task of textural
X-ray computed tomography images of diffuse lung diseases. In our experiment,
the two-stage feature transfer achieves the best performance, compared to
from-scratch learning and conventional single-stage feature transfer. We also
investigated robustness with respect to the size of the target dataset. Two-stage
feature transfer shows better robustness than the other two learning methods.
Moreover, we analyzed the feature representations obtained from DLD imagery
inputs for each feature transfer model using a visualization method. We showed
that the two-stage feature transfer obtains both edge and textural features of
DLDs, which does not occur in conventional single-stage feature transfer models.
Comment: Preprint of the journal article to be published in IPSJ TOM-51. The copyright of this material is retained by the Information Processing Society of Japan (IPSJ); this material is published on this web site with the agreement of the author(s) and the IPSJ.
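The successive-training scheme can be sketched with a toy stand-in: the same weights are carried through three training stages (natural images, then textural images, then the target images). Logistic regression and the random arrays below are hypothetical replacements for the actual CNN and datasets.

```python
import numpy as np

def train_stage(w, X, y, lr=0.1, epochs=200):
    """One feature-transfer stage: keep training the same weights on a
    new dataset (logistic regression stands in for the CNN)."""
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))      # sigmoid predictions
        w = w - lr * X.T @ (p - y) / len(y)   # gradient step on log-loss
    return w

rng = np.random.default_rng(3)
w = np.zeros(5)  # "from scratch" initialization
stages = [  # natural -> textural -> target images (toy stand-ins)
    (rng.normal(size=(200, 5)), rng.integers(0, 2, 200).astype(float)),
    (rng.normal(size=(100, 5)), rng.integers(0, 2, 100).astype(float)),
    (rng.normal(size=(40, 5)), rng.integers(0, 2, 40).astype(float)),
]
for X, y in stages:
    w = train_stage(w, X, y)  # weights carry over between stages
```

The point of the intermediate textural stage is that the final, smallest dataset starts from features already tuned toward texture rather than only natural-image statistics.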
Learning models for semantic classification of insufficient plantar pressure images
Establishing a reliable and stable model to predict a target from insufficient labeled samples is feasible and
effective, particularly for sensor-generated data-sets. This paper is inspired by insufficient-data-set
learning algorithms, such as metric-based methods, prototype networks, and meta-learning, and we therefore propose
an insufficient-data-set transfer model learning method. Firstly, two basic models for transfer learning are
introduced. A classification system and calculation criteria are then introduced. Secondly, a dataset
of plantar pressure for comfort shoe design is acquired and preprocessed through a foot scan system; and by
using a pre-trained convolutional neural network (AlexNet) and convolutional neural network (CNN)-
based transfer modeling, the classification accuracy of the plantar pressure images exceeds 93.5%. Finally,
the proposed method is compared to the current classifiers VGG, ResNet, AlexNet, and the pre-trained
CNN. Our work is also compared with the known scaling and shifting (SS) and unknown plain slot (PS) partition
methods on the public test databases SUN, CUB, AWA1, AWA2, and aPY, with indices of precision (tr, ts, H)
and time (training and evaluation). For the plantar pressure classification task, the proposed method shows high
performance in most indices when compared with the other methods. The transfer learning-based method can be
applied to other insufficient data-sets in sensor imaging fields.
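The pre-trained-CNN transfer setup can be sketched by freezing a stand-in backbone and fitting only a small softmax head on its features. The frozen projection, sizes, and random data below are illustrative assumptions, not the paper's actual AlexNet pipeline.

```python
import numpy as np

def pretrained_features(images, W_frozen):
    """Stand-in for a frozen pre-trained backbone (e.g. AlexNet's
    convolutional layers): a fixed projection followed by ReLU."""
    return np.maximum(images @ W_frozen, 0.0)

def fit_head(F, y, num_classes, lr=0.1, epochs=300):
    """Train only a small softmax head on the frozen features."""
    W = np.zeros((F.shape[1], num_classes))
    for _ in range(epochs):
        z = F @ W
        z = z - z.max(axis=1, keepdims=True)  # stabilize softmax
        p = np.exp(z)
        p = p / p.sum(axis=1, keepdims=True)
        p[np.arange(len(y)), y] -= 1.0        # softmax cross-entropy grad
        W = W - lr * F.T @ p / len(y)
    return W

rng = np.random.default_rng(4)
W_frozen = rng.normal(size=(64, 16))  # pretend pretrained weights
images = rng.normal(size=(120, 64))   # flattened pressure images (toy)
labels = rng.integers(0, 3, 120)      # 3 illustrative classes
F = pretrained_features(images, W_frozen)
head = fit_head(F, labels, num_classes=3)
```

Training only the head is what makes such transfer viable on a small sensor-generated dataset: the number of learned parameters shrinks from the full backbone to a single linear layer.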