Deep Learning-Based Image Kernel for Inductive Transfer
We propose a method for classifying images from target classes with only a small
number of training examples, based on transfer learning from non-target classes.
Using no more information than class labels for samples from non-target classes,
we train a Siamese net to estimate the probability that two images belong to the
same class. With some post-processing, the output of the Siamese net can be used
to form the Gram matrix of a Mercer kernel. Coupled with a support vector machine
(SVM), such a kernel gave reasonable classification accuracy on target classes
without any fine-tuning. When the Siamese net was only partially fine-tuned using
a small number of samples from the target classes, the resulting classifier
outperformed the state of the art and other alternatives. We report the class
separation capabilities of such a kernel, and insights into its learning process,
on the MNIST, Dogs vs. Cats, and CIFAR-10 datasets.
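As a rough illustration of the pipeline this abstract describes, the sketch below builds a precomputed SVM kernel from pairwise same-class probabilities. Here `siamese_prob` is a hypothetical stand-in for the trained Siamese net (a toy Gaussian similarity), and the eigenvalue clipping is just one plausible reading of the "post-processing" step, not the authors' implementation.

```python
import numpy as np
from sklearn.svm import SVC

def siamese_prob(x1, x2):
    # Toy stand-in for the trained Siamese net: a Gaussian similarity.
    # In the paper, this would be a forward pass returning the estimated
    # probability that x1 and x2 belong to the same class.
    return float(np.exp(-np.sum((x1 - x2) ** 2)))

def gram_matrix(X_a, X_b):
    # Pairwise same-class probabilities between two sample sets.
    return np.array([[siamese_prob(a, b) for b in X_b] for a in X_a])

def make_psd(K):
    # Symmetrize and clip negative eigenvalues so K is a valid Mercer kernel.
    K = (K + K.T) / 2.0
    w, V = np.linalg.eigh(K)
    return (V * np.clip(w, 0.0, None)) @ V.T

# A few labeled target-class samples suffice to train the SVM.
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(20, 8)), rng.integers(0, 2, 20)
X_test = rng.normal(size=(5, 8))

clf = SVC(kernel="precomputed").fit(make_psd(gram_matrix(X_train, X_train)), y_train)
preds = clf.predict(gram_matrix(X_test, X_train))
```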
Optimal Transport for Deep Joint Transfer Learning
Training a Deep Neural Network (DNN) from scratch requires a large amount of
labeled data. For a classification task where only a small amount of training
data is available, a common solution is to fine-tune a DNN that was pre-trained
on related source data. This consecutive training process is time-consuming and
does not explicitly consider the relatedness between different source and
target tasks.
In this paper, we propose a novel method to jointly fine-tune a Deep Neural
Network with source data and target data. By adding an Optimal Transport loss
(OT loss) between source and target classifier predictions as a constraint on
the source classifier, the proposed Joint Transfer Learning Network (JTLN) can
effectively learn useful knowledge for target classification from source data.
Furthermore, by using different kinds of metrics as the cost matrix for the OT
loss, JTLN can incorporate different prior knowledge about the relatedness
between target categories and source categories.
We carried out experiments with JTLN based on AlexNet on image classification
datasets, and the results verify the effectiveness of the proposed JTLN in
comparison with standard consecutive fine-tuning. This joint transfer learning
with an OT loss is general and can also be applied to other kinds of neural
networks.
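As a hedged sketch of how an OT loss between source and target predictions could be implemented (the paper's exact formulation may differ), the snippet below computes an entropic-OT cost via Sinkhorn iterations between a distribution over source classes and one over target classes, under a user-supplied cost matrix encoding category relatedness. The function name and hyperparameters are illustrative.

```python
import torch

def sinkhorn_ot_loss(p_s, p_t, C, eps=0.1, n_iter=50):
    # Entropic optimal transport between a source-class distribution p_s
    # (length S) and a target-class distribution p_t (length T) under the
    # S x T cost matrix C; returns the transport cost <plan, C>.
    K = torch.exp(-C / eps)          # Gibbs kernel
    u = torch.ones_like(p_s)
    for _ in range(n_iter):
        v = p_t / (K.t() @ u)        # scale columns to match p_t
        u = p_s / (K @ v)            # scale rows to match p_s
    plan = u[:, None] * K * v[None, :]
    return (plan * C).sum()

# Usage sketch: averaging the softmax outputs of the source and target
# classifier heads over a batch gives p_s and p_t; the OT term is then
# added to both cross-entropy losses, e.g.
#   loss = ce_src + ce_tgt + lam * sinkhorn_ot_loss(p_s, p_t, C)
```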
A Simple Exponential Family Framework for Zero-Shot Learning
We present a simple generative framework for learning to predict previously
unseen classes, based on estimating class-attribute-gated class-conditional
distributions. We model each class-conditional distribution as an exponential
family distribution and the parameters of the distribution of each seen/unseen
class are defined as functions of the respective observed class attributes.
These functions can be learned using only the seen class data and can be used
to predict the parameters of the class-conditional distribution of each unseen
class. Unlike most existing methods for zero-shot learning that represent
classes as fixed embeddings in some vector space, our generative model
naturally represents each class as a probability distribution. It is simple to
implement and also allows leveraging additional unlabeled data from unseen
classes to improve the estimates of their class-conditional distributions using
transductive/semi-supervised learning. Moreover, it extends seamlessly to
few-shot learning by easily updating these distributions when provided with a
small number of additional labeled examples from unseen classes. Through a
comprehensive set of experiments on several benchmark data sets, we demonstrate
the efficacy of our framework.
Comment: Accepted at ECML-PKDD 2017, 16 pages. Code and data are available at
https://github.com/vkverma01/Zero-Shot
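To make the generative recipe concrete, here is a minimal sketch assuming Gaussian class-conditionals with diagonal covariance and a ridge-regression map from class attributes to distribution parameters. The paper's exponential-family treatment is more general, so treat this as one instance of the idea, with all names illustrative.

```python
import numpy as np
from sklearn.linear_model import Ridge

def fit_attribute_to_params(A_seen, X_seen, y_seen):
    # Per-class empirical means and log-variances from seen-class data,
    # regressed on class attributes. Assumes the rows of A_seen align
    # with np.unique(y_seen).
    classes = np.unique(y_seen)
    mus = np.stack([X_seen[y_seen == c].mean(axis=0) for c in classes])
    log_vars = np.stack([np.log(X_seen[y_seen == c].var(axis=0) + 1e-6)
                         for c in classes])
    f_mu = Ridge(alpha=1.0).fit(A_seen, mus)
    f_var = Ridge(alpha=1.0).fit(A_seen, log_vars)
    return f_mu, f_var

def predict_unseen(f_mu, f_var, A_unseen, X_test):
    # Score each test point under each unseen class's predicted Gaussian
    # and return the index of the most likely unseen class.
    mu = f_mu.predict(A_unseen)               # (U, d)
    var = np.exp(f_var.predict(A_unseen))     # (U, d)
    ll = -0.5 * (((X_test[:, None, :] - mu) ** 2) / var
                 + np.log(2 * np.pi * var)).sum(axis=-1)
    return ll.argmax(axis=1)
```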
Transfer Metric Learning: Algorithms, Applications and Outlooks
Distance metric learning (DML) aims to find an appropriate way to reveal the
underlying data relationships. It is critical in many machine learning, pattern
recognition, and data mining algorithms, and usually requires a large amount of
label information (such as class labels or pair/triplet constraints) to achieve
satisfactory performance. However, label information may be insufficient in
real-world applications due to the high cost of labeling, and DML may fail in
this case. Transfer metric learning (TML) mitigates this issue for DML in the
domain of interest (the target domain) by leveraging knowledge from other
related domains (source domains). Although TML has achieved a certain level of
development, it has had limited success in aspects such as selective transfer,
theoretical understanding, and the handling of complex data, big data, and
extreme cases. In this survey, we present a systematic review of the TML
literature. In particular, we group TML into different categories according to
different settings and metric transfer strategies, such as direct metric
approximation, subspace approximation, distance approximation, and distribution
approximation. We provide a summary and an insightful discussion of the various
TML approaches and their applications. Finally, we indicate some open challenges
and suggest possible future directions.
Comment: 14 pages, 5 figures
Adapted Deep Embeddings: A Synthesis of Methods for k-Shot Inductive Transfer Learning
The focus in machine learning has branched beyond training classifiers on a
single task to investigating how previously acquired knowledge in a source
domain can be leveraged to facilitate learning in a related target domain,
known as inductive transfer learning. Three active lines of research have
independently explored transfer learning using neural networks. In weight
transfer, a model trained on the source domain is used as an initialization
point for a network to be trained on the target domain. In deep metric
learning, the source domain is used to construct an embedding that captures
class structure in both the source and target domains. In few-shot learning,
the focus is on generalizing well in the target domain based on a limited
number of labeled examples. We compare state-of-the-art methods from these
three paradigms and also explore hybrid adapted-embedding methods that use
limited target-domain data to fine-tune embeddings constructed from
source-domain data. We conduct a systematic comparison of methods across a
variety of domains, varying the number of labeled instances available in the
target domain (k), as well as the number of target-domain classes. We reach
three principal conclusions: (1) deep embeddings are far superior to weight
transfer as a starting point for inter-domain transfer or model re-use; (2) our
hybrid methods robustly outperform every few-shot learning and every deep
metric learning method previously proposed, with a mean error reduction of 34%
relative to the state of the art; and (3) among loss functions for discovering
embeddings, the histogram loss (Ustinova & Lempitsky, 2016) is the most robust.
We hope our results will motivate a unification of research in weight transfer,
deep metric learning, and few-shot learning.
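A minimal sketch of the hybrid adapted-embedding recipe, under stated assumptions: `embed` is any pretrained source-domain encoder, the contrastive-style loss below is a simple placeholder rather than the histogram loss the paper finds most robust, and classification is by nearest class mean in the fine-tuned embedding space.

```python
import torch
import torch.nn.functional as F

def finetune_embedding(embed, opt, x_support, y_support, steps=100):
    # Adapt the source-trained encoder with the k labeled target examples.
    for _ in range(steps):
        z = F.normalize(embed(x_support), dim=1)
        sim = z @ z.t()
        same = (y_support[:, None] == y_support[None, :]).float()
        # Pull same-class pairs together; push different-class pairs
        # below a (hypothetical) cosine-similarity margin of 0.5.
        loss = ((1.0 - sim) * same + F.relu(sim - 0.5) * (1.0 - same)).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()

def classify_by_nearest_mean(embed, x_support, y_support, x_query):
    # Class prototypes are the means of fine-tuned support embeddings.
    with torch.no_grad():
        z_s = F.normalize(embed(x_support), dim=1)
        z_q = F.normalize(embed(x_query), dim=1)
        protos = torch.stack([z_s[y_support == c].mean(0)
                              for c in torch.unique(y_support)])
        return (z_q @ F.normalize(protos, dim=1).t()).argmax(1)
```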
Application of Transfer Learning Approaches in Multimodal Wearable Human Activity Recognition
In this project, we investigated transfer learning methods and their
applications to real-world problems. By implementing and modifying various
transfer learning methods for our problem, we gained insight into the
advantages and disadvantages of these methods, as well as experience in
developing neural network models for knowledge transfer. Due to time
constraints, we applied only one representative method for each major approach
to transfer learning. As pointed out in the literature review, each method has
its own assumptions, strengths, and shortcomings. We therefore believe that an
ensemble-learning approach combining the different methods should yield better
performance, which can be a focus of our future research.
Learning Tensors in Reproducing Kernel Hilbert Spaces with Multilinear Spectral Penalties
We present a general framework to learn functions in tensor product
reproducing kernel Hilbert spaces (TP-RKHSs). The methodology is based on a
novel representer theorem suitable for existing as well as new spectral
penalties for tensors. When the functions in the TP-RKHS are defined on the
Cartesian product of finite discrete sets, in particular, our main problem
formulation admits as a special case existing tensor completion problems. Other
special cases include transfer learning with multimodal side information and
multilinear multitask learning. For the latter case, our kernel-based view is
instrumental in deriving nonlinear extensions of existing model classes. We
give a novel algorithm and demonstrate in experiments the usefulness of the
proposed extensions.
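For readers who want the shape of the optimization problem, the following is one assumed, minimal way to write a tensor learning problem with a multilinear spectral penalty; the notation is illustrative, and the paper's representer theorem covers a broader family of penalties.

```latex
% One assumed, minimal instance of the problem class (notation illustrative):
% learn a coefficient tensor W over a tensor-product RKHS, with a nuclear-norm
% ("spectral") penalty on each mode-m unfolding W_(m).
\min_{\mathcal{W}} \;\sum_{i=1}^{n}
  \ell\Big( y_i,\ \big\langle \mathcal{W},\,
  \phi_1\big(x_i^{(1)}\big) \otimes \cdots \otimes \phi_M\big(x_i^{(M)}\big)
  \big\rangle \Big)
  \;+\; \lambda \sum_{m=1}^{M} \big\|\mathcal{W}_{(m)}\big\|_{*}
```

Tensor completion is recovered when each feature map is the canonical one on a finite discrete set, so the inner product simply reads off one entry of the coefficient tensor.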
Fast Adaptation with Linearized Neural Networks
The inductive biases of trained neural networks are difficult to understand
and, consequently, to adapt to new settings. We study the inductive biases of
linearizations of neural networks, which we show to be surprisingly good
summaries of the full network functions. Inspired by this finding, we propose a
technique for embedding these inductive biases into Gaussian processes through
a kernel designed from the Jacobian of the network. In this setting, domain
adaptation takes the form of interpretable posterior inference, with
accompanying uncertainty estimation. This inference is analytic and free of
local optima issues found in standard techniques such as fine-tuning neural
network weights to a new task. We develop significant computational speed-ups
based on matrix multiplies, including a novel implementation for scalable
Fisher vector products. Our experiments on both image classification and
regression demonstrate the promise and convenience of this framework for
transfer learning, compared to neural network fine-tuning. Code is available at
https://github.com/amzn/xfer/tree/master/finite_ntk.
Comment: AISTATS 2021
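A minimal sketch of the core construction, assuming a scalar-output regression network. This is not the amzn/xfer implementation (which adds the scalable Fisher-vector-product machinery); it shows only the Jacobian kernel and the analytic GP posterior mean built from it.

```python
import torch

def param_jacobian(model, x):
    # Gradient of the scalar output at x w.r.t. all parameters, flattened:
    # one row of the Jacobian J(x) of the linearized network.
    out = model(x.unsqueeze(0)).squeeze()
    grads = torch.autograd.grad(out, list(model.parameters()))
    return torch.cat([g.reshape(-1) for g in grads])

def ntk_gram(model, X1, X2):
    # Finite-NTK kernel: k(x, x') = J(x) J(x')^T.
    J1 = torch.stack([param_jacobian(model, x) for x in X1])
    J2 = torch.stack([param_jacobian(model, x) for x in X2])
    return J1 @ J2.t()

def gp_posterior_mean(model, X_tr, y_tr, X_te, noise=1e-2):
    # Analytic adaptation: GP regression on the residuals of the pretrained
    # network, with the Jacobian kernel. No weights are updated.
    with torch.no_grad():
        resid = y_tr - model(X_tr).squeeze(-1)
        prior = model(X_te).squeeze(-1)
    K = ntk_gram(model, X_tr, X_tr) + noise * torch.eye(len(X_tr))
    alpha = torch.linalg.solve(K, resid)
    return prior + ntk_gram(model, X_te, X_tr) @ alpha
```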
Semi-supervised Deep Kernel Learning: Regression with Unlabeled Data by Minimizing Predictive Variance
Large amounts of labeled data are typically required to train deep learning
models. For many real-world problems, however, acquiring additional data can be
expensive or even impossible. We present semi-supervised deep kernel learning
(SSDKL), a semi-supervised regression model based on minimizing predictive
variance in the posterior regularization framework. SSDKL combines the
hierarchical representation learning of neural networks with the probabilistic
modeling capabilities of Gaussian processes. By leveraging unlabeled data, we
show improvements on a diverse set of real-world regression tasks over
supervised deep kernel learning and over semi-supervised methods such as VAT
and mean teacher adapted for regression.
Comment: In Proceedings of Neural Information Processing Systems (NeurIPS) 2018
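As a hedged sketch of the training objective (one reading of the abstract, not the authors' code): a neural feature extractor `net` feeds an RBF GP, and the loss combines the GP negative log marginal likelihood on labeled data with the mean posterior predictive variance on unlabeled data, weighted by a hypothetical knob `alpha`.

```python
import torch

def rbf(Z1, Z2, lengthscale=1.0):
    # Squared-exponential kernel on learned features.
    return torch.exp(-torch.cdist(Z1, Z2) ** 2 / (2 * lengthscale ** 2))

def ssdkl_loss(net, X_lab, y_lab, X_unlab, noise=1e-2, alpha=1.0):
    Z_l, Z_u = net(X_lab), net(X_unlab)
    K = rbf(Z_l, Z_l) + noise * torch.eye(len(X_lab))
    L = torch.linalg.cholesky(K)
    a = torch.cholesky_solve(y_lab.unsqueeze(1), L)     # K^{-1} y
    # GP negative log marginal likelihood on labeled data (up to constants).
    nll = 0.5 * (y_lab.unsqueeze(1) * a).sum() + torch.log(L.diagonal()).sum()
    # Posterior predictive variance at unlabeled inputs; minimizing its
    # mean is the semi-supervised regularizer.
    K_ul = rbf(Z_u, Z_l)
    v = torch.cholesky_solve(K_ul.t(), L)               # K^{-1} k(X, x_u)
    var = rbf(Z_u, Z_u).diagonal() - (K_ul * v.t()).sum(dim=1)
    return nll + alpha * var.mean()
```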
On the characterization of the numbers n such that any group of order n has a given property
One of the classical problems in group theory is determining the set of
positive integers n such that every group of order n has a particular property
P, such as being cyclic or abelian. We first present the Sylow theorems
and the idea of solvable groups, both of which will be invaluable in our
analysis. We then gather various solutions to this problem for cyclic, abelian,
nilpotent, and supersolvable groups, as well as groups with ordered Sylow
towers.
This work is an exposition of known results, but it is hoped that the reader
will find useful the presentation in a single account of the various tools that
have been used to solve this general problem. This article claims no
originality, but is meant as a synthesis of related knowledge and resources.
Comment: Undergraduate Honors Thesis in Mathematics
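As a standard worked instance of the problem studied here (textbook material, not drawn from the thesis itself), the Sylow theorems settle the case n = 15:

```latex
% Worked instance: every group of order 15 is cyclic.
Let $|G| = 15$. By the Sylow theorems, $n_3 \equiv 1 \pmod{3}$ and
$n_3 \mid 5$, so $n_3 = 1$; likewise $n_5 \equiv 1 \pmod{5}$ and
$n_5 \mid 3$, so $n_5 = 1$. The unique Sylow subgroups $P_3$ and $P_5$ are
normal, intersect trivially, and satisfy $|P_3|\,|P_5| = |G|$, hence
$G \cong P_3 \times P_5 \cong \mathbb{Z}_3 \times \mathbb{Z}_5
   \cong \mathbb{Z}_{15}$.
% More generally, every group of order $n$ is cyclic if and only if
% $\gcd(n, \varphi(n)) = 1$.
```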