An Optimization Framework for Semi-Supervised and Transfer Learning using Multiple Classifiers and Clusterers
Unsupervised models can provide supplementary soft constraints to help
classify new, "target" data since similar instances in the target set are more
likely to share the same class label. Such models can also help detect possible
differences between training and target distributions, which is useful in
applications where concept drift may take place, as in transfer learning
settings. This paper describes a general optimization framework that takes as
input class membership estimates from existing classifiers learnt on previously
encountered "source" data, as well as a similarity matrix from a cluster
ensemble operating solely on the target data to be classified, and yields a
consensus labeling of the target data. This framework admits a wide range of
loss functions and classification/clustering methods. It exploits properties of
Bregman divergences in conjunction with Legendre duality to yield a principled
and scalable approach. A variety of experiments show that the proposed
framework can yield results substantially superior to those provided by popular
transductive learning techniques or by naively applying classifiers learnt on
the original task to the target data.
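A minimal sketch of the idea, assuming a simple squared-loss instance rather than the paper's general Bregman-divergence formulation: blend the classifiers' soft labels with neighborhood averages under the cluster-ensemble similarity matrix. The function name and the fixed blending weight `alpha` are illustrative, not from the paper.

```python
import numpy as np

def consensus_labels(P, S, alpha=0.5, iters=50):
    """Blend classifier soft labels P (n x k) with a cluster-ensemble
    similarity matrix S (n x n) by iterative smoothing. Simplified
    squared-loss sketch, not the paper's Bregman formulation."""
    S = S / S.sum(axis=1, keepdims=True)      # row-normalize similarities
    Y = P.copy()
    for _ in range(iters):
        # trust the classifiers, pulled toward similar target instances
        Y = alpha * P + (1 - alpha) * S @ Y
    return Y.argmax(axis=1)
```

An ambiguous instance (uniform classifier output) that the cluster ensemble links to confidently labeled neighbors inherits their consensus label.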
Online Transfer Learning in Reinforcement Learning Domains
This paper proposes an online transfer framework to capture the interaction
among agents and shows that current transfer learning in reinforcement learning
is a special case of online transfer. Furthermore, this paper re-characterizes
existing agents-teaching-agents methods as online transfer and analyzes one such
teaching method in three ways. First, the convergence of Q-learning and Sarsa
with a tabular representation and a finite budget is proven. Second, the
convergence of Q-learning and Sarsa with linear function approximation is
established. Third, we show that asymptotic performance cannot be hurt by
teaching. Additionally, all theoretical results are empirically validated.
Comment: 18 pages, 2 figures
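The agents-teaching-agents setting with a finite budget can be sketched with tabular Q-learning on a toy chain MDP, where a teacher supplies the optimal action ("move right") until its advice budget runs out. The environment, function name, and parameters are illustrative assumptions, not the paper's experiments.

```python
import random

def q_learning_with_teaching(episodes=300, budget=20, alpha=0.5, gamma=0.9,
                             epsilon=0.1, seed=0):
    """Tabular Q-learning on a 5-state chain with reward 1 at the right
    end. A teacher with a finite advice budget supplies the optimal
    action for the first `budget` steps; once the budget is spent,
    learning continues as plain epsilon-greedy Q-learning, so asymptotic
    performance is unaffected (toy illustration of the paper's claim)."""
    rng = random.Random(seed)
    n = 5
    Q = [[0.0, 0.0] for _ in range(n)]        # actions: 0 = left, 1 = right
    for _ in range(episodes):
        s = 0
        while s != n - 1:
            if budget > 0:
                a, budget = 1, budget - 1     # teacher's advice
            elif rng.random() < epsilon:
                a = rng.randrange(2)          # explore
            else:
                a = 1 if Q[s][1] >= Q[s][0] else 0   # greedy
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == n - 1 else 0.0
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q
```

Early advice bootstraps value estimates along the optimal path; after budget exhaustion the update rule is unchanged, which is the intuition behind the no-harm asymptotic result.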
Deep Transfer Learning with Joint Adaptation Networks
Deep networks have been successfully applied to learn transferable features
for adapting models from a source domain to a different target domain. In this
paper, we present joint adaptation networks (JAN), which learn a transfer
network by aligning the joint distributions of multiple domain-specific layers
across domains based on a joint maximum mean discrepancy (JMMD) criterion.
An adversarial training strategy is adopted to maximize JMMD such that the
distributions of the source and target domains are made more distinguishable.
Learning can be performed by stochastic gradient descent with the gradients
computed by back-propagation in linear time. Experiments testify that our model
yields state-of-the-art results on standard datasets.
Comment: 34th International Conference on Machine Learning
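JMMD builds on the maximum mean discrepancy. A minimal single-layer, marginal MMD estimate under an RBF kernel looks like the sketch below (a biased estimator for illustration; JAN's JMMD criterion extends this to joint distributions over several network layers, which is not shown here).

```python
import numpy as np

def rbf_mmd2(X, Y, gamma=1.0):
    """Squared maximum mean discrepancy between sample sets X and Y
    under an RBF kernel (biased estimator, diagonal included).
    Illustrative single-layer case only."""
    def k(A, B):
        # pairwise squared Euclidean distances, then RBF kernel
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()
```

Identical sample sets give an MMD of zero; well-separated distributions give a value near the kernel's maximum discrepancy of 2.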
Interactive Reinforcement Learning with Dynamic Reuse of Prior Knowledge from Human/Agent's Demonstration
Reinforcement learning has enjoyed multiple successes in recent years.
However, these successes typically require very large amounts of data before an
agent achieves acceptable performance. This paper introduces a novel way of
combating such requirements by leveraging existing (human or agent) knowledge.
In particular, this paper uses demonstrations from agents and humans, allowing
an untrained agent to quickly achieve high performance. We empirically compare
with, and highlight the weaknesses of, HAT and CHAT, two methods of transferring
knowledge from a source agent/human to a target agent. This paper introduces an
effective transfer approach, DRoP, combining the offline knowledge
(demonstrations recorded before learning) with online confidence-based
performance analysis. DRoP dynamically involves the demonstrator's knowledge,
integrating it into the reinforcement learning agent's online learning loop to
achieve efficient and robust learning.
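One way the confidence-based blending might be sketched (a hypothetical simplification, not DRoP's exact formulation) is a per-state confidence score that arbitrates between the demonstrator's suggestion and the agent's own greedy policy, updated online from observed outcomes.

```python
import random

def drop_style_action(state, Q, demo_policy, conf, rng):
    """Pick between the demonstrator's suggestion and the agent's own
    greedy action using an online per-state confidence score.
    Hypothetical sketch: `conf[state]` estimates how well following the
    demonstration has paid off in this state (default 0.5)."""
    if rng.random() < conf.get(state, 0.5):
        return demo_policy(state)             # trust the demonstration
    return max(range(len(Q[state])), key=lambda a: Q[state][a])

def update_confidence(conf, state, reward, beta=0.1):
    """Move confidence toward 1 after positive outcomes, toward 0
    otherwise (exponential moving average, illustrative rule)."""
    target = 1.0 if reward > 0 else 0.0
    conf[state] = (1 - beta) * conf.get(state, 0.5) + beta * target
    return conf
```

As the agent's own estimates improve and demonstrations stop paying off, the confidence decays and control shifts smoothly to the learned policy.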
Multi-Adversarial Domain Adaptation
Recent advances in deep domain adaptation reveal that adversarial learning
can be embedded into deep networks to learn transferable features that reduce
distribution discrepancy between the source and target domains. Existing domain
adversarial adaptation methods based on a single domain discriminator only align
the source and target data distributions without exploiting the complex
multimode structures. In this paper, we present a multi-adversarial domain
adaptation (MADA) approach, which captures multimode structures to enable
fine-grained alignment of different data distributions based on multiple domain
discriminators. The adaptation can be achieved by stochastic gradient descent
with the gradients computed by back-propagation in linear time. Empirical
evidence demonstrates that the proposed model outperforms state-of-the-art
methods on standard domain adaptation datasets.
Comment: AAAI 2018 Oral. arXiv admin note: substantial text overlap with
arXiv:1705.10667, arXiv:1707.0790
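The per-class discriminator idea can be sketched as K logistic heads, each fed the features weighted by that sample's predicted probability of class k, so each mode of the distribution gets its own alignment signal. The linear heads `W` and the loss shape below are illustrative assumptions, not MADA's actual networks.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mada_loss(feats, class_probs, W, domain_label):
    """Sketch of a multi-adversarial domain loss: one logistic domain
    discriminator per class (hypothetical linear weights W[k], shape
    (K, d)), each fed features scaled by the predicted class-k
    probability, with binary cross-entropy against the domain label
    summed over the K heads."""
    n, K = class_probs.shape
    total = 0.0
    for k in range(K):
        logits = (class_probs[:, [k]] * feats) @ W[k]   # D_k(p_k * f)
        p = sigmoid(logits)
        y = float(domain_label)
        total += -np.mean(y * np.log(p + 1e-12)
                          + (1 - y) * np.log(1 - p + 1e-12))
    return total
```

Samples the classifier deems unlikely to belong to class k contribute little to discriminator k, which is the mechanism behind the fine-grained, mode-wise alignment.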
Cross-Domain Transfer in Reinforcement Learning using Target Apprentice
In this paper, we present a new approach to Transfer Learning (TL) in
Reinforcement Learning (RL) for cross-domain tasks. Many of the available
techniques approach the transfer architecture as a method of speeding up the
target task learning. We propose to adapt and reuse the mapped source-task
optimal policy directly in related domains. We show that the optimal policy from
a related source task can be near-optimal in the target domain, provided an
adaptive policy accounts for the model error between target and source. The main benefit
of this policy augmentation is generalizing policies across multiple related
domains without having to re-learn the new tasks. Our results show that this
architecture leads to better sample efficiency in the transfer, reducing sample
complexity of target task learning to target apprentice learning.
Comment: To appear as a conference paper in ICRA 201
Task Transfer by Preference-Based Cost Learning
The goal of task transfer in reinforcement learning is to migrate an agent's
action policy from the source task to the target task. Given their
successes on robotic action planning, current methods mostly rely on two
requirements: exactly-relevant expert demonstrations or the explicitly-coded
cost function on target task, both of which, however, are inconvenient to
obtain in practice. In this paper, we relax these two strong conditions by
developing a novel task transfer framework where the expert preference is
applied as guidance. In particular, we alternate between the following two
steps. First, experts apply pre-defined preference rules to select related
expert demonstrations for the target task. Second, based on the selection
result, we learn the target cost function and trajectory distribution
simultaneously via enhanced Adversarial MaxEnt IRL and generate more
trajectories by the learned target distribution for the next preference
selection. Theoretical analyses of the distribution learning and the
convergence of the proposed algorithm are provided. Extensive simulations on
several benchmarks have been conducted for further verifying the effectiveness
of the proposed method.
Comment: Accepted to AAAI 2019. Mingxuan Jing and Xiaojian Ma contributed
equally to this work
Bounds on the Minimax Rate for Estimating a Prior over a VC Class from Independent Learning Tasks
We study the optimal rates of convergence for estimating a prior distribution
over a VC class from a sequence of independent data sets respectively labeled
by independent target functions sampled from the prior. We specifically derive
upper and lower bounds on the optimal rates under a smoothness condition on the
correct prior, with the number of samples per data set equal to the VC dimension.
These results have implications for the improvements achievable via transfer
learning. We additionally extend this setting to real-valued functions, where we
establish consistency of an estimator for the prior, and discuss an additional
application to a preference elicitation problem in algorithmic economics.
Decomposition-Based Transfer Distance Metric Learning for Image Classification
Distance metric learning (DML) is a critical factor for image analysis and
pattern recognition. To learn a robust distance metric for a target task, we
need abundant side information (i.e., the similarity/dissimilarity pairwise
constraints over the labeled data), which is usually unavailable in practice
due to the high labeling cost. This paper considers the transfer learning
setting by exploiting the large quantity of side information from certain
related, but different source tasks to help with target metric learning (with
only a little side information). The state-of-the-art metric learning
algorithms usually fail in this setting because the data distributions of the
source task and target task are often quite different. We address this problem
by assuming that the target distance metric lies in the space spanned by the
eigenvectors of the source metrics (or other randomly generated bases). The
target metric is represented as a combination of the base metrics, which are
computed using the decomposed components of the source metrics (or simply a set
of random bases); we call the proposed method, decomposition-based transfer DML
(DTDML). In particular, DTDML learns a sparse combination of the base metrics
to construct the target metric by forcing the target metric to be close to an
integration of the source metrics. The main advantage of the proposed method
compared with existing transfer metric learning approaches is that we directly
learn the base metric coefficients instead of the target metric. To this end,
far fewer variables need to be learned. We therefore obtain more reliable
solutions given the limited side information and the optimization tends to be
faster. Experiments on the popular handwritten image (digit, letter)
classification and challenging natural image annotation tasks demonstrate the
effectiveness of the proposed method.
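A toy stand-in for learning the sparse base-metric coefficients: express the target metric as a sparse combination of base metrics and solve the L1-regularized least-squares problem with ISTA. This assumes a known reference metric for simplicity, whereas DTDML fits the combination to pairwise side information; the function name and parameters are illustrative.

```python
import numpy as np

def learn_combination(bases, target_ref, lam=0.01, lr=0.3, iters=500):
    """Learn sparse coefficients theta so that sum_i theta[i] * bases[i]
    approximates a reference metric: squared Frobenius loss with an L1
    penalty, optimized by ISTA (gradient step + soft thresholding).
    Toy sketch; DTDML uses side-information constraints instead of a
    known reference metric."""
    theta = np.zeros(len(bases))
    B = np.stack([b.ravel() for b in bases])   # (k, d*d) flattened bases
    t = target_ref.ravel()
    for _ in range(iters):
        grad = B @ (B.T @ theta - t)           # least-squares gradient
        theta = theta - lr * grad
        # soft-threshold: drives unneeded coefficients exactly to zero
        theta = np.sign(theta) * np.maximum(np.abs(theta) - lr * lam, 0.0)
    return theta
```

Only the handful of coefficients is learned, not the full d-by-d metric, which is the source of the method's reliability under scarce side information.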
Everything old is new again: A multi-view learning approach to learning using privileged information and distillation
We adopt a multi-view approach for analyzing two knowledge transfer
settings---learning using privileged information (LUPI) and distillation---in a
common framework. Under reasonable assumptions about the complexities of
hypothesis spaces, and being optimistic about the expected loss achievable by
the student (in distillation) and a transformed teacher predictor (in LUPI), we
show that encouraging agreement between the teacher and the student leads to a
reduced search space. As a result, an improved convergence rate can be obtained
with regularized empirical risk minimization.
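The teacher-student agreement term is concrete in standard Hinton-style distillation, one of the two settings this analysis covers. A minimal sketch of that objective (temperature `T` and mixing weight `w` are the usual hyperparameters, chosen here for illustration):

```python
import numpy as np

def softmax(z, T=1.0):
    z = np.asarray(z, dtype=float) / T
    z -= z.max(axis=-1, keepdims=True)         # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, w=0.5):
    """Standard distillation objective: cross-entropy with the true
    labels plus a KL 'agreement' term with the teacher's softened
    outputs (scaled by T^2, as is conventional)."""
    p_s = softmax(student_logits, T)
    p_t = softmax(teacher_logits, T)
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1)
    ce = -np.log(softmax(student_logits)[np.arange(len(labels)), labels]
                 + 1e-12)
    return np.mean((1 - w) * ce + w * (T * T) * kl)
```

When the student already agrees with the teacher the KL term vanishes, leaving only the supervised loss, which matches the intuition that agreement constrains (and thereby shrinks) the student's effective search space.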