Scalable Greedy Algorithms for Transfer Learning
In this paper we consider the binary transfer learning problem, focusing on how to select and combine sources from a large pool to yield good performance on a target task. To keep the scenario realistic, we do not assume direct access to the source data, but rather employ the source hypotheses trained from them. Building on the literature on the best subset selection problem, we propose an efficient algorithm that selects relevant source hypotheses and feature dimensions simultaneously. Our algorithm achieves state-of-the-art results on three computer vision datasets, substantially outperforming both transfer learning and popular feature selection baselines in a small-sample setting. We also present a randomized variant that achieves the same results with computational cost independent of the number of source hypotheses and feature dimensions. Finally, we theoretically prove that, under reasonable assumptions on the source hypotheses, our algorithm can learn effectively from few examples.
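The abstract does not spell out the selection procedure, but the general shape of greedy forward selection over pre-trained source hypotheses can be sketched as follows. This is a minimal illustration, not the authors' algorithm: scoring candidates by validation error, combining hypotheses by least squares, and the array interface are all assumptions.

```python
import numpy as np

def greedy_source_selection(train_preds, y_train, val_preds, y_val, k):
    """Greedily pick up to k source hypotheses whose stacked predictions,
    combined by least squares, best fit the target task on a held-out split.

    train_preds / val_preds: (n, S) arrays; column s holds the scores of
    source hypothesis s on the target train / validation examples.
    y_train / y_val: binary labels in {-1, +1}.
    """
    selected = []
    remaining = list(range(train_preds.shape[1]))
    best_err, best_w = np.inf, None
    for _ in range(k):
        best_cand = None
        for s in remaining:
            cols = selected + [s]
            X, Xv = train_preds[:, cols], val_preds[:, cols]
            # Least-squares combination of the candidate subset.
            w, *_ = np.linalg.lstsq(X, y_train, rcond=None)
            err = np.mean(np.sign(Xv @ w) != y_val)
            if err < best_err:
                best_err, best_cand, best_w = err, s, w
        if best_cand is None:  # no candidate improved validation error
            break
        selected.append(best_cand)
        remaining.remove(best_cand)
    return selected, best_w, best_err
```

The same loop extends naturally to selecting feature dimensions alongside hypotheses by treating raw features as additional columns of the prediction matrix.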
Distral: Robust Multitask Reinforcement Learning
Most deep reinforcement learning algorithms are data inefficient in complex and rich environments, limiting their applicability to many scenarios. One direction for improving data efficiency is multitask learning with shared neural network parameters, where efficiency may be improved through transfer across related tasks. In practice, however, this is not usually observed, because gradients from different tasks can interfere negatively, making learning unstable and sometimes even less data efficient. Another issue is the differing reward schemes between tasks, which can easily lead to one task dominating the learning of a shared model. We propose a new approach for joint training of multiple tasks, which we refer to as Distral (Distill & transfer learning). Instead of sharing parameters between the different workers, we propose to share a "distilled" policy that captures common behaviour across tasks. Each worker is trained to solve its own task while constrained to stay close to the shared policy; the shared policy, in turn, is trained by distillation to be the centroid of all task policies. Both aspects of the learning process are derived by optimizing a joint objective function. We show that our approach supports efficient transfer on complex 3D environments, outperforming several related methods. Moreover, the proposed learning process is more robust and more stable, attributes that are critical in deep reinforcement learning.
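As a rough illustration of the two coupled updates described above, here is a minimal PyTorch-style sketch of one training step: each task policy is penalized for diverging (in KL) from the shared distilled policy, while the distilled policy is trained by distillation toward the task policies. The loss coefficient, the linear policy networks, and the simplified single-step setting are assumptions for illustration; the paper derives both updates from a single joint objective.

```python
import torch
import torch.nn.functional as F

# Hypothetical toy setup: discrete actions, per-task policies pi_i and a
# shared "distilled" policy pi_0, all mapping states to action logits.
n_actions, state_dim, n_tasks = 4, 8, 3
task_policies = [torch.nn.Linear(state_dim, n_actions) for _ in range(n_tasks)]
distilled = torch.nn.Linear(state_dim, n_actions)
opts = [torch.optim.Adam(p.parameters(), lr=1e-3) for p in task_policies]
opt0 = torch.optim.Adam(distilled.parameters(), lr=1e-3)
beta = 0.5  # strength of the stay-close-to-shared-policy penalty (assumed)

def train_step(states, actions, advantages, task_id):
    """One illustrative update for task `task_id`.

    states: (B, state_dim) float tensor; actions: (B,) int64 taken actions;
    advantages: (B,) policy-gradient advantages from the task's rollouts.
    """
    pi = task_policies[task_id]
    logp = F.log_softmax(pi(states), dim=-1)
    logp0 = F.log_softmax(distilled(states), dim=-1)

    # Task policy: maximize advantage-weighted log-prob, but stay close
    # (in KL) to the shared distilled policy.
    kl = (logp.exp() * (logp - logp0.detach())).sum(-1).mean()
    pg = -(advantages * logp.gather(1, actions[:, None]).squeeze(1)).mean()
    loss_task = pg + beta * kl
    opts[task_id].zero_grad(); loss_task.backward(); opts[task_id].step()

    # Distilled policy: move toward the task policy by distillation
    # (cross-entropy against the task policy's action distribution).
    with torch.no_grad():
        target = F.softmax(pi(states), dim=-1)
    loss_distill = -(target * F.log_softmax(distilled(states), dim=-1)).sum(-1).mean()
    opt0.zero_grad(); loss_distill.backward(); opt0.step()
```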
Transfer learning through greedy subset selection
We study the binary transfer learning problem, focusing on how to select sources from a large pool and how to combine them to yield good performance on a target task. In particular, we consider the transfer learning setting where one does not have direct access to the source data, but rather employs the source hypotheses trained from them. Building on the literature on the best subset selection problem, we propose an efficient algorithm that selects relevant source hypotheses and feature dimensions simultaneously. On three computer vision datasets we achieve state-of-the-art results, substantially outperforming transfer learning and popular feature selection baselines in a small-sample setting. We also theoretically prove that, under reasonable assumptions on the source hypotheses, our algorithm can learn effectively from few examples.
Learning to Invert: Signal Recovery via Deep Convolutional Networks
The promise of compressive sensing (CS) has been offset by two significant
challenges. First, real-world data is not exactly sparse in a fixed basis.
Second, current high-performance recovery algorithms are slow to converge,
which limits CS to either non-real-time applications or scenarios where massive
back-end computing is available. In this paper, we attack both of these challenges head-on by developing a new signal recovery framework we call DeepInverse that learns the inverse transformation from measurement vectors to signals using a deep convolutional network. When trained on a set of representative images, the network learns both a representation for the signals (addressing challenge one) and an inverse map approximating a greedy or convex recovery algorithm (addressing challenge two). Our experiments indicate that the DeepInverse network closely approximates the solution produced by state-of-the-art CS recovery algorithms yet is hundreds of times faster in run time. The tradeoff for the ultrafast run time is a computationally intensive, off-line training procedure typical of deep networks. However, the training needs to be completed only once, which makes the approach attractive for a host of sparse recovery problems.
Comment: Accepted at the 42nd IEEE International Conference on Acoustics, Speech and Signal Processing.
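The abstract describes learning an inverse map from CS measurements back to signals with a convolutional network. A minimal PyTorch-flavored sketch of that idea might look as follows; the layer sizes, the fixed random sensing matrix, the lift-then-refine structure, and the MSE loss are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class DeepInverseSketch(nn.Module):
    """Illustrative net: lift M measurements back to an N-pixel image,
    then refine with convolutions (layer sizes are assumptions)."""
    def __init__(self, n=32 * 32, m=256):
        super().__init__()
        self.lift = nn.Linear(m, n)  # measurements -> image-sized proxy
        self.refine = nn.Sequential(
            nn.Conv2d(1, 16, 5, padding=2), nn.ReLU(),
            nn.Conv2d(16, 16, 5, padding=2), nn.ReLU(),
            nn.Conv2d(16, 1, 5, padding=2),
        )

    def forward(self, y):
        x0 = self.lift(y).view(-1, 1, 32, 32)
        return self.refine(x0)

# Training pairs: x is a representative image, y = Phi x its measurements.
torch.manual_seed(0)
phi = torch.randn(256, 32 * 32) / 16.0  # fixed random sensing matrix
x = torch.rand(64, 32 * 32)             # stand-in "training images"
y = x @ phi.T

net = DeepInverseSketch()
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(5):  # a few illustrative gradient steps
    loss = nn.functional.mse_loss(net(y), x.view(-1, 1, 32, 32))
    opt.zero_grad(); loss.backward(); opt.step()
```

At test time only the forward pass is needed, which is where the claimed speedup over iterative greedy or convex solvers comes from: all the expensive work happens once, offline, during training.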