Fast Context Adaptation via Meta-Learning
We propose CAVIA for meta-learning, a simple extension to MAML that is less
prone to meta-overfitting, easier to parallelise, and more interpretable. CAVIA
partitions the model parameters into two parts: context parameters that serve
as additional input to the model and are adapted on individual tasks, and
shared parameters that are meta-trained and shared across tasks. At test time,
only the context parameters are updated, leading to a low-dimensional task
representation. We show empirically that CAVIA outperforms MAML for regression,
classification, and reinforcement learning. Our experiments also highlight
weaknesses in current benchmarks, in that the amount of adaptation needed in
some cases is small.
Comment: Published at the International Conference on Machine Learning (ICML)
201
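The split between context parameters and shared parameters described in the abstract above can be illustrated with a minimal sketch. It assumes a toy linear model with a scalar input; the function names, the model form, the zero initialisation of the context, and the learning-rate and step settings are all hypothetical choices for illustration, not taken from the paper. The meta-training of the shared parameters (the outer loop) is omitted, and only the test-time adaptation of the context is shown.

```python
# Toy illustration (not the authors' implementation): the shared parameters
# stay fixed at test time, and only a small context vector is adapted.

def predict(x, shared, context):
    """Linear model whose input is x plus the context vector."""
    w_x, w_ctx, b = shared
    return w_x * x + sum(w * c for w, c in zip(w_ctx, context)) + b


def mse(xs, ys, shared, context):
    """Mean squared error of the model on one task's data."""
    return sum((predict(x, shared, context) - y) ** 2
               for x, y in zip(xs, ys)) / len(xs)


def adapt_context(xs, ys, shared, n_context, lr=0.1, steps=50):
    """Gradient descent on the context parameters only.

    Assumption of this sketch: the context starts at zero for every task.
    """
    _, w_ctx, _ = shared
    context = [0.0] * n_context
    for _ in range(steps):
        # d(MSE)/d(context_j) = 2 * mean(prediction - target) * w_ctx[j]
        mean_err = sum(predict(x, shared, context) - y
                       for x, y in zip(xs, ys)) / len(xs)
        grad = [2.0 * mean_err * w for w in w_ctx]
        context = [c - lr * g for c, g in zip(context, grad)]
    return context


# Hypothetical shared parameters, standing in for a meta-trained model.
shared = (2.0, [1.0, 0.5], 0.0)

# One task: y = 2x + 3. The offset is what the context has to capture.
xs, ys = [0.0, 1.0, 2.0], [3.0, 5.0, 7.0]

context = adapt_context(xs, ys, shared, n_context=2)
loss_before = mse(xs, ys, shared, [0.0, 0.0])
loss_after = mse(xs, ys, shared, context)
```

In this sketch the task is summarised entirely by the two-element context vector, which is the low-dimensional task representation the abstract refers to; the shared tuple is never modified during adaptation.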
Efficiently Combining Human Demonstrations and Interventions for Safe Training of Autonomous Systems in Real-Time
This paper investigates how to utilize different forms of human interaction
to safely train autonomous systems in real-time by learning from both human
demonstrations and interventions. We implement two components of the
Cycle-of-Learning for Autonomous Systems, which is our framework for combining
multiple modalities of human interaction. The current effort employs human
demonstrations to teach a desired behavior via imitation learning, then
leverages intervention data to correct for undesired behaviors produced by the
imitation learner to teach novel tasks to an autonomous agent safely, after
only minutes of training. We demonstrate this method in an autonomous perching
task using a quadrotor with continuous roll, pitch, yaw, and throttle commands
and imagery captured from a downward-facing camera in a high-fidelity simulated
environment. Our method improves task completion performance for the same
amount of human interaction when compared to learning from demonstrations
alone, while also requiring on average 32% less data to achieve that
performance. This provides evidence that combining multiple modes of human
interaction can increase both the training speed and overall performance of
policies for autonomous systems.
Comment: 9 pages, 6 figures