Search CORE

511,582 research outputs found

Online Learning of a Memory for Learning Rates

Author: Kappler Daniel
Meier Franziska
Schaal Stefan
Publication date: 01/01/2018

The promise of learning to learn for robotics rests on the hope that by extracting some information about the learning process itself we can speed up subsequent similar learning tasks. Here, we introduce a computationally efficient online meta-learning algorithm that builds and optimizes a memory model of the optimal learning rate landscape from previously observed gradient behaviors. While performing task-specific optimization, this memory of learning rates predicts how to scale currently observed gradients. After applying the gradient scaling, our meta-learner updates its internal memory based on the observed effect of its prediction. Our meta-learner can be combined with any gradient-based optimizer, learns on the fly, and can be transferred to new optimization tasks. In our evaluations we show that our meta-learning algorithm speeds up learning of MNIST classification and a variety of learning control tasks, in both batch and online learning settings. Comment: accepted to ICRA 2018, code available: https://github.com/fmeier/online-meta-learning ; video pitch available: https://youtu.be/9PzQ25FPPO

arXiv.org e-Print Archive

MPG.PuRe

Meta-Tracker: Fast and Robust Online Adaptation for Visual Object Trackers

Author: B Babenko
D Held
H Grabner
J Schmidhuber
JF Henriques
K Zhang
L Bertinetto
M Kristan
O Russakovsky
S Hare
S Hochreiter
S Thrun
Y Wu
Y-X Wang
Z Kalal
Publication date: 19/03/2018

This paper improves state-of-the-art visual object trackers that use online adaptation. Our core contribution is an offline meta-learning-based method to adjust the initial deep networks used in online adaptation-based tracking. The meta-learning is driven by the goal of obtaining deep networks that can quickly be adapted to robustly model a particular target in future frames. Ideally, the resulting models focus on features that are useful for future frames and avoid overfitting to background clutter, small parts of the target, or noise. By enforcing a small number of update iterations during meta-learning, the resulting networks train significantly faster. We demonstrate this approach on top of two high-performance tracking approaches: the tracking-by-detection-based MDNet and the correlation-based CREST. Experimental results on standard benchmarks, OTB2015 and VOT2016, show that our meta-learned versions of both trackers improve speed, accuracy, and robustness. Comment: Code: https://github.com/silverbottlep/meta_tracker

arXiv.org e-Print Archive

Crossref

Online Meta-learning by Parallel Algorithm Competition

Author: Baker James E.
Bertsekas D. P.
Downey Carlton
Gabillon V.
Goodfellow Ian
Mnih Volodymyr
Snoek Jasper
Springenberg Jost T.
Sutton S.
Szita I.
Unemi T.
Wu Jian
Publication date: 24/02/2017

The efficiency of reinforcement learning algorithms depends critically on a few meta-parameters that modulate the learning updates and the trade-off between exploration and exploitation. The adaptation of the meta-parameters is an open question in reinforcement learning, which arguably has become more of an issue recently with the success of deep reinforcement learning in high-dimensional state spaces. The long learning times in domains such as Atari 2600 video games make it infeasible to perform comprehensive searches of appropriate meta-parameter values. We propose the Online Meta-learning by Parallel Algorithm Competition (OMPAC) method. In the OMPAC method, several instances of a reinforcement learning algorithm are run in parallel with small differences in the initial values of the meta-parameters. After a fixed number of episodes, the instances are selected based on their performance in the task at hand. Before continuing the learning, Gaussian noise is added to the meta-parameters with a predefined probability. We validate the OMPAC method by improving the state-of-the-art results in stochastic SZ-Tetris and in standard Tetris with a smaller, 10×10, board, by 31% and 84%, respectively, and by improving the results for deep Sarsa(λ) agents in three Atari 2600 games by 62% or more. The experiments also show the ability of the OMPAC method to adapt the meta-parameters according to the learning progress in different tasks. Comment: 15 pages, 10 figures. arXiv admin note: text overlap with arXiv:1702.0311
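The loop described in this abstract (parallel instances, periodic selection, Gaussian perturbation of meta-parameters) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy objective `run_episodes`, the "top half overwrites bottom half" selection rule, the clamping range for the learning rate, and all names and default values are hypothetical stand-ins chosen for the example.

```python
import random

def run_episodes(weight, learning_rate, n_steps=5):
    """Toy stand-in for RL training: gradient descent on f(w) = w**2."""
    for _ in range(n_steps):
        weight -= learning_rate * 2.0 * weight
    return weight, -weight ** 2  # score: higher is better

def ompac_sketch(n_instances=8, n_generations=20, noise_prob=0.5,
                 noise_std=0.02, seed=0):
    rng = random.Random(seed)
    # Instances start with small differences in the meta-parameter (here: lr).
    population = [{"w": 5.0, "lr": 0.01 + 0.005 * rng.random(), "score": 0.0}
                  for _ in range(n_instances)]
    for _ in range(n_generations):
        # Each instance learns for a fixed budget with its own meta-parameter.
        for inst in population:
            inst["w"], inst["score"] = run_episodes(inst["w"], inst["lr"])
        # Selection (assumed rule): copies of the top half replace the bottom half.
        population.sort(key=lambda inst: inst["score"], reverse=True)
        half = n_instances // 2
        for i in range(half):
            population[half + i] = dict(population[i])
        # With a predefined probability, add Gaussian noise to the meta-parameter.
        for inst in population:
            if rng.random() < noise_prob:
                noisy = inst["lr"] + rng.gauss(0.0, noise_std)
                inst["lr"] = min(0.9, max(1e-4, noisy))  # keep the toy problem stable
    return population
```

Calling `ompac_sketch()` returns the final population; inspecting the `lr` values shows how selection plus noise drifts the meta-parameter toward values that learn faster on the toy problem.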

arXiv.org e-Print Archive

Crossref

Alpha MAML: Adaptive Model-Agnostic Meta-Learning

Author: Baydin Atılım Güneş
Behl Harkirat Singh
Torr Philip H. S.
Publication date: 17/05/2019

Model-agnostic meta-learning (MAML) is a meta-learning technique to train a model on a multitude of learning tasks in a way that primes the model for few-shot learning of new tasks. The MAML algorithm performs well on few-shot learning problems in classification, regression, and fine-tuning of policy gradients in reinforcement learning, but comes with the need for costly hyperparameter tuning for training stability. We address this shortcoming by introducing an extension to MAML, called Alpha MAML, to incorporate an online hyperparameter adaptation scheme that eliminates the need to tune meta-learning and learning rates. Our results with the Omniglot database demonstrate a substantial reduction in the need to tune MAML training hyperparameters and improved training stability with less sensitivity to hyperparameter choice. Comment: 6th ICML Workshop on Automated Machine Learning (2019)
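As a rough illustration of what online learning-rate adaptation can look like, the sketch below applies a hypergradient-style update to plain gradient descent: the learning rate grows when consecutive gradients agree and shrinks when they point in opposing directions. The abstract does not spell out the adaptation rule, so this is an assumption for illustration only, not the Alpha MAML algorithm itself; the function name, the parameter names `alpha` and `beta`, and their defaults are hypothetical.

```python
def hypergradient_sgd(grad_fn, theta, alpha=0.01, beta=0.001, n_steps=100):
    """Gradient descent whose learning rate alpha is adapted online.

    grad_fn: maps a list of parameters to a list of gradients.
    beta:    step size for the learning-rate update (assumed value).
    """
    prev_grad = [0.0] * len(theta)
    for _ in range(n_steps):
        grad = grad_fn(theta)
        # Adapt alpha using the dot product of current and previous gradients.
        alpha += beta * sum(g * pg for g, pg in zip(grad, prev_grad))
        # Ordinary gradient step with the freshly adapted learning rate.
        theta = [t - alpha * g for t, g in zip(theta, grad)]
        prev_grad = grad
    return theta, alpha

# Usage on a toy quadratic f(theta) = 0.5 * ||theta||^2, whose gradient is theta.
final_theta, final_alpha = hypergradient_sgd(lambda th: list(th), [1.0, -2.0])
```

The design point is that the initial learning rate matters less, because the optimizer corrects it during training rather than relying on an offline search.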

arXiv.org e-Print Archive

Oxford University Research Archive

Evaluation of Evidence-Based Practices in Online Learning: A Meta-Analysis and Review of Online Learning Studies

Author: Bakia Marianne
Jones Karla
Means Barbara
Murphy Robert
Toyama Yuki
Publication venue: Centre for Learning Technology
Publication date: 01/01/2009

A systematic search of the research literature from 1996 through July 2008 identified more than a thousand empirical studies of online learning. Analysts screened these studies to find those that (a) contrasted an online condition with a face-to-face condition, (b) measured student learning outcomes, (c) used a rigorous research design, and (d) provided adequate information to calculate an effect size. As a result of this screening, 51 independent effects were identified that could be subjected to meta-analysis. The meta-analysis found that, on average, students in online learning conditions performed better than those receiving face-to-face instruction. The difference between student outcomes for online and face-to-face classes (measured as the difference between treatment and control means, divided by the pooled standard deviation) was larger in those studies contrasting conditions that blended elements of online and face-to-face instruction with conditions taught entirely face-to-face. Analysts noted that these blended conditions often included additional learning time and instructional elements not received by students in control conditions. This finding suggests that the positive effects associated with blended learning should not be attributed to the media, per se. An unexpected finding was the small number of rigorous published studies contrasting online and face-to-face learning conditions for K–12 students. In light of this small corpus, caution is required in generalizing to the K–12 population because the results are derived for the most part from studies in other settings (e.g., medical training, higher education).
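The effect size described in this abstract (difference between treatment and control means, divided by the pooled standard deviation) can be computed as in the sketch below. The abstract does not state how the standard deviation was pooled, so the sample-size-weighted formula used here is an assumption, and the function name and example scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def standardized_mean_difference(treatment, control):
    """(mean_treatment - mean_control) / pooled standard deviation."""
    n_t, n_c = len(treatment), len(control)
    # Pooled variance: sample variances weighted by their degrees of freedom
    # (assumed pooling rule; the source does not specify one).
    pooled_var = ((n_t - 1) * variance(treatment)
                  + (n_c - 1) * variance(control)) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)

# Hypothetical scores: online (treatment) vs. face-to-face (control).
effect = standardized_mean_difference([2.0, 4.0, 6.0], [1.0, 3.0, 5.0])
```

A positive value means the treatment group scored higher on average, expressed in units of the pooled standard deviation.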

ALT Open Access Repository