2 research outputs found
Meta-Learning with Hessian-Free Approach in Deep Neural Nets Training
Meta-learning is a promising approach to efficient training of deep neural
networks and has attracted increasing interest in recent years. However, most
current methods are still not capable of training complex neural network models
over a long training process. In this paper, a novel second-order
meta-optimizer, named the Meta-Learning with Hessian-Free (MLHF) approach, is
proposed based on the Hessian-Free approach. Two recurrent neural networks are
established to generate the damping and the preconditioning matrix of this
Hessian-Free framework. A series of techniques is proposed to stabilize and
reinforce the meta-training of this optimizer, including the gradient
calculation. Numerical experiments on deep convolutional neural networks,
including CUDA-convnet and ResNet18(v2), with the CIFAR10 and ILSVRC2012
datasets, indicate that MLHF maintains good training performance throughout the
whole long training process, i.e., both the rapidly decreasing early stage and
the steadily decreasing later stage, and is thus a promising meta-learning
framework for improving training efficiency in real-world deep neural networks.
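The Hessian-Free framework this abstract builds on solves a damped Newton system (H + λI)d = −g by preconditioned conjugate gradients, using Hessian-vector products in place of an explicit Hessian. A minimal sketch (the function names, finite-difference HVP, and toy quadratic are our own illustration; in MLHF the damping λ and the preconditioner would be generated by the two recurrent networks, not fixed as here):

```python
import numpy as np

def hvp(grad_fn, theta, v, eps=1e-4):
    # Finite-difference Hessian-vector product: H v ≈ (g(θ+εv) − g(θ−εv)) / 2ε.
    return (grad_fn(theta + eps * v) - grad_fn(theta - eps * v)) / (2 * eps)

def hf_step(grad_fn, theta, damping, precond_diag, cg_iters=50, tol=1e-10):
    """One Hessian-Free update: solve (H + λI) d = −g with preconditioned CG.
    `damping` and `precond_diag` stand in for the outputs of the paper's
    recurrent networks (an assumption for this sketch)."""
    g = grad_fn(theta)
    d = np.zeros_like(theta)
    r = -g - (hvp(grad_fn, theta, d) + damping * d)   # residual b − A d
    z = r / precond_diag                               # apply M⁻¹ (diagonal)
    p = z.copy()
    rz = r @ z
    for _ in range(cg_iters):
        Ap = hvp(grad_fn, theta, p) + damping * p
        alpha = rz / (p @ Ap)
        d += alpha * p
        r -= alpha * Ap
        if r @ r < tol:
            break
        z = r / precond_diag
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return theta + d

# Toy quadratic: L(θ) = ½ θᵀAθ − bᵀθ, so ∇L = Aθ − b and H = A.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b_vec = np.array([1.0, 1.0])
grad = lambda th: A @ th - b_vec
theta = hf_step(grad, np.zeros(2), damping=1e-3, precond_diag=np.diag(A))
```

On this quadratic the step recovers the damped Newton solution (A + λI)⁻¹b; on a real network the same loop runs with autograd Hessian-vector products instead of finite differences.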
Model-Agnostic Meta-Learning using Runge-Kutta Methods
Meta-learning has emerged as an important framework for learning new tasks
from just a few examples. The success of any meta-learning model depends on (i)
its fast adaptation to new tasks, as well as (ii) having a shared
representation across similar tasks. Here we extend the model-agnostic
meta-learning (MAML) framework introduced by Finn et al. (2017) to achieve
improved performance by analyzing the temporal dynamics of the optimization
procedure via the Runge-Kutta method. This method enables us to gain
fine-grained control over the optimization and helps us achieve both the
adaptation and representation goals across tasks. By leveraging this refined
control, we demonstrate that there are multiple principled ways to update MAML
and show that the classic MAML optimization is simply a special case of a
second-order Runge-Kutta method that mainly focuses on fast adaptation.
Experiments on benchmark classification, regression, and reinforcement learning
tasks show that this refined control helps attain improved results.
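The Runge-Kutta view treats the inner-loop update as a numerical integration step on the gradient flow dθ/dt = −∇L(θ): the classic MAML inner update is the explicit Euler step, and higher-order Runge-Kutta schemes give the finer-grained control the abstract describes. A minimal sketch on a toy regression task (the task and function names are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def grad_loss(theta, x, y):
    # Gradient of the squared error for a linear model y ≈ θ·x (toy task).
    return 2 * (theta @ x - y) * x

def euler_step(theta, lr, x, y):
    # Classic MAML inner update: one explicit Euler step on dθ/dt = −∇L(θ).
    return theta - lr * grad_loss(theta, x, y)

def rk4_step(theta, lr, x, y):
    # Fourth-order Runge-Kutta step on the same gradient flow — the style of
    # higher-order update the paper explores (coefficients are standard RK4).
    k1 = -grad_loss(theta, x, y)
    k2 = -grad_loss(theta + 0.5 * lr * k1, x, y)
    k3 = -grad_loss(theta + 0.5 * lr * k2, x, y)
    k4 = -grad_loss(theta + lr * k3, x, y)
    return theta + (lr / 6) * (k1 + 2 * k2 + 2 * k3 + k4)

x = np.array([1.0, 2.0]); y = 3.0
theta0 = np.zeros(2)
theta_euler = euler_step(theta0, 0.1, x, y)   # MAML's inner step
theta_rk4 = rk4_step(theta0, 0.1, x, y)       # refined RK update
```

Both steps integrate the same gradient flow; the RK step evaluates the gradient at intermediate points, which is where the extra control over the optimization dynamics comes from.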