A Sample Complexity Separation between Non-Convex and Convex Meta-Learning
One popular trend in meta-learning is to learn from many training tasks a
common initialization for a gradient-based method that can be used to solve a
new task with few samples. The theory of meta-learning is still in its early
stages, with several recent learning-theoretic analyses of methods such as
Reptile [Nichol et al., 2018] being for convex models. This work shows that
convex-case analysis might be insufficient to understand the success of
meta-learning, and that even for non-convex models it is important to look
inside the optimization black-box, specifically at properties of the
optimization trajectory. We construct a simple meta-learning instance that
captures the problem of one-dimensional subspace learning. For the convex
formulation of linear regression on this instance, we show that the new task
sample complexity of any initialization-based meta-learning algorithm is
Ω(d), where d is the input dimension. In contrast, for the non-convex
formulation of a two layer linear network on the same instance, we show that
both Reptile and multi-task representation learning can have new task sample
complexity of O(1), demonstrating a separation from convex
meta-learning. Crucially, analyses of the training dynamics of these methods
reveal that they can meta-learn the correct subspace onto which the data should
be projected.
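
To make the setup concrete, here is a minimal sketch of Reptile [Nichol et al., 2018] applied to a toy one-dimensional-subspace regression family with a two-layer linear network. It is not the paper's exact construction or proof setting: the input dimension, hidden width, step sizes, iteration counts, and task sampler below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 20, 5               # input dim and hidden width (illustrative choices)
u = rng.standard_normal(d)
u /= np.linalg.norm(u)     # shared 1-D subspace direction across all tasks

def sample_task(n=32):
    """Linear-regression task whose true weight vector lies on span(u)."""
    a = rng.standard_normal()            # task-specific scalar coefficient
    X = rng.standard_normal((n, d))
    return X, X @ (a * u)

def inner_sgd(W1, w2, X, y, lr=0.05, steps=10):
    """A few gradient steps on the squared loss of the two-layer linear net."""
    n = len(y)
    for _ in range(steps):
        r = X @ W1.T @ w2 - y            # residuals of f(x) = w2^T W1 x
        gW1 = np.outer(w2, X.T @ r) / n
        gw2 = W1 @ (X.T @ r) / n
        W1, w2 = W1 - lr * gW1, w2 - lr * gw2
    return W1, w2

# Reptile outer loop: nudge the initialization toward each task's adapted weights.
W1 = 0.1 * rng.standard_normal((k, d))
w2 = 0.1 * rng.standard_normal(k)
outer_lr = 0.1
for _ in range(2000):
    X, y = sample_task()
    W1_t, w2_t = inner_sgd(W1.copy(), w2.copy(), X, y)
    W1 += outer_lr * (W1_t - W1)
    w2 += outer_lr * (w2_t - w2)

# Diagnostic: how concentrated the learned first layer is on the direction u.
proj = np.linalg.norm(W1 @ u) / np.linalg.norm(W1)
print(f"fraction of W1's energy along u: {proj:.3f}")
```

The diagnostic at the end mirrors the abstract's claim: if the training dynamics meta-learn the correct subspace, most of the first layer's energy should concentrate along u, so a new task needs only the scalar coefficient, rather than a d-dimensional weight vector, to be estimated from fresh samples.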