Pushing the Limits of Simple Pipelines for Few-Shot Learning: External Data and Fine-Tuning Make a Difference
Few-shot learning (FSL) is an important and topical problem in computer
vision that has motivated extensive research into numerous approaches, ranging
from sophisticated meta-learning methods to simple transfer-learning baselines.
We seek to push the limits of a simple-but-effective pipeline for more
realistic and practical settings of few-shot image classification. To this end,
we explore few-shot learning from the perspective of neural network
architecture, as well as a three-stage pipeline of network updates under
different data supplies, where unsupervised external data is considered for
pre-training, base categories are used to simulate few-shot tasks for
meta-training, and the scarcely labelled data of a novel task is taken for
fine-tuning. We investigate questions such as: (1) How does pre-training on
external data benefit FSL? (2) How can state-of-the-art transformer
architectures be exploited? (3) How does fine-tuning mitigate domain shift?
Ultimately, we show
that a simple transformer-based pipeline yields surprisingly good performance
on standard benchmarks such as Mini-ImageNet, CIFAR-FS, CDFSL and Meta-Dataset.
Our code and demo are available at https://hushell.github.io/pmf
Comment: Accepted by CVPR 2022
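To make the three-stage pipeline concrete, here is a minimal PyTorch-style sketch of the recipe the abstract describes: load a pre-trained ViT as the external-data stage, meta-train it with a ProtoNet-style metric head on simulated episodes, then fine-tune on a novel task's support set. The helpers sample_episode, base_dataset and novel_task, plus the step counts, learning rates and temperature, are illustrative assumptions, not the authors' actual implementation (which is linked above).

```python
import timm  # any pretrained ViT checkpoint would do here
import torch
import torch.nn.functional as F

# Stage 1 (pre-training): the paper uses unsupervised external data;
# we load a pretrained ViT from timm as a stand-in for that stage.
backbone = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=0)

def prototype_logits(support_x, support_y, query_x, n_way):
    """ProtoNet-style logits: cosine similarity of query embeddings
    to per-class mean embeddings of the support set."""
    z_s = F.normalize(backbone(support_x), dim=-1)
    z_q = F.normalize(backbone(query_x), dim=-1)
    protos = torch.stack([z_s[support_y == c].mean(0) for c in range(n_way)])
    return 10.0 * z_q @ F.normalize(protos, dim=-1).t()  # 10.0: assumed temperature

# Stage 2 (meta-training): simulate few-shot episodes from base categories.
# sample_episode is a hypothetical helper returning one 5-way 5-shot task.
opt = torch.optim.AdamW(backbone.parameters(), lr=1e-5)
for step in range(1000):
    sx, sy, qx, qy = sample_episode(base_dataset, n_way=5, k_shot=5)
    loss = F.cross_entropy(prototype_logits(sx, sy, qx, n_way=5), qy)
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 3 (fine-tuning): adapt on the novel task's scarce support set,
# which is what mitigates domain shift on benchmarks like CDFSL.
sx, sy, qx = novel_task  # hypothetical (support_x, support_y, query_x)
ft_opt = torch.optim.AdamW(backbone.parameters(), lr=1e-6)
for _ in range(50):
    loss = F.cross_entropy(prototype_logits(sx, sy, sx, n_way=5), sy)
    ft_opt.zero_grad(); loss.backward(); ft_opt.step()
```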
How to train your MAML
The field of few-shot learning has recently seen substantial advancements.
Most of these advancements came from casting few-shot learning as a
meta-learning problem. Model-Agnostic Meta-Learning (MAML) is currently one of
the best approaches for few-shot learning via meta-learning. MAML is simple,
elegant and very powerful; however, it has a variety of issues: it is very
sensitive to neural network architectures, often unstable during training,
requires arduous hyperparameter searches to stabilize training and achieve
high generalization, and is very computationally expensive at both training
and inference times. In this paper, we propose various modifications to MAML
that not only stabilize the system but also substantially improve the
generalization performance and convergence speed
of MAML while reducing its computational overhead; we call the resulting method MAML++.
Comment: Published in ICLR 2019
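For reference, the update that these modifications target alternates a task-specific inner gradient step with an outer meta-update that differentiates through it. Below is a minimal second-order sketch using torch.func; the toy model, the fixed inner learning rate, and the sample_task episode sampler are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.func import functional_call

# Toy 5-way classifier; MAML is notoriously sensitive to this choice.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 5))
meta_opt = torch.optim.Adam(model.parameters(), lr=1e-3)
inner_lr = 0.4  # fixed here; MAML++ learns per-layer, per-step rates instead

def query_loss_after_adaptation(params, sx, sy, qx, qy):
    # Inner loop: one SGD step on the support set, keeping the graph
    # (create_graph=True) so the outer update can differentiate through it.
    s_loss = F.cross_entropy(functional_call(model, params, (sx,)), sy)
    grads = torch.autograd.grad(s_loss, list(params.values()), create_graph=True)
    fast = {k: p - inner_lr * g for (k, p), g in zip(params.items(), grads)}
    # Outer objective: the adapted ("fast") weights evaluated on the query set.
    return F.cross_entropy(functional_call(model, fast, (qx,)), qy)

for step in range(1000):
    sx, sy, qx, qy = sample_task()  # hypothetical N-way K-shot episode sampler
    meta_loss = query_loss_after_adaptation(dict(model.named_parameters()), sx, sy, qx, qy)
    meta_opt.zero_grad(); meta_loss.backward(); meta_opt.step()
```

MAML++'s fixes, such as a multi-step loss over every inner iteration, learned per-layer per-step inner learning rates, and per-step batch-normalisation statistics, slot into this loop in place of the fixed inner_lr and single inner step shown here.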
Neuromorphic Few-Shot Learning: Generalization in Multilayer Physical Neural Networks
Neuromorphic computing leverages the complex dynamics of physical systems for
computation. The field has recently undergone an explosion in the range and
sophistication of implementations, with rapidly improving performance.
Neuromorphic schemes typically employ a single physical system, limiting the
dimensionality and range of available dynamics, restricting strong performance
to a few specific tasks. This is a critical roadblock facing the field,
inhibiting the power and versatility of neuromorphic schemes.
Here, we present a solution. We engineer a diverse suite of nanomagnetic
arrays and show how tuning microstate space and geometry enables a broad range
of dynamics and computing performance. We interconnect arrays in parallel,
series and multilayered neural network architectures, where each network node
is a distinct physical system. This networked approach grants extremely high
dimensionality and enriched dynamics, enabling meta-learning to be implemented
on small training sets and yielding strong performance across a broad task
set. We showcase network performance via few-shot learning, rapidly adapting
on-the-fly to previously unseen tasks.
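As a purely software analogy for this networked approach (not the authors' nanomagnetic hardware), the sketch below wires several distinct fixed nonlinear "nodes" in parallel and in series and trains only a linear readout, in the reservoir-computing style; every size, signal and penalty value is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

class EchoNode:
    """Toy stand-in for one physical array: a fixed random recurrent
    nonlinearity whose internal state supplies the node's dynamics."""
    def __init__(self, n_in, n_state=50, leak=0.3):
        self.w_in = rng.normal(0, 0.5, (n_state, n_in))
        w = rng.normal(0, 1.0, (n_state, n_state))
        self.w = 0.9 * w / np.max(np.abs(np.linalg.eigvals(w)))  # spectral radius < 1
        self.leak, self.x = leak, np.zeros(n_state)

    def step(self, u):
        self.x = (1 - self.leak) * self.x + self.leak * np.tanh(self.w_in @ u + self.w @ self.x)
        return self.x

# Parallel: distinct nodes see the same input; their states are concatenated.
# Series: the concatenated state feeds a further node, one layer deeper.
layer1 = [EchoNode(1), EchoNode(1), EchoNode(1)]
layer2 = EchoNode(3 * 50)

def features(signal):
    feats = []
    for u in signal:
        h = np.concatenate([n.step(np.atleast_1d(u)) for n in layer1])
        feats.append(layer2.step(h))
    return np.array(feats)

# Only the linear readout is trained, via ridge regression on a placeholder
# task; the fixed nodes supply the dimensionality and dynamics.
X, y = features(rng.normal(size=500)), rng.normal(size=500)
w_out = np.linalg.solve(X.T @ X + 1e-3 * np.eye(X.shape[1]), X.T @ y)
```

Because only the readout weights are fitted, adapting to a new task needs very little data, which is the few-shot flavour the abstract highlights.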
Feature Extractor Stacking for Cross-domain Few-shot Meta-learning
Cross-domain few-shot meta-learning (CDFSML) addresses learning problems
where knowledge needs to be transferred from several source domains into an
instance-scarce target domain with an explicitly different distribution.
Recently published CDFSML methods generally construct a universal model that
combines knowledge of multiple source domains into one backbone feature
extractor. This enables efficient inference but necessitates re-computation of
the backbone whenever a new source domain is added. Some of these methods are
also incompatible with heterogeneous source domain backbone architectures. We
propose feature extractor stacking (FES), a new CDFSML method for combining
information from a collection of backbones, which can utilise heterogeneous
pretrained backbones out of the box, and does not maintain a universal model
that needs to be re-computed when its backbone collection is updated. We
present the basic FES algorithm, which is inspired by the classic stacking
approach to meta-learning, and also introduce two variants: convolutional FES
(ConFES) and regularised FES (ReFES). Given a target-domain task, these
algorithms fine-tune each backbone independently, use cross-validation to
extract meta-training data from the support set, and learn a simple linear
meta-classifier from this data. We evaluate our FES methods on the well-known
Meta-Dataset benchmark, targeting image classification with convolutional
neural networks, and show that they can achieve state-of-the-art performance
- …
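A rough sketch of the FES recipe as the abstract describes it, assuming each (independently fine-tuned) backbone is exposed as a callable mapping images to feature vectors and using scikit-learn for both the per-backbone heads and the linear meta-classifier; the fine-tuning step itself is elided, and the ConFES/ReFES variants would modify only the meta-classifier stage.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

def fes_predict(backbones, support_x, support_y, query_x, n_folds=5):
    """FES, per the abstract: build meta-training data from the support set
    via cross-validation over each backbone's features, then fit a simple
    linear meta-classifier. Assumes k-shot support with k >= n_folds and
    integer labels 0..C-1."""
    n, n_classes = len(support_y), len(set(support_y))
    meta_train = np.zeros((n, len(backbones) * n_classes))
    skf = StratifiedKFold(n_splits=n_folds)
    for b, extract in enumerate(backbones):  # extract: images -> feature matrix
        feats = extract(support_x)
        # Out-of-fold class probabilities become this backbone's meta-features.
        for train_idx, hold_idx in skf.split(feats, support_y):
            head = LogisticRegression(max_iter=1000).fit(feats[train_idx], support_y[train_idx])
            meta_train[hold_idx, b * n_classes:(b + 1) * n_classes] = head.predict_proba(feats[hold_idx])
    meta_clf = LogisticRegression(max_iter=1000).fit(meta_train, support_y)

    # At query time, refit each head on the full support set and stack its
    # probabilities in the same column layout for the meta-classifier.
    cols = []
    for extract in backbones:
        feats, qfeats = extract(support_x), extract(query_x)
        head = LogisticRegression(max_iter=1000).fit(feats, support_y)
        cols.append(head.predict_proba(qfeats))
    return meta_clf.predict(np.hstack(cols))
```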