
    Pushing the Limits of Simple Pipelines for Few-Shot Learning: External Data and Fine-Tuning Make a Difference

    Few-shot learning (FSL) is an important and topical problem in computer vision that has motivated extensive research into numerous methods, spanning from sophisticated meta-learning approaches to simple transfer-learning baselines. We seek to push the limits of a simple-but-effective pipeline for more realistic and practical settings of few-shot image classification. To this end, we explore few-shot learning from the perspective of neural network architecture, as well as a three-stage pipeline of network updates under different data supplies: unsupervised external data is considered for pre-training, base categories are used to simulate few-shot tasks for meta-training, and the scarcely labelled data of a novel task is used for fine-tuning. We investigate questions such as: (1) How does pre-training on external data benefit FSL? (2) How can state-of-the-art transformer architectures be exploited? (3) How does fine-tuning mitigate domain shift? Ultimately, we show that a simple transformer-based pipeline yields surprisingly good performance on standard benchmarks such as Mini-ImageNet, CIFAR-FS, CDFSL and Meta-Dataset. Our code and demo are available at https://hushell.github.io/pmf. Comment: Accepted by CVPR 2022
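
    To make the "simple pipeline" concrete, below is a minimal sketch of the kind of metric-based classifier such pipelines meta-train and then fine-tune: class prototypes are averaged from the labelled support set, and queries are assigned to the nearest prototype. The feature dimensions, episode sizes and NumPy implementation are illustrative assumptions, not the authors' code (which is linked above).

        import numpy as np

        rng = np.random.default_rng(0)

        def prototypes(support_feats, support_labels, n_classes):
            # One prototype per class: the mean embedding of its support examples.
            return np.stack([support_feats[support_labels == c].mean(axis=0)
                             for c in range(n_classes)])

        def classify(query_feats, protos):
            # Assign each query to the class with the nearest (Euclidean) prototype.
            dists = ((query_feats[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
            return dists.argmin(axis=1)

        # Toy 5-way 5-shot episode with synthetic 64-d "backbone" features.
        n_way, k_shot, dim = 5, 5, 64
        class_means = rng.normal(size=(n_way, dim))
        support_y = np.repeat(np.arange(n_way), k_shot)
        support_x = class_means[support_y] + 0.3 * rng.normal(size=(n_way * k_shot, dim))
        query_y = np.repeat(np.arange(n_way), 3)
        query_x = class_means[query_y] + 0.3 * rng.normal(size=(n_way * 3, dim))

        protos = prototypes(support_x, support_y, n_way)
        acc = (classify(query_x, protos) == query_y).mean()
        print(f"toy episode accuracy: {acc:.2f}")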

    How to train your MAML

    The field of few-shot learning has recently seen substantial advancements. Most of these advancements came from casting few-shot learning as a meta-learning problem. Model-Agnostic Meta-Learning, or MAML, is currently one of the best approaches to few-shot learning via meta-learning. MAML is simple, elegant and very powerful; however, it has a variety of issues: it is very sensitive to neural network architectures, often leading to instability during training; it requires arduous hyperparameter searches to stabilize training and achieve high generalization; and it is very computationally expensive at both training and inference time. In this paper, we propose various modifications to MAML that not only stabilize the system, but also substantially improve its generalization performance and convergence speed while reducing its computational overhead; we call the resulting method MAML++. Comment: Published in ICLR 2019
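
    For reference, MAML's core is a bilevel update: adapt a copy of the meta-parameters with a gradient step on a task's support set, then update the meta-parameters against the adapted model's loss on that task's query set. The toy script below is a first-order sketch (it does not backpropagate through the inner step, which is where full MAML's second-order cost and much of its instability originate) on synthetic 1-D regression tasks; the linear model, task distribution and step sizes are illustrative assumptions, not the paper's setup.

        import numpy as np

        rng = np.random.default_rng(0)

        def sample_task():
            # A task is a random affine function y = a*x + b; returns a data sampler.
            a, b = rng.uniform(-2, 2, size=2)
            def draw(n):
                x = rng.uniform(-1, 1, size=(n, 1))
                return x, a * x + b
            return draw

        def mse_grad(w, x, y):
            # Gradient of the mean squared error for the linear model [x, 1] @ w.
            X = np.hstack([x, np.ones_like(x)])
            return 2 * X.T @ (X @ w - y) / len(x)

        w = np.zeros((2, 1))       # meta-parameters
        alpha, beta = 0.1, 0.01    # inner (adaptation) and outer (meta) step sizes

        for step in range(1000):
            meta_grad = np.zeros_like(w)
            for _ in range(4):                               # meta-batch of tasks
                draw = sample_task()
                x_s, y_s = draw(10)                          # support set
                w_task = w - alpha * mse_grad(w, x_s, y_s)   # inner adaptation step
                x_q, y_q = draw(10)                          # query set, same task
                # First-order approximation: evaluate the query gradient at the
                # adapted parameters instead of differentiating through the inner step.
                meta_grad += mse_grad(w_task, x_q, y_q)
            w -= beta * meta_grad / 4                        # outer meta-update

    Full MAML would differentiate through w_task exactly, and alpha and beta are precisely the kind of hyperparameters the paper describes as arduous to tune.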

    Neuromorphic Few-Shot Learning: Generalization in Multilayer Physical Neural Networks

    Neuromorphic computing leverages the complex dynamics of physical systems for computation. The field has recently undergone an explosion in the range and sophistication of implementations, with rapidly improving performance. Neuromorphic schemes typically employ a single physical system, limiting the dimensionality and range of available dynamics and restricting strong performance to a few specific tasks. This is a critical roadblock facing the field, inhibiting the power and versatility of neuromorphic schemes. Here, we present a solution. We engineer a diverse suite of nanomagnetic arrays and show how tuning microstate space and geometry enables a broad range of dynamics and computing performance. We interconnect arrays in parallel, series and multilayered neural network architectures, where each network node is a distinct physical system. This networked approach grants extremely high dimensionality and enriched dynamics, enabling meta-learning to be implemented on small training sets and strong performance across a broad task set. We showcase network performance via few-shot learning, rapidly adapting on the fly to previously unseen tasks.

    Feature Extractor Stacking for Cross-domain Few-shot Meta-learning

    Cross-domain few-shot meta-learning (CDFSML) addresses learning problems where knowledge needs to be transferred from several source domains into an instance-scarce target domain with an explicitly different distribution. Recently published CDFSML methods generally construct a universal model that combines the knowledge of multiple source domains into one backbone feature extractor. This enables efficient inference but necessitates re-computation of the backbone whenever a new source domain is added. Some of these methods are also incompatible with heterogeneous source-domain backbone architectures. We propose feature extractor stacking (FES), a new CDFSML method for combining information from a collection of backbones, which can utilise heterogeneous pretrained backbones out of the box and does not maintain a universal model that needs to be re-computed when its backbone collection is updated. We present the basic FES algorithm, which is inspired by the classic stacking approach to meta-learning, and also introduce two variants: convolutional FES (ConFES) and regularised FES (ReFES). Given a target-domain task, these algorithms fine-tune each backbone independently, use cross-validation to extract meta-training data from the support set, and learn a simple linear meta-classifier from this data. We evaluate our FES methods on the well-known Meta-Dataset benchmark, targeting image classification with convolutional neural networks, and show that they can achieve state-of-the-art performance.
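
    The stacking recipe above can be sketched in a few lines: obtain out-of-fold class probabilities from a per-backbone classifier via cross-validation on the support set, concatenate them into meta-features, and fit a simple linear meta-classifier on top. The synthetic features, scikit-learn estimators and fold count below are illustrative assumptions standing in for the paper's fine-tuned backbones.

        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import cross_val_predict

        rng = np.random.default_rng(0)

        # Toy 5-way 5-shot support set; two "backbones" with different feature
        # sizes stand in for heterogeneous pretrained feature extractors.
        n_way, k_shot = 5, 5
        y = np.repeat(np.arange(n_way), k_shot)
        backbone_feats = []
        for dim in (16, 32):
            means = rng.normal(size=(n_way, dim))
            backbone_feats.append(means[y] + 0.5 * rng.normal(size=(len(y), dim)))

        # Classic stacking: out-of-fold class probabilities from each backbone's
        # classifier become the meta-features, avoiding leakage on the support set.
        meta_features = [
            cross_val_predict(LogisticRegression(max_iter=1000), X, y,
                              cv=5, method="predict_proba")
            for X in backbone_feats
        ]
        meta_X = np.hstack(meta_features)  # shape: (25, n_backbones * n_way)

        # A simple linear meta-classifier learns how much to trust each backbone.
        meta_clf = LogisticRegression(max_iter=1000).fit(meta_X, y)
        # Support-set fit quality, for illustration only (not a generalization test).
        print("meta-classifier support accuracy:", meta_clf.score(meta_X, y))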