Adaptive Parametric Prototype Learning for Cross-Domain Few-Shot Classification
Cross-domain few-shot classification poses a much more challenging problem
than its in-domain counterpart due to the existence of domain shifts between
the training and test tasks. In this paper, we develop a novel Adaptive
Parametric Prototype Learning (APPL) method under the meta-learning convention
for cross-domain few-shot classification. Unlike existing prototypical
few-shot methods that use the averages of support instances to calculate the
class prototypes, we propose to learn class prototypes from the concatenated
features of the support set in a parametric fashion and meta-learn the model by
enforcing prototype-based regularization on the query set. In addition, we
fine-tune the model in the target domain in a transductive manner using a
weighted-moving-average self-training approach on the query instances. We
conduct experiments on multiple cross-domain few-shot benchmark datasets. The
empirical results demonstrate that APPL outperforms many
state-of-the-art cross-domain few-shot learning methods.
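
For orientation, the conventional baseline that the abstract above contrasts with (class prototypes computed as averages of support instances, followed by nearest-prototype classification) can be sketched as follows. This is only a minimal illustration of that averaging baseline, not of APPL's parametric prototype learner; the function names `mean_prototypes` and `classify`, the toy 2-D features, and the choice of Euclidean distance are assumptions made for the example.

```python
import math
from collections import defaultdict


def mean_prototypes(support_features, support_labels):
    """Average the support feature vectors of each class (baseline prototypes)."""
    grouped = defaultdict(list)
    for feat, label in zip(support_features, support_labels):
        grouped[label].append(feat)
    # zip(*feats) iterates over feature dimensions across a class's instances.
    return {
        label: [sum(dim) / len(feats) for dim in zip(*feats)]
        for label, feats in grouped.items()
    }


def classify(query, prototypes):
    """Return the label whose prototype is nearest (Euclidean, an assumption)."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


# Toy 2-way 2-shot episode with hypothetical 2-D features.
support = [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.0, 0.8]]
labels = ["a", "a", "b", "b"]
prototypes = mean_prototypes(support, labels)
prediction = classify([0.9, 0.9], prototypes)
```

A parametric approach such as the one described in the abstract would replace `mean_prototypes` with a learned function of the concatenated support features; the abstract does not give enough detail to sketch that function here.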
Task-aware Adaptive Learning for Cross-domain Few-shot Learning
Although existing few-shot learning works yield promising results for in-domain queries, they still suffer from weak cross-domain generalization. Limited support data requires effective knowledge transfer, but domain shift makes this harder. To address this emerging challenge, researchers have improved adaptation by introducing task-specific parameters, which are directly optimized and estimated for each task. However, adding a fixed number of additional parameters fails to account for the diverse domain shifts between target tasks and the source domain, which limits efficacy. In this paper, we first observe that the task-specific parameter configuration depends on the target task. Too many task-specific parameters may cause over-fitting, and too few may result in under-adaptation, but the optimal configuration varies across test tasks. Based on these findings, we propose the Task-aware Adaptive Network (TA2-Net), which is trained by reinforcement learning to adaptively estimate the optimal task-specific parameter configuration for each test task. It learns, for example, that tasks with significant domain shift usually need more task-specific parameters for adaptation. We evaluate our model on Meta-Dataset. Empirical results show that our model outperforms existing state-of-the-art methods.
APP: Adaptive Prototypical Pseudo-Labeling for Few-shot OOD Detection
Detecting out-of-domain (OOD) intents from user queries is essential for a
task-oriented dialogue system. Previous OOD detection studies generally work on
the assumption that plenty of labeled in-domain (IND) intents exist. In this paper, we
focus on a more practical few-shot OOD setting where there are only a few
labeled IND samples and a large amount of unlabeled mixed data that may belong to IND or
OOD. The new scenario carries two key challenges: learning discriminative
representations using limited IND data and leveraging unlabeled mixed data.
Therefore, we propose an adaptive prototypical pseudo-labeling (APP) method for
few-shot OOD detection, including a prototypical OOD detection framework
(ProtoOOD) to facilitate low-resource OOD detection using limited IND data, and
an adaptive pseudo-labeling method to produce high-quality pseudo OOD and IND
labels. Extensive experiments and analysis demonstrate the effectiveness of our
method for few-shot OOD detection.
Meta-Learning Triplet Network with Adaptive Margins for Few-Shot Named Entity Recognition
Meta-learning methods have been widely used in few-shot named entity
recognition (NER), especially prototype-based methods. However, the Other (O)
class is difficult to represent with a prototype vector because it generally
contains a large number of samples with miscellaneous semantics. To solve this
problem, we propose MeTNet, which generates prototype vectors only for entity
types and not for the O class. We design an improved triplet network to map
samples and prototype vectors into a low-dimensional space where they are
easier to classify, and propose an adaptive margin for each entity type.
The margin acts as a radius and controls a region with adaptive size in the
low-dimensional space. Based on the regions, we propose a new inference
procedure to predict the label of a query instance. We conduct extensive
experiments in both in-domain and cross-domain settings to show the superiority
of MeTNet over other state-of-the-art methods. In particular, we release a
Chinese few-shot NER dataset FEW-COMM extracted from a well-known e-commerce
platform. To the best of our knowledge, this is the first Chinese few-shot NER
dataset. All datasets and code are available at
https://github.com/hccngu/MeTNet
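
The idea of a per-type margin acting as a radius can be illustrated with a small sketch. The abstract does not spell out the inference procedure, so the rule below is only one plausible reading: a query is assigned to the nearest entity-type prototype whose region contains it, and falls back to the O class when it lies outside every region. The function name `predict_with_margins`, the toy prototypes and margins, and the use of Euclidean distance are all assumptions for illustration, not MeTNet's actual implementation.

```python
import math


def predict_with_margins(query, prototypes, margins, other_label="O"):
    """Assign the nearest entity type whose margin (radius) covers the query.

    Hypothetical rule: if the query lies outside every region, return "O".
    """
    best_label, best_dist = other_label, math.inf
    for label, proto in prototypes.items():
        dist = math.dist(query, proto)
        # The query must fall inside this type's region and beat earlier matches.
        if dist <= margins[label] and dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Toy embeddings: two entity types with different (hypothetical) radii.
prototypes = {"PER": [0.0, 0.0], "LOC": [3.0, 0.0]}
margins = {"PER": 1.0, "LOC": 0.5}

inside_per = predict_with_margins([0.5, 0.0], prototypes, margins)
inside_loc = predict_with_margins([3.2, 0.0], prototypes, margins)
outside_all = predict_with_margins([1.8, 0.0], prototypes, margins)
```

Under this reading, no prototype is needed for the O class: it is simply the region not covered by any entity type.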