MICK: A Meta-Learning Framework for Few-shot Relation Classification with Small Training Data
Few-shot relation classification seeks to classify incoming query instances
after observing only a few support instances. This ability is typically gained
by training with large amounts of in-domain annotated data. In this paper, we tackle an even
harder problem by further limiting the amount of data available at training
time. We propose a few-shot learning framework for relation classification,
which is particularly powerful when the training data is very small. In this
framework, models not only strive to classify query instances, but also seek
underlying knowledge about the support instances to obtain better instance
representations. The framework also includes a method for aggregating
cross-domain knowledge into models by open-source task enrichment.
Additionally, we construct a brand-new dataset, TinyRel-CM: a
few-shot relation classification dataset in the health domain with deliberately small
training data and challenging relation classes. Experimental results
demonstrate that our framework brings performance gains for most underlying
classification models, outperforms the state-of-the-art results given small
training data, and achieves competitive results with sufficiently large
training data.
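To make the episodic setup behind such frameworks concrete, here is a minimal sketch of one few-shot relation classification episode in a prototypical-network style. The encoder, class counts, and tensor shapes are illustrative assumptions, not the MICK architecture itself.

```python
# A minimal N-way K-shot episode, prototypical-network style.
# Encoder, shapes, and sizes are illustrative assumptions, not MICK.
import torch
import torch.nn.functional as F

N_WAY, K_SHOT, N_QUERY, DIM = 5, 3, 4, 64
encoder = torch.nn.Linear(300, DIM)   # stand-in for a sentence encoder

# Pre-embedded toy instances: (class, shot, input_dim).
support = torch.randn(N_WAY, K_SHOT, 300)
query = torch.randn(N_WAY * N_QUERY, 300)
query_labels = torch.arange(N_WAY).repeat_interleave(N_QUERY)

# Class prototypes: mean of the encoded support instances per relation.
prototypes = encoder(support).mean(dim=1)          # (N_WAY, DIM)

# Classify queries by negative Euclidean distance to each prototype.
logits = -torch.cdist(encoder(query), prototypes)  # (N*Q, N_WAY)
loss = F.cross_entropy(logits, query_labels)
loss.backward()
print(f"episode loss: {loss.item():.3f}")
```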
Learning to Learn to Disambiguate: Meta-Learning for Few-Shot Word Sense Disambiguation
The success of deep learning methods hinges on the availability of large
training datasets annotated for the task of interest. In contrast to human
intelligence, these methods lack versatility and struggle to learn and adapt
quickly to new tasks, where labeled data is scarce. Meta-learning aims to solve
this problem by training a model on a large number of few-shot tasks, with an
objective to learn new tasks quickly from a small number of examples. In this
paper, we propose a meta-learning framework for few-shot word sense
disambiguation (WSD), where the goal is to learn to disambiguate unseen words
from only a few labeled instances. Meta-learning approaches have so far
typically been tested in an N-way, K-shot classification setting, where each task
has N classes with K examples per class. Owing to its nature, WSD deviates
from this controlled setup and requires the models to handle a large number of
highly unbalanced classes. We extend several popular meta-learning approaches
to this scenario, and analyze their strengths and weaknesses in this new
challenging setting.
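The mismatch with the balanced N-way, K-shot setup is easiest to see in how a WSD episode would be sampled: each task is one target word, its classes are that word's senses, and the per-sense counts are whatever the corpus happens to contain. A hypothetical sampler, with an assumed data layout:

```python
# Hypothetical episode sampler for few-shot WSD: each task is one word,
# and its classes (senses) are naturally unbalanced, unlike a fixed
# N-way, K-shot setup. The data layout is an illustrative assumption.
import random
from collections import defaultdict

def sample_wsd_episode(instances, max_support_per_sense=4):
    """instances: list of (sentence, sense_id) pairs for one target word."""
    by_sense = defaultdict(list)
    for sent, sense in instances:
        by_sense[sense].append(sent)
    support, query = [], []
    for sense, sents in by_sense.items():
        random.shuffle(sents)
        # Support size varies per sense: rare senses contribute fewer shots.
        k = min(max_support_per_sense, max(1, len(sents) // 2))
        support += [(s, sense) for s in sents[:k]]
        query += [(s, sense) for s in sents[k:]]
    return support, query

# Toy task for "bank": three instances of one sense, one of another.
task = [("river bank", 0), ("bank erosion", 0), ("steep bank", 0),
        ("bank loan", 1)]
sup, qry = sample_wsd_episode(task)
print(len(sup), "support /", len(qry), "query")
```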
RAPS: A Novel Few-Shot Relation Extraction Pipeline with Query-Information Guided Attention and Adaptive Prototype Fusion
Few-shot relation extraction (FSRE) aims at recognizing unseen relations by
learning with merely a handful of annotated instances. To generalize to new
relations more effectively, this paper proposes a novel pipeline for the FSRE
task based on queRy-information guided Attention and adaptive Prototype fuSion,
namely RAPS. Specifically, RAPS first derives the relation prototype by the
query-information guided attention module, which exploits rich interactive
information between the support instances and the query instances, in order to
obtain more accurate initial prototype representations. Then RAPS elaborately
combines the derived initial prototype with the relation information by the
adaptive prototype fusion mechanism to obtain the integrated prototype for both
training and prediction. Experiments on the benchmark dataset FewRel 1.0 show a
significant improvement of our method over state-of-the-art methods.
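A rough sketch of the two ideas as the abstract describes them: support instances are weighted by their affinity to the query set before being pooled into a prototype, and a learned gate then fuses that prototype with a relation-description embedding. All modules and shapes below are illustrative assumptions, not the RAPS implementation.

```python
# Sketch: (1) query-information guided attention over support instances,
# (2) gated fusion of the prototype with relation information.
# Modules and shapes are illustrative assumptions, not RAPS itself.
import torch

K, Q, D = 5, 4, 64                    # shots, queries, hidden size
support = torch.randn(K, D)           # encoded support set (one relation)
queries = torch.randn(Q, D)           # encoded query set
rel_desc = torch.randn(D)             # encoded relation name/description

# (1) Weight each support instance by its mean affinity to the queries,
# then pool into an initial prototype.
scores = support @ queries.t() / D ** 0.5          # (K, Q)
weights = scores.mean(dim=1).softmax(dim=0)        # (K,)
init_proto = (weights.unsqueeze(1) * support).sum(dim=0)

# (2) Adaptive fusion: a learned gate decides, per dimension, how much
# relation information to mix into the instance-based prototype.
gate_layer = torch.nn.Linear(2 * D, D)
gate = torch.sigmoid(gate_layer(torch.cat([init_proto, rel_desc])))
proto = gate * init_proto + (1 - gate) * rel_desc
print(proto.shape)                                 # torch.Size([64])
```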
MsPrompt: Multi-step Prompt Learning for Debiasing Few-shot Event Detection
Event detection (ED) aims to identify the key trigger words in
unstructured text and to predict the corresponding event types. Traditional ED
models are too data-hungry to accommodate real applications with scarce labeled
data. Moreover, typical ED models face context-bypassing and poor
generalization caused by the trigger bias present in ED datasets.
We therefore focus on the true few-shot paradigm to satisfy low-resource
scenarios. In particular, we propose a multi-step prompt learning model
(MsPrompt) for debiasing few-shot event detection, which consists of three
components: an under-sampling module that constructs a training set matching
the true few-shot setting, a multi-step prompt module equipped with a
knowledge-enhanced ontology to fully leverage event semantics and the latent
prior knowledge in pre-trained language models (PLMs), thereby tackling the
context-bypassing problem, and a prototypical module that compensates for the
weakness of classifying events from sparse data and boosts generalization.
Experiments on two public datasets, ACE-2005 and FewEvent, show that
MsPrompt outperforms state-of-the-art models, especially in strict
low-resource scenarios, with an 11.43% improvement in weighted F1-score over
the best-performing baseline and strong debiasing performance.
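Of the three components, the under-sampling step is the simplest to picture: cap the number of instances per event type so frequent types cannot dominate training. A hypothetical version, with assumed field names:

```python
# Hypothetical under-sampling step for a "true few-shot" training set:
# keep at most k instances per event type. Field names and k are
# illustrative assumptions, not the MsPrompt implementation.
import random
from collections import defaultdict

def undersample(examples, k=5, seed=13):
    """examples: list of dicts, each carrying an 'event_type' key."""
    rng = random.Random(seed)
    by_type = defaultdict(list)
    for ex in examples:
        by_type[ex["event_type"]].append(ex)
    few_shot = []
    for _etype, exs in by_type.items():
        rng.shuffle(exs)
        few_shot += exs[:k]           # at most k shots per event type
    rng.shuffle(few_shot)
    return few_shot

corpus = [{"text": f"sentence {i}", "event_type": t}
          for t in ("Attack", "Meet", "Transport") for i in range(20)]
print(len(undersample(corpus, k=5)))  # 15 = 3 types x 5 shots
```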
A Survey on Recent Named Entity Recognition and Relation Classification Methods with Focus on Few-Shot Learning Approaches
Named entity recognition and relation classification are key stages for
extracting information from unstructured text. Several natural language
processing applications utilize the two tasks, such as information retrieval,
knowledge graph construction and completion, question answering and other
domain-specific applications, such as biomedical data mining. We present a
survey of recent approaches to the two tasks, with a focus on few-shot learning
methods. Our work compares the main approaches followed in the two tasks.
Additionally, we report the latest metric scores in both tasks, with a
structured analysis that situates the results within the few-shot learning
scope.
Meta-Learning with Variational Semantic Memory for Word Sense Disambiguation
A critical challenge faced by supervised word sense disambiguation (WSD) is
the lack of large annotated datasets with sufficient coverage of words in their
diversity of senses. This inspired recent research on few-shot WSD using
meta-learning. While such work has successfully applied meta-learning to learn
new word senses from very few examples, its performance still lags behind its
fully supervised counterpart. Aiming to further close this gap, we propose a
model of semantic memory for WSD in a meta-learning setting. Semantic memory
encapsulates prior experiences seen throughout the lifetime of the model, which
aids better generalization in limited data settings. Our model is based on
hierarchical variational inference and incorporates an adaptive memory update
rule via a hypernetwork. We show our model advances the state of the art in
few-shot WSD, supports effective learning in extremely data-scarce (e.g.,
one-shot) scenarios, and produces meaning prototypes that capture similar
senses of distinct words.
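A loose sketch of one ingredient, the adaptive memory update via a hypernetwork: a small network reads the stored memory and the current episode's prototype and emits a per-dimension update rate. This is an illustrative assumption about the mechanism, not the paper's variational formulation.

```python
# Loose sketch of an adaptive memory write controlled by a hypernetwork.
# Sizes and the gating form are illustrative assumptions, not the
# paper's hierarchical variational inference.
import torch

D = 64
hyper = torch.nn.Sequential(
    torch.nn.Linear(2 * D, D),
    torch.nn.Sigmoid(),               # per-dimension update rate in (0, 1)
)

memory = torch.zeros(D)               # stored semantic memory for a sense
new_proto = torch.randn(D)            # prototype from the current episode

rate = hyper(torch.cat([memory, new_proto]))
memory = (1 - rate) * memory + rate * new_proto   # adaptive write
print(f"memory norm after update: {memory.norm().item():.3f}")
```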