Learning to Select from Multiple Options
Many NLP tasks can be regarded as selection problems over a set of options, e.g., classification and multi-choice question answering. Textual entailment (TE) has been shown to be a state-of-the-art (SOTA) approach to such selection problems: TE treats the input text as a premise (P) and each option as a hypothesis (H), then handles selection by modeling each (P, H) pair independently. This pairwise formulation has two limitations: first, it is unaware of the other options, which is counterintuitive since humans typically determine the best option by comparing competing candidates; second, pairwise inference is time-consuming, especially when the option space is large. To address these two issues, this work first proposes a contextualized TE model (Context-TE) that appends the k other options as context to the current (P, H) pair. Context-TE learns a more reliable decision for each H because it takes the competing candidates into account. Second, we speed up Context-TE with Parallel-TE, which learns the decisions for multiple options simultaneously. Parallel-TE significantly improves inference speed while keeping performance comparable to Context-TE. Our methods are evaluated on three tasks (ultra-fine entity typing, intent detection, and multi-choice QA) that are typical selection problems with different option-space sizes. Experiments show our models set new SOTA performance; in particular, Parallel-TE is k times faster than pairwise TE at inference. Our code is publicly available at
https://github.com/jiangshdd/LearningToSelect.
Comment: Accepted by AAAI 2023
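The pairwise-vs-contextualized distinction is easy to picture with an off-the-shelf NLI model. The sketch below is not the authors' released code (see the repo above); the roberta-large-mnli checkpoint and the hypothesis templates are illustrative assumptions, chosen only to show how the k competing options can be exposed to each (P, H) decision:

```python
# Minimal sketch contrasting pairwise TE with the Context-TE input idea.
# NOT the paper's implementation; checkpoint and templates are assumptions.
from transformers import pipeline

nli = pipeline("text-classification", model="roberta-large-mnli")

premise = "He served two terms in the White House."
options = ["a president", "an athlete", "a musician"]

# Pairwise TE: each (P, H) pair is scored independently, blind to the
# competing candidates.
for opt in options:
    hypothesis = f"This text is about {opt}."
    print(opt, nli({"text": premise, "text_pair": hypothesis}))

# Context-TE: append the k other options to each hypothesis so the model
# can weigh the current option against its competitors.
for opt in options:
    others = " and ".join(o for o in options if o != opt)
    hypothesis = f"This text is about {opt}, rather than {others}."
    print(opt, nli({"text": premise, "text_pair": hypothesis}))
```

Per the abstract, Parallel-TE then goes one step further: instead of running one contextualized pass per option, it predicts the decisions for all options in a single pass, which is where the k-fold inference speedup comes from.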
All Labels Together: Low-shot Intent Detection with an Efficient Label Semantic Encoding Paradigm
In intent detection tasks, leveraging meaningful semantic information from intent labels can be particularly beneficial in few-shot scenarios. However, existing few-shot intent detection methods either ignore the intent labels (e.g., treating intents as indices) or do not fully utilize the label information (e.g., using only part of each intent label). In this work, we present an end-to-end One-to-All system that compares an input utterance with all label candidates at once and can thereby fully utilize label semantics. Experiments on three few-shot intent detection tasks demonstrate that
One-to-All is especially effective when the training resource is extremely
scarce, achieving state-of-the-art performance in 1-, 3- and 5-shot settings.
Moreover, we present a novel pretraining strategy for our model that utilizes
indirect supervision from paraphrasing, enabling zero-shot cross-domain
generalization on intent detection tasks. Our code is at
https://github.com/jiangshdd/AllLablesTogether.
Comment: Accepted by IJCNLP-AACL 2023
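As a rough illustration of the One-to-All idea, the utterance and every candidate label can be encoded in a single sequence, so each label's representation is contextualized by the utterance and by the competing labels. This is a minimal sketch, not the released implementation; the roberta-base encoder, the separator-based input template, and the similarity-based scoring head are all assumptions for illustration:

```python
# Minimal sketch: encode an utterance together with ALL intent labels in
# one forward pass. Checkpoint, template, and scoring are assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("roberta-base")
enc = AutoModel.from_pretrained("roberta-base")

utterance = "please move ten dollars to my savings account"
labels = ["transfer money", "check balance", "freeze card"]

# One sequence: utterance </s> label_1 </s> label_2 </s> label_3
text = utterance + "".join(f" {tok.sep_token} {lab}" for lab in labels)
batch = tok(text, return_tensors="pt")

with torch.no_grad():
    hidden = enc(**batch).last_hidden_state[0]  # (seq_len, dim)

# Locate the separators; label i sits between sep i and sep i+1
# (the tokenizer appends a final </s>, closing the last span).
ids = batch["input_ids"][0]
seps = (ids == tok.sep_token_id).nonzero().squeeze(-1).tolist()
spans = list(zip(seps[:-1], seps[1:]))

# Illustrative scoring: mean-pool each label span and compare it with the
# <s> (utterance-level) representation; a trained head would replace this.
label_reprs = torch.stack([hidden[s + 1 : e].mean(0) for s, e in spans])
scores = label_reprs @ hidden[0]
print(labels[int(scores.argmax())])
```

Because all labels share one forward pass, every label's semantics can inform the decision, in contrast to methods that score each intent in isolation or reduce intents to indices.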