TART: Improved Few-shot Text Classification Using Task-Adaptive Reference Transformation
Meta-learning has emerged as a trending technique to tackle few-shot text
classification and achieve state-of-the-art performance. However, the
performance of existing approaches heavily depends on the inter-class variance
of the support set. As a result, they perform well on tasks where the
semantics of the sampled classes are distinct but fail to differentiate
classes with similar semantics. In this paper, we propose a novel Task-Adaptive
Reference Transformation (TART) network, aiming to enhance generalization
by transforming the class prototypes to per-class fixed reference points in
task-adaptive metric spaces. To further maximize divergence between transformed
prototypes in task-adaptive metric spaces, TART introduces a discriminative
reference regularization among transformed prototypes. Extensive experiments
are conducted on four benchmark datasets and our method demonstrates clear
superiority over the state-of-the-art models on all of them. In
particular, our model surpasses the state-of-the-art method by 7.4% and 5.4% in
1-shot and 5-shot classification on the 20 Newsgroups dataset, respectively.
Comment: 11 pages, 5 figures. Accepted by ACL 202
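The abstract above builds on class prototypes compared in a metric space. As a rough illustration of that underlying idea only (not of TART's reference transformation or its regularizer, which the abstract does not specify), the sketch below assumes a prototype is the mean of a class's support embeddings and that queries are assigned to the nearest prototype by Euclidean distance. The function names `class_prototypes` and `classify` and the toy 2-D embeddings are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def class_prototypes(support):
    """Average each class's support embeddings into one prototype.

    support: dict mapping a label to a list of equal-length float vectors.
    """
    prototypes = {}
    for label, vectors in support.items():
        count = len(vectors)
        # zip(*vectors) iterates over embedding dimensions.
        prototypes[label] = [sum(column) / count for column in zip(*vectors)]
    return prototypes


def classify(query, prototypes):
    """Return the label whose prototype is closest to the query embedding."""
    return min(prototypes, key=lambda label: dist(query, prototypes[label]))


# Toy 2-way 2-shot task with made-up embeddings (assumption for illustration).
support = {
    "sports": [[1.0, 0.0], [0.8, 0.2]],
    "politics": [[0.0, 1.0], [0.2, 0.8]],
}
prototypes = class_prototypes(support)
prediction = classify([0.9, 0.3], prototypes)
```

When the sampled classes are semantically close, their prototypes sit near each other and the nearest-prototype decision becomes fragile, which is the weakness the abstract attributes to existing approaches.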
Adaptive Meta-learner via Gradient Similarity for Few-shot Text Classification
Few-shot text classification aims to classify text in the few-shot
scenario. Most previous methods adopt optimization-based meta-learning
to obtain the task distribution. However, because they neglect both the matching
between the small number of samples and complicated models and the distinction
between useful and useless task features, these methods suffer from
overfitting. To address this issue, we propose a novel Adaptive
Meta-learner via Gradient Similarity (AMGS) method to improve the model's
ability to generalize to a new task. Specifically, the proposed AMGS
alleviates overfitting in two ways: (i) acquiring the potential
semantic representation of samples and improving model generalization through
a self-supervised auxiliary task in the inner loop, and (ii) leveraging an
adaptive meta-learner via gradient similarity to add constraints on the
gradient obtained by the base-learner in the outer loop. Moreover, we make a
systematic analysis of the influence of regularization on the entire framework.
Experimental results on several benchmarks demonstrate that the proposed AMGS
consistently improves few-shot text classification performance compared with
the state-of-the-art optimization-based meta-learning approaches.
Comment: COLING 202
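The abstract does not say how gradient similarity is measured or how it constrains the base-learner's gradient. Purely as an illustration, the sketch below assumes cosine similarity between two gradient vectors and a hypothetical gating rule that scales a task gradient by its non-negative similarity to a reference gradient. Both `cosine_similarity` and `gate_gradient` are illustrative names, not part of AMGS.

```python
from math import sqrt


def cosine_similarity(grad_a, grad_b):
    """Cosine of the angle between two gradient vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(grad_a, grad_b))
    norm_a = sqrt(sum(a * a for a in grad_a))
    norm_b = sqrt(sum(b * b for b in grad_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def gate_gradient(task_grad, reference_grad):
    """Hypothetical constraint: down-weight gradients that disagree with a reference.

    The weight is the cosine similarity clipped at zero, so a gradient that
    points against the reference direction is suppressed entirely.
    """
    weight = max(0.0, cosine_similarity(task_grad, reference_grad))
    return [weight * g for g in task_grad]


aligned = gate_gradient([1.0, 2.0], [2.0, 4.0])    # same direction, kept
opposed = gate_gradient([1.0, 0.0], [-1.0, 0.0])   # opposite direction, zeroed
```

Other choices (projecting out the conflicting component, or learning the weight) would fit the same description; the clipping rule here is only the simplest one to read.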
Retrieval-Augmented Meta Learning for Low-Resource Text Classification
Meta-learning has achieved promising performance in low-resource text
classification, which aims to identify target classes with knowledge transferred
from source classes through sets of small tasks named episodes. However, due to
the limited training data in the meta-learning scenario and the inherent
properties of parameterized neural networks, poor generalization performance
has become a pressing problem that needs to be addressed. To deal with this
issue, we propose a meta-learning based method called Retrieval-Augmented Meta
Learning (RAML). It not only uses parameterization for inference but also
retrieves non-parametric knowledge from an external corpus to make inferences,
which greatly alleviates the problem of poor generalization performance caused
by the lack of diverse training data in meta-learning. This method differs from
previous models that solely rely on parameters, as it explicitly emphasizes the
importance of non-parametric knowledge, aiming to strike a balance between
parameterized neural networks and non-parametric knowledge. The model is
required to determine which knowledge to access and utilize during inference.
Additionally, our multi-view passages fusion network module can effectively and
efficiently integrate the retrieved information into the low-resource
classification task. Extensive experiments demonstrate that RAML
significantly outperforms current SOTA low-resource text classification models.
Comment: Under Revie
Integrating Across Conceptual Spaces
It has been shown that structure is shared across multiple modalities in the real world: if we speak about two items in similar ways, then they are also likely to appear in similar visual contexts. Such similarity relationships are recapitulated across modalities for entire systems of concepts. This provides a signal that can be used to identify the correct mapping between modalities without relying on event-based learning, by a process of systems alignment. Because it depends on relationships within a modality, systems alignment can operate asynchronously, meaning that learning may not require direct labelling events (e.g., seeing a truck and hearing someone say the word ‘truck’). Instead, learning can occur based on linguistic and visual information which is received at different points in time (e.g., having overheard a conversation about trucks, and seeing one on the road the next day).
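The idea that within-modality similarity structure can reveal the cross-modal mapping can be sketched with a toy example. The code below is an assumption-laden illustration, not the thesis's method: it scores every candidate mapping by the squared difference between pairwise similarities in the two systems and picks the best one by brute force, which is only feasible for a handful of concepts. The names `alignment_score` and `best_alignment` and the similarity values are made up.

```python
from itertools import permutations


def alignment_score(sim_a, sim_b, mapping):
    """Negative squared mismatch between pairwise similarities under a mapping.

    mapping[i] is the index in system B proposed for concept i of system A.
    A score of 0 means the two similarity structures match exactly.
    """
    n = len(sim_a)
    return -sum(
        (sim_a[i][j] - sim_b[mapping[i]][mapping[j]]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )


def best_alignment(sim_a, sim_b):
    """Try every one-to-one mapping and keep the best-scoring one."""
    n = len(sim_a)
    return max(
        permutations(range(n)),
        key=lambda mapping: alignment_score(sim_a, sim_b, mapping),
    )


# Pairwise similarities among three concepts in a "language" system.
language = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
# The same concepts in a "vision" system, stored in a different order.
vision = [
    [1.0, 0.1, 0.2],
    [0.1, 1.0, 0.9],
    [0.2, 0.9, 1.0],
]
mapping = best_alignment(language, vision)
```

No paired labelling events are used here: the mapping is recovered from the two similarity matrices alone, which is why such alignment can in principle operate on information received at different times.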
This thesis explores the value of alignment in learning to integrate between conceptual systems. It takes a joint experimental and computational approach, which simultaneously facilitates insights on alignment processes in controlled environments and at scale.
The role of alignment in learning is explored from three perspectives, yielding three distinct contributions. In Chapter 2, signatures of alignment are identified in a real-world setting: children’s early concept learning. Moving to a controlled experimental setting, Chapter 3 demonstrates that humans benefit from alignment signals in cross-system learning, and finds that models which attempt the asynchronous alignment of systems best capture human behaviour. Chapter 4 implements these insights in machine-learning systems, using alignment to tackle cross-modal learning problems at scale.
Alignment processes prove valuable to human learning across conceptual systems, providing a fresh perspective on learning that complements prevailing event-based accounts. This research opens doors for machine learning systems to harness alignment mechanisms for cross-modal learning, thus reducing their reliance on extensive supervision by drawing inspiration from both human learning and the structure of the environment.