Search CORE

1,235 research outputs found

TART: Improved Few-shot Text Classification Using Task-Adaptive Reference Transformation

Author: Chen Fanglan
He Jianfeng
Lei Shuo
Lu Chang-Tien
Zhang Xuchao
Publication venue
Publication date: 03/06/2023
Field of study

Meta-learning has emerged as a trending technique to tackle few-shot text classification and achieve state-of-the-art performance. However, the performance of existing approaches heavily depends on the inter-class variance of the support set. As a result, it can perform well on tasks when the semantics of sampled classes are distinct while failing to differentiate classes with similar semantics. In this paper, we propose a novel Task-Adaptive Reference Transformation (TART) network, aiming to enhance the generalization by transforming the class prototypes to per-class fixed reference points in task-adaptive metric spaces. To further maximize divergence between transformed prototypes in task-adaptive metric spaces, TART introduces a discriminative reference regularization among transformed prototypes. Extensive experiments are conducted on four benchmark datasets and our method demonstrates clear superiority over the state-of-the-art models in all the datasets. In particular, our model surpasses the state-of-the-art method by 7.4% and 5.4% in 1-shot and 5-shot classification on the 20 Newsgroups dataset, respectively.Comment: 11 pages, 5 figures. Accepted by ACL 202

arXiv.org e-Print Archive

Adaptive Meta-learner via Gradient Similarity for Few-shot Text Classification

Author: Hu Honghui
Lei Tianyi
Luo Qiaoyang
Peng Dezhong
Wang Xu
Publication venue
Publication date: 28/07/2023
Field of study

Few-shot text classification aims to classify the text under the few-shot scenario. Most of the previous methods adopt optimization-based meta learning to obtain task distribution. However, due to the neglect of matching between the few amount of samples and complicated models, as well as the distinction between useful and useless task features, these methods suffer from the overfitting issue. To address this issue, we propose a novel Adaptive Meta-learner via Gradient Similarity (AMGS) method to improve the model generalization ability to a new task. Specifically, the proposed AMGS alleviates the overfitting based on two aspects: (i) acquiring the potential semantic representation of samples and improving model generalization through the self-supervised auxiliary task in the inner loop, (ii) leveraging the adaptive meta-learner via gradient similarity to add constraints on the gradient obtained by base-learner in the outer loop. Moreover, we make a systematic analysis of the influence of regularization on the entire framework. Experimental results on several benchmarks demonstrate that the proposed AMGS consistently improves few-shot text classification performance compared with the state-of-the-art optimization-based meta-learning approaches.Comment: COLING 202

arXiv.org e-Print Archive

Retrieval-Augmented Meta Learning for Low-Resource Text Classification

Author: Li Rongsheng
Li Yangning
Li Yinghui
Luoyiching Chaiyut
Su Hanjing
Zheng Hai-Tao
Zhou Nannan
Publication venue
Publication date: 10/09/2023
Field of study

Meta learning have achieved promising performance in low-resource text classification which aims to identify target classes with knowledge transferred from source classes with sets of small tasks named episodes. However, due to the limited training data in the meta-learning scenario and the inherent properties of parameterized neural networks, poor generalization performance has become a pressing problem that needs to be addressed. To deal with this issue, we propose a meta-learning based method called Retrieval-Augmented Meta Learning(RAML). It not only uses parameterization for inference but also retrieves non-parametric knowledge from an external corpus to make inferences, which greatly alleviates the problem of poor generalization performance caused by the lack of diverse training data in meta-learning. This method differs from previous models that solely rely on parameters, as it explicitly emphasizes the importance of non-parametric knowledge, aiming to strike a balance between parameterized neural networks and non-parametric knowledge. The model is required to determine which knowledge to access and utilize during inference. Additionally, our multi-view passages fusion network module can effectively and efficiently integrate the retrieved information into low-resource classification task. The extensive experiments demonstrate that RAML significantly outperforms current SOTA low-resource text classification models.Comment: Under Revie

arXiv.org e-Print Archive

Integrating Across Conceptual Spaces

Author: Aho Kaarina Terese
Publication venue: UCL (University College London)
Publication date: 28/10/2023
Field of study

It has been shown that structure is shared across multiple modalities in the real world: if we speak about two items in similar ways, then they are also likely to appear in similar visual contexts. Such similarity relationships are recapitulated across modalities for entire systems of concepts. This provides a signal that can be used to identify the correct mapping between modalities without relying on event-based learning, by a process of systems alignment. Because it depends on relationships within a modality, systems alignment can operate asynchronously, meaning that learning may not require direct labelling events (e.g., seeing a truck and hearing someone say the word ‘truck’). Instead, learning can occur based on linguistic and visual information which is received at different points in time (e.g., having overheard a conversation about trucks, and seeing one on the road the next day). This thesis explores the value of alignment in learning to integrate between conceptual systems. It takes a joint experimental and computational approach, which simultaneously facilitates insights on alignment processes in controlled environments and at scale. The role of alignment in learning is explored from three perspectives, yielding three distinct contributions. In Chapter 2, signatures of alignment are identified in a real-world setting: children’s early concept learning. Moving to a controlled experimental setting, Chapter 3 demonstrates that humans benefit from alignment signals in cross-system learning, and finds that models which attempt the asynchronous alignment of systems best capture human behaviour. Chapter 4 implements these insights in machine-learning systems, using alignment to tackle cross-modal learning problems at scale. Alignment processes prove valuable to human learning across conceptual systems, providing a fresh perspective on learning that complements prevailing event-based accounts. This research opens doors for machine learning systems to harness alignment mechanisms for cross-modal learning, thus reducing their reliance on extensive supervision by drawing inspiration from both human learning and the structure of the environment

UCL Discovery