A Survey of Multi-task Learning in Natural Language Processing: Regarding Task Relatedness and Training Methods

Authors: Guo Zhichun, Jiang Meng, Yu Mengxia, Yu Wenhao, Zhang Zhihan
Publication date: 14/02/2023

Multi-task learning (MTL) has become increasingly popular in natural language processing (NLP) because it improves the performance of related tasks by exploiting their commonalities and differences. Nevertheless, it is still not understood very well how multi-task learning can be implemented based on the relatedness of training tasks. In this survey, we review recent advances of multi-task learning methods in NLP, with the aim of summarizing them into two general multi-task training methods based on their task relatedness: (i) joint training and (ii) multi-step training. We present examples in various NLP downstream applications, summarize the task relationships and discuss future directions of this promising topic.

Comment: Accepted to EACL 2023 as regular long paper

arXiv.org e-Print Archive

Adaptive End-to-End Metric Learning for Zero-Shot Cross-Domain Slot Filling

Authors: Shao Minglai, Shi Yuanjun, Wu Linzhi
Publication date: 23/10/2023

Recently slot filling has witnessed great development thanks to deep learning and the availability of large-scale annotated data. However, it poses a critical challenge to handle a novel domain whose samples are never seen during training. The recognition performance might be greatly degraded due to severe domain shifts. Most prior works deal with this problem in a two-pass pipeline manner based on metric learning. In practice, these dominant pipeline models may be limited in computational efficiency and generalization capacity because of non-parallel inference and context-free discrete label embeddings. To this end, we re-examine the typical metric-based methods, and propose a new adaptive end-to-end metric learning scheme for the challenging zero-shot slot filling. Considering simplicity, efficiency and generalizability, we present a cascade-style joint learning framework coupled with context-aware soft label representations and slot-level contrastive representation learning to mitigate the data and label shift problems effectively. Extensive experiments on public benchmarks demonstrate the superiority of the proposed approach over a series of competitive baselines.

Comment: Accepted to EMNLP 2023 (Main, Long Paper)

arXiv.org e-Print Archive

KILT: a Benchmark for Knowledge Intensive Language Tasks

Authors: De Cao Nicola, Fan Angela, Jernite Yacine, Karpukhin Vladimir, Lewis Patrick, Maillard Jean, Petroni Fabio, Piktus Aleksandra, Plachouras Vassilis, Riedel Sebastian, Rocktäschel Tim, Thorne James, Yazdani Majid
Publication date: 12/04/2021

Challenging problems such as open-domain question answering, fact checking, slot filling and entity linking require access to large, external knowledge sources. While some models do well on individual tasks, developing general models is difficult as each task might require computationally expensive indexing of custom knowledge sources, in addition to dedicated infrastructure. To catalyze research on models that condition on specific information in large textual resources, we present a benchmark for knowledge-intensive language tasks (KILT). All tasks in KILT are grounded in the same snapshot of Wikipedia, reducing engineering turnaround through the re-use of components, as well as accelerating research into task-agnostic memory architectures. We test both task-specific and general baselines, evaluating downstream performance in addition to the ability of the models to provide provenance. We find that a shared dense vector index coupled with a seq2seq model is a strong baseline, outperforming more tailor-made approaches for fact checking, open-domain question answering and dialogue, and yielding competitive results on entity linking and slot filling, by generating disambiguated text. KILT data and code are available at https://github.com/facebookresearch/KILT.

Comment: accepted at NAACL 202

arXiv.org e-Print Archive

UCL Discovery

A multi-task BERT model for schema-guided dialogue state tracking

Author: Kapelonis Eleftherios (Καπελώνης Ελευθέριος)
Publication date: 10/10/2022

DSpace at NTUA