A Survey of Multi-task Learning in Natural Language Processing: Regarding Task Relatedness and Training Methods
Multi-task learning (MTL) has become increasingly popular in natural language
processing (NLP) because it improves the performance of related tasks by
exploiting their commonalities and differences. Nevertheless, it is still not
well understood how multi-task learning can be implemented based on the
relatedness of training tasks. In this survey, we review recent advances in
multi-task learning methods in NLP, with the aim of summarizing them into two
general multi-task training methods based on their task relatedness: (i) joint
training and (ii) multi-step training. We present examples in various NLP
downstream applications, summarize the task relationships and discuss future
directions of this promising topic.
Comment: Accepted to EACL 2023 as regular long paper
PersoNER: Persian named-entity recognition
© 1963-2018 ACL. Named-Entity Recognition (NER) is still a challenging task for languages with low digital resources. The main difficulties arise from the scarcity of annotated corpora and the resulting difficulty of training an effective NER pipeline. To bridge this gap, in this paper we target the Persian language, which is spoken by over a hundred million people worldwide. We first present and provide ArmanPersoNERCorpus, the first manually annotated Persian NER corpus. Then, we introduce PersoNER, an NER pipeline for Persian that leverages a word embedding and a sequential max-margin classifier. The experimental results show that the proposed approach is capable of achieving interesting MUC7 and CoNLL scores while outperforming two alternatives based on a CRF and a recurrent neural network.
Bringing order into the realm of Transformer-based language models for artificial intelligence and law
Transformer-based language models (TLMs) have been widely recognized as a
cutting-edge technology for the successful development of deep-learning-based
solutions to problems and applications that require natural language processing
and understanding. As in other textual domains, TLMs have pushed the
state of the art of AI approaches for many tasks of interest in the legal
domain. Although the first Transformer model was proposed only about six years ago,
this technology has progressed at an unprecedented rate,
with BERT and related models serving as a major reference, also in the legal
domain. This article provides the first systematic overview of TLM-based
methods for AI-driven problems and tasks in the legal sphere. A major goal is
to highlight research advances in this field so as to understand, on the one
hand, how the Transformers have contributed to the success of AI in supporting
legal processes, and on the other hand, what the current limitations and
opportunities for further research development are.
Comment: Please refer to the published version: Greco, C.M., Tagarelli, A.
(2023) Bringing order into the realm of Transformer-based language models for
artificial intelligence and law. Artif Intell Law, Springer Nature. November
2023. https://doi.org/10.1007/s10506-023-09374-
Graph Neural Networks for Natural Language Processing: A Survey
Deep learning has become the dominant approach in coping with various tasks
in Natural Language Processing (NLP). Although text inputs are typically
represented as a sequence of tokens, there is a rich variety of NLP problems
that can be best expressed with a graph structure. As a result, there is a surge
of interests in developing new deep learning techniques on graphs for a large
number of NLP tasks. In this survey, we present a comprehensive overview on Graph
Neural Networks (GNNs) for Natural Language Processing. We propose a new
taxonomy of GNNs for NLP, which systematically organizes existing research of
GNNs for NLP along three axes: graph construction, graph representation
learning, and graph based encoder-decoder models. We further introduce a large
number of NLP applications that are exploiting the power of GNNs and summarize
the corresponding benchmark datasets, evaluation metrics, and open-source codes.
Finally, we discuss various outstanding challenges for making the full use of
GNNs for NLP as well as future research directions. To the best of our
knowledge, this is the first comprehensive overview of Graph Neural Networks for
Natural Language Processing.
Comment: 127 pages
A Survey of GPT-3 Family Large Language Models Including ChatGPT and GPT-4
Large language models (LLMs) are a special class of pretrained language
models obtained by scaling model size, pretraining corpus and computation.
LLMs, because of their large size and pretraining on large volumes of text
data, exhibit special abilities which allow them to achieve remarkable
performances without any task-specific training in many of the natural language
processing tasks. The era of LLMs started with OpenAI's GPT-3 model, and the
popularity of LLMs has increased exponentially since the introduction of models
like ChatGPT and GPT-4. We refer to GPT-3 and its successor OpenAI models,
including ChatGPT and GPT-4, as GPT-3 family large language models (GLLMs). With
the ever-rising popularity of GLLMs, especially in the research community,
there is a strong need for a comprehensive survey which summarizes the recent
research progress in multiple dimensions and can guide the research community
with insightful future research directions. We start the survey paper with
foundational concepts like transformers, transfer learning, self-supervised
learning, pretrained language models and large language models. We then present
a brief overview of GLLMs and discuss the performances of GLLMs in various
downstream tasks, specific domains and multiple languages. We also discuss the
data labelling and data augmentation abilities of GLLMs, the robustness of
GLLMs, the effectiveness of GLLMs as evaluators, and finally, conclude with
multiple insightful future research directions. To summarize, this
comprehensive survey paper will serve as a good resource for both academics and
industry practitioners to stay updated with the latest research related to GPT-3
family large language models.
Comment: Preprint under review, 58 pages