A Global Context Mechanism for Sequence Labeling
Sequential labeling tasks necessitate the computation of sentence
representations for each word within a given sentence. With the advent of
advanced pretrained language models, one common approach involves incorporating
a BiLSTM layer to bolster the sequence structure information at the output
level. Nevertheless, it has been empirically demonstrated (P.-H. Li et al.,
2020) that the potential of BiLSTM for generating sentence representations for
sequence labeling tasks is constrained, primarily due to the amalgamation of
fragments from past and future sentence representations to form a complete
sentence representation. In this study, we discovered that strategically
integrating the whole-sentence representation, which resides in the first and last cells of the BiLSTM, into the representation of each cell could markedly enhance the F1 score and accuracy. Using BERT embeddings fed into a BiLSTM as an illustration, we conducted exhaustive experiments on nine datasets for
sequence labeling tasks, encompassing named entity recognition (NER), part-of-speech (POS) tagging, and end-to-end aspect-based sentiment analysis (E2E-ABSA).
We noted significant improvements in F1 scores and accuracy across all examined
datasets.
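As a minimal sketch of the described mechanism, assuming PyTorch and plain concatenation (the paper's integration may instead use a learned gate), the whole-sentence representation taken from the BiLSTM's final forward and backward hidden states can be appended to every time step before the tagging layer; the class name, dimensions, and label count below are illustrative.

```python
import torch
import torch.nn as nn

class BiLSTMWithGlobalContext(nn.Module):
    """Sketch only: BiLSTM tagger that concatenates the whole-sentence
    representation (final forward + final backward hidden state) onto
    every time step before classification."""

    def __init__(self, input_dim=768, hidden_dim=256, num_labels=9):
        super().__init__()
        self.bilstm = nn.LSTM(input_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        # per-token BiLSTM output (2H) + global context (2H)
        self.classifier = nn.Linear(4 * hidden_dim, num_labels)

    def forward(self, embeddings):
        # embeddings: (B, T, input_dim), e.g. BERT output vectors
        outputs, (h_n, _) = self.bilstm(embeddings)        # outputs: (B, T, 2H)
        # h_n: (2, B, H); h_n[0] / h_n[1] are the final forward / backward
        # states, i.e. the sentence summaries held by the last and first cells
        global_ctx = torch.cat([h_n[0], h_n[1]], dim=-1)   # (B, 2H)
        global_ctx = global_ctx.unsqueeze(1).expand(-1, outputs.size(1), -1)
        fused = torch.cat([outputs, global_ctx], dim=-1)   # (B, T, 4H)
        return self.classifier(fused)                      # per-token logits
```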
Exploration of Approaches to Arabic Named Entity Recognition
The Named Entity Recognition (NER) task has attracted significant attention in Natural Language Processing (NLP) as it can enhance the performance of many NLP applications. In this paper, we compare English NER with Arabic NER in an experimental way to investigate the impact of using different classifiers and sets of features, including language-independent and language-specific features. We explore the features and classifiers on five different datasets. We compare deep neural network architectures for NER with more traditional machine learning approaches to NER. We discover that most of the techniques and features used for English NER perform well on Arabic NER. Our results highlight the improvements achieved by using language-specific features in Arabic NER.
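The abstract does not list the exact classifiers or feature templates, so the following is only a generic sketch of the feature-based setup it alludes to: language-independent token features fed to a CRF, here via sklearn-crfsuite (an assumed library choice); the feature names, hyperparameters, and any Arabic-specific additions are placeholders.

```python
import sklearn_crfsuite

def token_features(sent, i):
    # Language-independent features; language-specific features (e.g. Arabic
    # morphological tags) would be added here in a real system.
    w = sent[i]
    return {
        "word": w,
        "prefix3": w[:3],
        "suffix3": w[-3:],
        "is_digit": w.isdigit(),
        "prev_word": sent[i - 1] if i > 0 else "<BOS>",
        "next_word": sent[i + 1] if i < len(sent) - 1 else "<EOS>",
    }

def sent_features(sent):
    return [token_features(sent, i) for i in range(len(sent))]

# X: one feature-dict list per sentence; y: one label list per sentence
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=100)
# crf.fit([sent_features(s) for s in train_sents], train_labels)
# predictions = crf.predict([sent_features(s) for s in test_sents])
```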
MINER: Improving Out-of-Vocabulary Named Entity Recognition from an Information Theoretic Perspective
NER models have achieved promising performance on standard NER benchmarks.
However, recent studies show that previous approaches may over-rely on entity
mention information, resulting in poor performance on out-of-vocabulary (OOV)
entity recognition. In this work, we propose MINER, a novel NER learning
framework, to remedy this issue from an information-theoretic perspective. The
proposed approach contains two mutual information-based training objectives: i)
generalizing information maximization, which enhances representation via deep
understanding of context and entity surface forms; ii) superfluous information
minimization, which discourages representation from rote memorizing entity
names or exploiting biased cues in data. Experiments on various settings and
datasets demonstrate that it achieves better performance in predicting OOV
entities.
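As a hedged sketch of how such objectives can be combined during training, assuming PyTorch, the function below adds an InfoNCE-style contrastive term (a stand-in for generalizing-information maximization over paired views of a mention and its context) and a KL-to-prior term (a variational-bottleneck stand-in for superfluous-information minimization) to the usual tagging cross-entropy; the estimator choices, tensor names, and weights are assumptions, not MINER's published objectives.

```python
import torch
import torch.nn.functional as F

def info_theoretic_ner_loss(logits, labels, anchor_repr, positive_repr,
                            z_mean, z_logvar, gamma=0.1, beta=1e-3):
    # (0) standard token-level cross entropy for the NER tags
    task_loss = F.cross_entropy(logits.view(-1, logits.size(-1)), labels.view(-1))

    # (i) InfoNCE-style term: paired views of the same mention/context should
    # be closer to each other than to other examples in the batch
    anchor = F.normalize(anchor_repr, dim=-1)
    positive = F.normalize(positive_repr, dim=-1)
    sim = anchor @ positive.t() / 0.07                      # (B, B) similarities
    targets = torch.arange(sim.size(0), device=sim.device)
    contrastive = F.cross_entropy(sim, targets)

    # (ii) KL(q(z|x) || N(0, I)): discourage storing superfluous information
    kl = -0.5 * torch.mean(1 + z_logvar - z_mean.pow(2) - z_logvar.exp())

    return task_loss + gamma * contrastive + beta * kl
```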
Open, Closed, or Small Language Models for Text Classification?
Recent advancements in large language models have demonstrated remarkable
capabilities across various NLP tasks. But many questions remain, including
whether open-source models match closed ones, why these models excel or
struggle with certain tasks, and what types of practical procedures can improve
performance. We address these questions in the context of classification by
evaluating three classes of models using eight datasets across three distinct
tasks: named entity recognition, political party prediction, and misinformation
detection. While larger LLMs often lead to improved performance, open-source
models can rival their closed-source counterparts when fine-tuned. Moreover,
supervised smaller models, like RoBERTa, can achieve similar or even greater
performance on many datasets compared to generative LLMs. On the other hand,
closed models maintain an advantage in hard tasks that demand the most
generalizability. This study underscores the importance of model selection
based on task requirements.
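For the "supervised smaller model" setting highlighted above, a minimal fine-tuning sketch with Hugging Face Transformers might look as follows; the checkpoint, hyperparameters, and label count are illustrative assumptions, and the datasets are assumed to be pre-tokenized with labels aligned to subword tokens, which is not necessarily the paper's experimental configuration.

```python
from transformers import (AutoModelForTokenClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

def build_roberta_ner_trainer(tokenized_train, tokenized_dev, num_labels=9):
    # Illustrative hyperparameters only; not the paper's configuration.
    tokenizer = AutoTokenizer.from_pretrained("roberta-base", add_prefix_space=True)
    model = AutoModelForTokenClassification.from_pretrained("roberta-base",
                                                            num_labels=num_labels)
    args = TrainingArguments(
        output_dir="roberta-ner",
        learning_rate=2e-5,
        per_device_train_batch_size=16,
        num_train_epochs=3,
    )
    return Trainer(model=model, args=args, train_dataset=tokenized_train,
                   eval_dataset=tokenized_dev, tokenizer=tokenizer)

# trainer = build_roberta_ner_trainer(train_dataset, dev_dataset)
# trainer.train()
```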
Win-Win Cooperation: Bundling Sequence and Span Models for Named Entity Recognition
For Named Entity Recognition (NER), sequence labeling-based and span-based
paradigms are quite different. Previous research has demonstrated that the two
paradigms have clear complementary advantages but, to the best of our knowledge, few models have attempted to leverage these advantages in a single NER model. In our
previous work, we proposed a paradigm known as Bundling Learning (BL) to
address the above problem. The BL paradigm bundles the two NER paradigms,
enabling NER models to jointly tune their parameters via a weighted sum of each paradigm's training loss. However, three critical issues remain unresolved:
When does BL work? Why does BL work? Can BL enhance the existing
state-of-the-art (SOTA) NER models? To address the first two issues, we
implement three NER models: a sequence labeling-based model (SeqNER), a span-based NER model (SpanNER), and BL-NER, which bundles SeqNER and SpanNER
together. We draw two conclusions regarding the two issues based on the
experimental results on eleven NER datasets from five domains. To investigate the third issue, we then apply BL to five existing SOTA NER models, consisting of three sequence labeling-based models and two span-based models. Experimental
results indicate that BL consistently enhances their performance, suggesting
that it is possible to construct a new SOTA NER system by incorporating BL into
the current SOTA system. Moreover, we find that BL reduces both entity boundary
and type prediction errors. In addition, we compare two commonly used tagging schemes as well as three types of span semantic representations
- …
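The weighted-sum bundling idea itself is straightforward to sketch: a shared encoder feeds both a sequence-labeling head and a span-classification head, and the two training losses are combined with a tunable weight. The heads, dimensions, and weighting below (assuming PyTorch) are illustrative stand-ins, not the SeqNER/SpanNER architectures from the paper.

```python
import torch
import torch.nn as nn

class BundledNER(nn.Module):
    """Sketch of Bundling Learning: shared encoder, two heads, weighted loss."""

    def __init__(self, encoder, hidden_dim=768, num_tags=9, num_types=5,
                 seq_weight=0.5):
        super().__init__()
        self.encoder = encoder                                  # e.g. a BERT-like encoder
        self.seq_head = nn.Linear(hidden_dim, num_tags)         # per-token BIO tags
        self.span_head = nn.Linear(2 * hidden_dim, num_types)   # (start, end) pairs
        self.seq_weight = seq_weight

    def forward(self, inputs, tag_labels, span_pairs, span_labels):
        hidden = self.encoder(inputs)                           # (B, T, H)
        ce = nn.CrossEntropyLoss()

        # sequence-labeling loss over every token
        tag_logits = self.seq_head(hidden)
        seq_loss = ce(tag_logits.view(-1, tag_logits.size(-1)), tag_labels.view(-1))

        # span loss: classify concatenated (start, end) token representations;
        # span_pairs holds integer indices of candidate spans, shape (B, S, 2)
        starts = torch.gather(hidden, 1,
                              span_pairs[..., 0:1].expand(-1, -1, hidden.size(-1)))
        ends = torch.gather(hidden, 1,
                            span_pairs[..., 1:2].expand(-1, -1, hidden.size(-1)))
        span_logits = self.span_head(torch.cat([starts, ends], dim=-1))
        span_loss = ce(span_logits.view(-1, span_logits.size(-1)),
                       span_labels.view(-1))

        # Bundling Learning: weighted sum of the two paradigms' losses
        return self.seq_weight * seq_loss + (1.0 - self.seq_weight) * span_loss
```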