Bipartite Flat-Graph Network for Nested Named Entity Recognition
In this paper, we propose a novel bipartite flat-graph network (BiFlaG) for
nested named entity recognition (NER), which contains two subgraph modules: a
flat NER module for outermost entities and a graph module for all the entities
located in inner layers. Bidirectional LSTM (BiLSTM) and graph convolutional
network (GCN) are adopted to jointly learn flat entities and their inner
dependencies. Different from previous models, which only consider the
unidirectional delivery of information from innermost layers to outer ones (or
outside-to-inside), our model effectively captures the bidirectional
interaction between them. We first use the entities recognized by the flat NER
module to construct an entity graph, which is fed to the next graph module. The
richer representation learned from the graph module carries the dependencies of
inner entities and can be exploited to improve outermost entity predictions.
Experimental results on three standard nested NER datasets demonstrate that our
BiFlaG outperforms previous state-of-the-art models.
Comment: Accepted by ACL202
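The split between outermost and inner entities described above can be illustrated with a minimal sketch. This is not the BiFlaG implementation: the span format `(start, end_exclusive, label)`, the function names `split_outermost_and_inner` and `span_adjacency`, and the choice to link every pair of tokens inside a recognized span are all assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical span format: (start, end_exclusive, label)
Span = Tuple[int, int, str]


def split_outermost_and_inner(spans: List[Span]) -> Tuple[List[Span], List[Span]]:
    """Separate spans that are strictly contained in a longer span (inner)
    from spans that are not contained in any other span (outermost)."""
    outer, inner = [], []
    for s in spans:
        contained = any(
            o[0] <= s[0] and s[1] <= o[1] and (o[1] - o[0]) > (s[1] - s[0])
            for o in spans
        )
        (inner if contained else outer).append(s)
    return outer, inner


def span_adjacency(num_tokens: int, spans: List[Span]) -> List[List[int]]:
    """Build a token-level adjacency matrix that links all tokens inside
    the same span (an assumed graph construction, for illustration)."""
    adj = [[0] * num_tokens for _ in range(num_tokens)]
    for start, end, _ in spans:
        for i in range(start, end):
            for j in range(start, end):
                adj[i][j] = 1
    return adj


# Toy example: a 6-token sentence with one nested entity.
spans = [(0, 4, "ORG"), (3, 4, "GPE"), (5, 6, "PER")]
outer, inner = split_outermost_and_inner(spans)
adj = span_adjacency(6, outer)
```

In this toy input, the `GPE` span lies inside the `ORG` span, so it is the only inner entity; the adjacency matrix built from the outermost spans is the kind of structure a graph module could consume.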
S2F-NER: Exploring Sequence-to-Forest Generation for Complex Entity Recognition
Named Entity Recognition (NER) remains challenging due to complex
entities, such as nested, overlapping, and discontinuous entities. Existing
approaches, such as sequence-to-sequence (Seq2Seq) generation and span-based
classification, have shown impressive performance on various NER subtasks, but
they are difficult to scale to datasets with longer input text because of
either the exposure bias issue or inefficient computation. In this paper, we
propose a novel Sequence-to-Forest generation paradigm, S2F-NER, which can
directly extract entities in a sentence via a Forest decoder that decodes multiple
entities in parallel rather than sequentially. Specifically, our model generates
each path of each tree in the forest autoregressively, where the maximum depth of
each tree is three (which is the shortest feasible length for complex NER and
is far smaller than the decoding length of Seq2Seq). Based on this novel
paradigm, our model can elegantly mitigate the exposure bias problem and keep
the simplicity of Seq2Seq. Experimental results show that our model
significantly outperforms the baselines on three discontinuous NER datasets and
on two nested NER datasets, especially for discontinuous entity recognition
- …