Cross-Domain Neural Entity Linking
Entity Linking is the task of matching a mention in text to an entity in a given
knowledge base (KB). It enables annotating the massive number of documents on the
Web to harvest new facts about their matched entities. However, existing Entity
Linking systems are typically domain-dependent and robust only on the particular
knowledge base on which they were trained; their performance degrades when they
are evaluated on documents and knowledge bases from other domains.
Approaches based on pre-trained language models, such as Wu et al. (2020),
tackle the problem in a zero-shot setup and show promise when evaluated on a
general-domain KB. Nevertheless, their performance does not carry over to
domain-specific KBs. To enable more accurate Entity Linking across domains, we
propose our framework, Cross-Domain Neural Entity Linking (CDNEL). Our objective
is a single system that links simultaneously to both a general-domain KB and a
domain-specific KB. CDNEL works by learning a joint representation space for
knowledge bases from different domains. It is evaluated on the external Entity
Linking dataset (Zeshel) constructed by Logeswaran et al. (2019) and the Reddit
dataset collected by Botzer et al. (2021), comparing our proposed method against
state-of-the-art results. The framework is fine-tuned on different types of
datasets, yielding different CDNEL model variants. Evaluated on four domains
included in the Zeshel dataset, these variants achieve an average precision gain
of 9%.
Comment: Master's thesis, 76 pages, 26 figures
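As a toy illustration of the idea behind a joint representation space (not CDNEL's actual encoders, training data, or KBs, which only the thesis describes), a mention can be linked by embedding it and every candidate entity into one vector space and picking the nearest entity. The hash-based `embed` function below is a deliberately simple stand-in for a learned encoder:

```python
import numpy as np

DIM = 64

def tok_hash(tok: str) -> int:
    # Deterministic rolling hash into DIM buckets (stand-in for learned features).
    h = 0
    for ch in tok:
        h = (h * 31 + ord(ch)) % DIM
    return h

def embed(text: str) -> np.ndarray:
    """Hypothetical encoder: hash tokens into a fixed-size unit vector."""
    vec = np.zeros(DIM)
    for tok in text.lower().split():
        vec[tok_hash(tok.strip("()"))] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

# Toy KB: entity title -> embedding. A real joint space would hold entities
# from both a general-domain and a domain-specific KB side by side.
kb = {title: embed(title) for title in
      ["Paris (city)", "Paris (mythology)", "Berlin (city)"]}

def link(mention_context: str) -> str:
    """Return the KB entity whose embedding is closest to the mention's."""
    m = embed(mention_context)
    return max(kb, key=lambda title: float(kb[title] @ m))

print(link("the city Paris"))  # -> Paris (city)
```

Because mentions and entities from every domain share one space, the same nearest-neighbour search serves both KBs, which is the property CDNEL aims for.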
The Second Cross-Lingual Challenge on Recognition, Normalization, Classification, and Linking of Named Entities across Slavic Languages
We describe the Second Multilingual Named Entity Challenge in Slavic languages. The task is recognizing mentions of named entities in Web documents, their normalization, and cross-lingual linking. The Challenge was organized as part of the 7th Balto-Slavic Natural Language Processing Workshop, co-located with the ACL-2019 conference. Eight teams participated in the competition, which covered four languages and five entity types. Performance on the named entity recognition task reached 90% F-measure, much higher than reported in the first edition of the Challenge. Seven teams covered all four languages, and five teams participated in the cross-lingual entity linking task. Detailed evaluation information is available on the shared task web page.
Non peer reviewed
Joint Representation Learning of Cross-lingual Words and Entities via Attentive Distant Supervision
Joint representation learning of words and entities benefits many NLP tasks,
but has not been well explored in cross-lingual settings. In this paper, we
propose a novel method for joint representation learning of cross-lingual words
and entities. It captures mutually complementary knowledge, and enables
cross-lingual inferences among knowledge bases and texts. Our method does not
require parallel corpora, and automatically generates comparable data via
distant supervision using multi-lingual knowledge bases. We utilize two types
of regularizers to align cross-lingual words and entities, and design knowledge
attention and cross-lingual attention to further reduce noise. We conducted a
series of experiments on three tasks: word translation, entity relatedness, and
cross-lingual entity linking. The results, both qualitative and quantitative,
demonstrate the significance of our method.
Comment: 11 pages, EMNLP 2018
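The abstract's core idea of pulling two embedding spaces into alignment can be illustrated with orthogonal Procrustes, a standard closed-form alignment given seed translation pairs. Note this is a simplified stand-in, not the paper's method, which uses regularizers and attention over distantly supervised comparable data rather than a seed lexicon; all names and sizes below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 8, 20

# Toy seed dictionary: n translation pairs with known embeddings.
src = rng.normal(size=(n, d))                    # source-language word vectors
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))     # hidden "true" rotation
tgt = src @ Q + 0.01 * rng.normal(size=(n, d))   # target vectors, slightly noisy

# Orthogonal Procrustes: W = argmin ||src @ W - tgt||_F over orthogonal W,
# solved in closed form from the SVD of src.T @ tgt.
U, _, Vt = np.linalg.svd(src.T @ tgt)
W = U @ Vt

# Relative alignment error should be small, roughly the injected noise level.
err = np.linalg.norm(src @ W - tgt) / np.linalg.norm(tgt)
print(err)
```

Once such a mapping exists, nearest-neighbour search across the aligned spaces supports tasks like word translation and cross-lingual entity linking, the evaluations the abstract reports.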