Boosting Entity Linking Performance by Leveraging Unlabeled Documents
Modern entity linking systems rely on large collections of documents
specifically annotated for the task (e.g., AIDA CoNLL). In contrast, we propose
an approach which exploits only naturally occurring information: unlabeled
documents and Wikipedia. Our approach consists of two stages. First, we
construct a high recall list of candidate entities for each mention in an
unlabeled document. Second, we use the candidate lists as weak supervision to
constrain our document-level entity linking model. The model treats entities as
latent variables and, when estimated on a collection of unlabeled texts,
learns to choose entities relying both on local context of each mention and on
coherence with other entities in the document. The resulting approach rivals
fully-supervised state-of-the-art systems on standard test sets. It also
approaches their performance in a very challenging setting: when tested on a
test set sampled from the data used to train the supervised systems. By
comparing to Wikipedia-only training of our model, we demonstrate that modeling
unlabeled documents is beneficial.
Comment: ACL 2019
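To make the two-stage setup concrete, here is a toy Python sketch. The alias counts, relatedness oracle, and greedy scoring below are illustrative stand-ins of my own devising, not the authors' latent-variable model, which learns entity choices jointly from unlabeled documents.

```python
# Stage 1: a high-recall candidate list per mention, built from naturally
# occurring sources such as Wikipedia anchor text (hypothetical toy counts).
ALIAS_COUNTS = {
    "Paris": {"Paris": 900, "Paris,_Texas": 40, "Paris_Hilton": 60},
    "France": {"France": 990, "France_(band)": 10},
}

def candidates(mention, top_k=30):
    counts = ALIAS_COUNTS.get(mention, {})
    total = sum(counts.values()) or 1
    # Keep many candidates to stay high-recall; the prior P(e|m) acts as
    # weak supervision rather than a hard label.
    ranked = sorted(counts.items(), key=lambda kv: -kv[1])[:top_k]
    return [(entity, count / total) for entity, count in ranked]

# Stage 2 (greatly simplified): choose entities using both the mention-level
# prior and pairwise coherence with entities already chosen in the document.
RELATED = {("Paris", "France"), ("France", "Paris")}  # toy relatedness oracle

def link_document(mentions):
    assignment = {}
    for mention in mentions:
        best, best_score = None, float("-inf")
        for entity, prior in candidates(mention):
            coherence = sum(
                1.0 for e in assignment.values() if (entity, e) in RELATED
            )
            score = prior + 0.5 * coherence  # weights are arbitrary here
            if score > best_score:
                best, best_score = entity, score
        assignment[mention] = best
    return assignment

print(link_document(["France", "Paris"]))
# {'France': 'France', 'Paris': 'Paris'}
```

In the paper the analogue of this scoring is learned: the model treats the entity assignment as latent and is estimated so that local context and document-level coherence jointly explain the candidate lists.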
Evaluating the Impact of Knowledge Graph Context on Entity Disambiguation Models
Pretrained Transformer models have emerged as state-of-the-art approaches
that learn contextual information from text to improve the performance of
several NLP tasks. These models, albeit powerful, still require specialized
knowledge in specific scenarios. In this paper, we argue that context derived
from a knowledge graph (in our case: Wikidata) provides enough signals to
inform pretrained transformer models and improve their performance for named
entity disambiguation (NED) on the Wikidata KG. We further hypothesize that our
proposed KG context can be standardized for Wikipedia, and we evaluate the
impact of KG context on a state-of-the-art NED model for the Wikipedia knowledge
base. Our empirical results validate that the proposed KG context can be
generalized (for Wikipedia), and providing KG context in transformer
architectures considerably outperforms the existing baselines, including the
vanilla transformer models.
Comment: To appear in the proceedings of CIKM 2020
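A minimal sketch of the core idea, feeding linearized KG triples to a transformer as extra context. The serialization format and separator token below are assumptions for exposition; the paper's exact input construction may differ.

```python
# Linearize Wikidata-style (property, value) pairs for a candidate entity and
# pair them with the mention's sentence, as input to a cross-encoder that
# scores (mention-in-context, candidate-with-KG-context) pairs.

def serialize_kg_context(entity_label, description, triples, max_triples=5):
    """Turn Wikidata-style (property, value) pairs into plain text."""
    facts = "; ".join(f"{p} {v}" for p, v in triples[:max_triples])
    return f"{entity_label}: {description}. {facts}"

def build_ned_input(mention, sentence, candidate_context, sep="[SEP]"):
    # The resulting string is what a pretrained transformer would consume.
    return f"{mention} {sep} {sentence} {sep} {candidate_context}"

kg_context = serialize_kg_context(
    "Paris",
    "capital and largest city of France",
    [("instance of", "city"), ("country", "France")],
)
print(build_ned_input("Paris", "Paris hosted the 1900 Summer Olympics.", kg_context))
```

The claim in the abstract is precisely that these extra KG-derived signals, not just the raw sentence, are what lets the transformer separate closely related candidates.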
Improving Entity Linking through Semantic Reinforced Entity Embeddings
Entity embeddings, which represent different aspects of each entity with a
single vector like word embeddings, are a key component of neural entity
linking models. Existing entity embeddings are learned from canonical Wikipedia
articles and local contexts surrounding target entities. Such entity embeddings
are effective, but too distinctive for linking models to learn contextual
commonality. We propose a simple yet effective method, FGS2EE, to inject
fine-grained semantic information into entity embeddings to reduce the
distinctiveness and facilitate the learning of contextual commonality. FGS2EE
first uses the embeddings of semantic type words to generate semantic
embeddings, and then combines them with existing entity embeddings through
linear aggregation. Extensive experiments show the effectiveness of such
embeddings. Based on our entity embeddings, we achieved new state-of-the-art
performance on entity linking.
Comment: 6 pages, 3 figures, ACL 2020
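The aggregation step described above is simple enough to sketch directly. The toy vectors, type words, and the mixing weight alpha below are illustrative; the paper derives semantic type words per entity and sets the aggregation empirically.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8  # real entity and word embeddings are typically ~300-d

# Pretrained word embeddings for semantic type words (toy random stand-ins).
word_emb = {w: rng.normal(size=DIM) for w in ["politician", "lawyer", "city"]}

def semantic_embedding(type_words):
    """Average the embeddings of an entity's fine-grained type words."""
    return np.mean([word_emb[w] for w in type_words], axis=0)

def reinforce(entity_emb, type_words, alpha=0.5):
    """Linearly aggregate the original entity embedding with its semantic one."""
    vec = alpha * entity_emb + (1 - alpha) * semantic_embedding(type_words)
    return vec / np.linalg.norm(vec)  # renormalize for cosine-based linking

entity_emb = rng.normal(size=DIM)  # stand-in for a pretrained entity embedding
print(reinforce(entity_emb, ["politician", "lawyer"]).shape)  # (8,)
```

Pulling entities with shared types toward a common semantic direction is what reduces the "distinctiveness" the abstract mentions, making it easier for the linker to learn contextual patterns that transfer across entities.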
Fine-Grained Entity Typing for Domain Independent Entity Linking
Neural entity linking models are very powerful, but run the risk of
overfitting to the domain they are trained in. In this setting, a domain is
characterized not just by the genre of the text but by factors as specific as the
particular distribution of entities, as neural models tend to overfit by
memorizing properties of frequent entities in a dataset. We tackle the problem
of building robust entity linking models that generalize effectively and do not
rely on labeled entity linking data with a specific entity distribution. Rather
than predicting entities directly, our approach models fine-grained entity
properties, which can help disambiguate between even closely related entities.
We derive a large inventory of types (tens of thousands) from Wikipedia
categories, and use hyperlinked mentions in Wikipedia to distantly label data
and train an entity typing model. At test time, we classify a mention with this
typing model and use soft type predictions to link the mention to the most
similar candidate entity. We evaluate our entity linking system on the
CoNLL-YAGO dataset (Hoffart et al., 2011) and show that our approach
outperforms prior domain-independent entity linking systems. We also test our
approach in a harder setting derived from the WikilinksNED dataset (Eshel et
al., 2017) where all the mention-entity pairs are unseen during test time.
Results indicate that our approach generalizes better than a state-of-the-art
neural model on this dataset.
Comment: AAAI 2020
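The type-based linking step lends itself to a short sketch: score each candidate by how well its (distantly derived) type vector matches the typing model's soft prediction for the mention. The type inventory, candidate sets, and probabilities below are toy values, not the paper's actual inventory of tens of thousands of Wikipedia-category types.

```python
import numpy as np

TYPES = ["person", "politician", "city", "capital"]

def type_vector(active_types):
    """Binary indicator vector over the type inventory."""
    return np.array([1.0 if t in active_types else 0.0 for t in TYPES])

# Candidate entities with types distantly labeled from Wikipedia categories.
CANDIDATES = {
    "Paris": type_vector({"city", "capital"}),
    "Paris_Hilton": type_vector({"person"}),
}

def link(mention_type_probs):
    """Pick the candidate whose type vector best matches the soft prediction."""
    scores = {e: float(mention_type_probs @ v) for e, v in CANDIDATES.items()}
    return max(scores, key=scores.get), scores

# A typing model run on "Paris is the capital of France." might output:
probs = np.array([0.05, 0.02, 0.90, 0.85])
print(link(probs))  # ('Paris', {'Paris': 1.75, 'Paris_Hilton': 0.05})
```

Because the model only ever predicts types, never entity identities, nothing in it can memorize the entity distribution of the training domain, which is the source of the robustness the abstract claims.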
Autoregressive Entity Retrieval
Entities are at the center of how we represent and aggregate knowledge. For
instance, encyclopedias such as Wikipedia are structured around entities (e.g., one
per Wikipedia article). The ability to retrieve such entities given a query is
fundamental for knowledge-intensive tasks such as entity linking and
open-domain question answering. Current approaches can be understood as
classifiers among atomic labels, one for each entity. Their weight vectors are
dense entity representations produced by encoding entity meta information such
as their descriptions. This approach has several shortcomings: (i) context and
entity affinity is mainly captured through a vector dot product, potentially
missing fine-grained interactions; (ii) a large memory footprint is needed to
store dense representations when considering large entity sets; (iii) an
appropriately hard set of negative data has to be subsampled at training time.
In this work, we propose GENRE, the first system that retrieves entities by
generating their unique names, left to right, token-by-token in an
autoregressive fashion. This mitigates the aforementioned technical issues
since: (i) the autoregressive formulation directly captures relations between
context and entity name, effectively cross-encoding both; (ii) the memory
footprint is greatly reduced because the parameters of our encoder-decoder
architecture scale with vocabulary size, not entity count; (iii) the softmax
loss is computed without subsampling negative data. We experiment with more
than 20 datasets on entity disambiguation, end-to-end entity linking and
document retrieval tasks, achieving new state-of-the-art or very competitive
results while using a tiny fraction of the memory footprint of competing
systems. Finally, we demonstrate that new entities can be added by simply
specifying their names. Code and pre-trained models at
https://github.com/facebookresearch/GENRE.
Comment: Accepted (spotlight) at the International Conference on Learning
Representations (ICLR) 2021. Code at
https://github.com/facebookresearch/GENRE. 20 pages, 9 figures, 8 tables
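The key mechanism that keeps generation restricted to valid entity names is a prefix trie over tokenized names. Below is a minimal sketch of that idea; the integer token ids are made up for illustration, and the real system plugs a trie like this into constrained beam search over a pretrained encoder-decoder (e.g., via a prefix-allowed-tokens hook during decoding).

```python
class Trie:
    """Prefix trie over token-id sequences of entity names."""

    def __init__(self, sequences):
        self.root = {}
        for seq in sequences:
            node = self.root
            for tok in seq:
                node = node.setdefault(tok, {})

    def allowed(self, prefix):
        """Token ids that may follow the given prefix; empty means 'stop'."""
        node = self.root
        for tok in prefix:
            if tok not in node:
                return []
            node = node[tok]
        return list(node)

# Toy tokenizations of three entity names (ids are hypothetical).
ENTITY_TOKEN_IDS = [
    [5, 7],      # "New York"
    [5, 7, 9],   # "New York City"
    [5, 12],     # "New Jersey"
]
trie = Trie(ENTITY_TOKEN_IDS)

print(trie.allowed([]))      # [5]      - every name starts with "New"
print(trie.allowed([5]))     # [7, 12]  - "York" or "Jersey"
print(trie.allowed([5, 7]))  # [9]      - may continue to "City" (or stop)
```

Because the decoder's choices at each step are masked to the trie's continuations, the softmax is computed only over valid next tokens, and adding a new entity really does reduce to inserting its name into the trie.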