1,108 research outputs found
Distant Supervision for Entity Linking
Entity linking is an indispensable operation of populating knowledge
repositories for information extraction. It studies on aligning a textual
entity mention to its corresponding disambiguated entry in a knowledge
repository. In this paper, we propose a new paradigm named distantly supervised
entity linking (DSEL), in the sense that the disambiguated entities that belong
to a huge knowledge repository (Freebase) are automatically aligned to the
corresponding descriptive webpages (Wiki pages). In this way, a large scale of
weakly labeled data can be generated without manual annotation and fed to a
classifier for linking more newly discovered entities. Compared with
traditional paradigms based on solo knowledge base, DSEL benefits more via
jointly leveraging the respective advantages of Freebase and Wikipedia.
Specifically, the proposed paradigm facilitates bridging the disambiguated
labels (Freebase) of entities and their textual descriptions (Wikipedia) for
Web-scale entities. Experiments conducted on a dataset of 140,000 items and
60,000 features achieve a baseline F1-measure of 0.517. Furthermore, we analyze
the feature performance and improve the F1-measure to 0.545
Topic modeling for entity linking using keyphrase
This paper proposes an Entity Linking system that applies a topic modeling ranking. We apply a novel approach in order to provide new relevant elements to the model. These elements are keyphrases related to the queries and gathered from a huge Wikipedia-based knowledge resourcePeer ReviewedPostprint (author’s final draft
Toponym extraction and disambiguation enhancement using loops of feedback
Toponym extraction and disambiguation have received much attention in recent years. Typical fields addressing these topics are information retrieval, natural language processing, and semantic web. This paper addresses two problems with toponym extraction and disambiguation. First, almost no existing works examine the extraction and disambiguation interdependency. Second, existing disambiguation techniques mostly take as input extracted named entities without considering the uncertainty and imperfection of the extraction process. In this paper we aim to investigate both avenues and to show that explicit handling of the uncertainty of annotation has much potential for making both extraction and disambiguation more robust. We conducted experiments with a set of holiday home descriptions with the aim to extract and disambiguate toponyms. We show that the extraction confidence probabilities are useful in enhancing the effectiveness of disambiguation. Reciprocally, retraining the extraction models with information automatically derived from the disambiguation results, improves the extraction models. This mutual reinforcement is shown to even have an effect after several automatic iterations
Learning Relatedness Measures for Entity Linking
Entity Linking is the task of detecting, in text documents, relevant mentions to entities of a given knowledge base. To this end, entity-linking algorithms use several signals and features extracted from the input text or from the knowl- edge base. The most important of such features is entity relatedness. Indeed, we argue that these algorithms benefit from maximizing the relatedness among the relevant enti- ties selected for annotation, since this minimizes errors in disambiguating entity-linking.
The definition of an e↵ective relatedness function is thus a crucial point in any entity-linking algorithm. In this paper we address the problem of learning high-quality entity relatedness functions. First, we formalize the problem of learning entity relatedness as a learning-to-rank problem. We propose a methodology to create reference datasets on the basis of manually annotated data. Finally, we show that our machine-learned entity relatedness function performs better than other relatedness functions previously proposed, and, more importantly, improves the overall performance of dif- ferent state-of-the-art entity-linking algorithms
- …