39,941 research outputs found
MeLinDa: an interlinking framework for the web of data
The web of data consists of data published on the web in such a way that they
can be interpreted and connected together. It is thus critical to establish
links between these data, both for the web of data and for the semantic web
that it contributes to feed. We consider here the various techniques developed
for that purpose and analyze their commonalities and differences. We propose a
general framework and show how the diverse techniques fit in the framework.
From this framework we consider the relation between data interlinking and
ontology matching. Although, they can be considered similar at a certain level
(they both relate formal entities), they serve different purposes, but would
find a mutual benefit at collaborating. We thus present a scheme under which it
is possible for data linking tools to take advantage of ontology alignments.Comment: N° RR-7691 (2011
Probabilistic Bag-Of-Hyperlinks Model for Entity Linking
Many fundamental problems in natural language processing rely on determining
what entities appear in a given text. Commonly referenced as entity linking,
this step is a fundamental component of many NLP tasks such as text
understanding, automatic summarization, semantic search or machine translation.
Name ambiguity, word polysemy, context dependencies and a heavy-tailed
distribution of entities contribute to the complexity of this problem.
We here propose a probabilistic approach that makes use of an effective
graphical model to perform collective entity disambiguation. Input mentions
(i.e.,~linkable token spans) are disambiguated jointly across an entire
document by combining a document-level prior of entity co-occurrences with
local information captured from mentions and their surrounding context. The
model is based on simple sufficient statistics extracted from data, thus
relying on few parameters to be learned.
Our method does not require extensive feature engineering, nor an expensive
training procedure. We use loopy belief propagation to perform approximate
inference. The low complexity of our model makes this step sufficiently fast
for real-time usage. We demonstrate the accuracy of our approach on a wide
range of benchmark datasets, showing that it matches, and in many cases
outperforms, existing state-of-the-art methods
- …