
    The role of knowledge in determining identity of long-tail entities

    NIL entities have no accessible representation, which means that their identity cannot be established through traditional disambiguation. Consequently, they have received little attention in entity linking systems and tasks so far. Given the non-redundancy of knowledge about NIL entities, the lack of frequency priors, their potentially extreme ambiguity, and their sheer number, they form an extreme class of long-tail entities and pose a great challenge for state-of-the-art systems. In this paper, we investigate the role of knowledge in establishing the identity of NIL entities mentioned in text. What kind of knowledge can be applied to establish the identity of NILs? Can we potentially link to them at a later point? How can we capture implicit knowledge and fill knowledge gaps in communication? We formulate and test hypotheses to provide insight into these questions. Because instance-level knowledge is unavailable, we propose to enrich the locally extracted information with profiling models that rely on background knowledge in Wikidata. We describe and implement two profiling machines based on state-of-the-art neural models. We evaluate their intrinsic behavior and their impact on the task of determining the identity of NIL entities.

    Semantic Enrichment of a Multilingual Archive with Linked Open Data

    This paper introduces MERCKX, a Multilingual Entity/Resource Combiner & Knowledge eXtractor. A case study involving the semantic enrichment of a multilingual archive is presented with the aim of assessing the relevance of natural language processing techniques such as named-entity recognition and entity linking for cultural heritage material. In order to improve the indexing of historical collections, we map entities to the Linked Open Data cloud using a language-independent method. Our evaluation shows that MERCKX outperforms similar tools on the task of place disambiguation and linking, achieving over 80% precision despite lower recall scores. These results are encouraging for small and medium-sized cultural institutions since they demonstrate that semantic enrichment can be achieved with limited resources.

    Linking named entities to Wikipedia

    Natural language is fraught with problems of ambiguity, including name reference. A name in text can refer to multiple entities, just as an entity can be known by different names. This thesis examines how a mention in text can be linked to an external knowledge base (KB), in our case, Wikipedia. The named entity linking (NEL) task requires systems to identify the KB entry, or Wikipedia article, that a mention refers to; or, if the KB does not contain the correct entry, return NIL. Entity linking systems can be complex, and we present a framework for analysing their different components. We use this framework to analyse three seminal systems, evaluated on a common dataset, and show the importance of precise search for linking. The Text Analysis Conference (TAC) is a major venue for NEL research. We report on our submissions to the entity linking shared task in 2010, 2011 and 2012. The information required to disambiguate entities is often found in the text, close to the mention. We explore apposition, a common way for authors to provide information about entities. We model syntactic and semantic restrictions with a joint model that achieves state-of-the-art apposition extraction performance. We generalise from apposition to examine local descriptions specified close to the mention. We add local description to our state-of-the-art linker by using patterns to extract the descriptions and matching against this restricted context. Not only does this make for a more precise match, but we are also able to model failure to match. Local descriptions help disambiguate entities, further improving our state-of-the-art linker. The work in this thesis seeks to link textual entity mentions to knowledge bases. Linking is important for any task where external world knowledge is used, and resolving ambiguity is fundamental to advancing research into these problems.