Acta Cybernetica : Volume 20. Number 3.

Publication date: 01/01/2012

Web Person Name Disambiguation Using Social Links and Enriched Profile Information

Authors: Abdollahzadeh Barforoush Ahmad; Emami Hojjat; Shirazi Hossein
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 04/02/2019

In this article, we investigate the problem of cross-document person name disambiguation, which aims to resolve ambiguities between person names and to cluster web documents according to which of the persons sharing a name they refer to. Most previous work formulates cross-document name disambiguation as a clustering problem. These methods use various syntactic and semantic features, drawn either from the local corpus or from distant knowledge bases, to compute similarities between entities and group similar entities. However, these approaches show limitations in robustness and performance. We propose an unsupervised, graph-based name disambiguation approach to improve on the performance and robustness of the state of the art. Our approach exploits both local information extracted from the given corpus and global information obtained from distant knowledge bases. We demonstrate its effectiveness by testing it on the standard WePS datasets. The experimental results are encouraging: the proposed method outperforms several baseline methods as well as its counterparts, improving both the performance and the robustness of name disambiguation.

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

The role of knowledge in determining identity of long-tail entities

Authors: Hovy Eduard; Ilievski Filip; Schlobach Stefan; Vossen Piek; Xie Qizhe
Publication venue: Elsevier BV
Publication date: 01/03/2020

NIL entities do not have an accessible representation, which means that their identity cannot be established through traditional disambiguation. Consequently, they have received little attention in entity linking systems and tasks so far. Given the non-redundancy of knowledge on NIL entities, the lack of frequency priors, their potentially extreme ambiguity, and their sheer number, they form an extreme class of long-tail entities and pose a great challenge for state-of-the-art systems. In this paper, we investigate the role of knowledge in establishing the identity of NIL entities mentioned in text. What kind of knowledge can be applied to establish the identity of NILs? Can we potentially link to them at a later point? How can we capture implicit knowledge and fill knowledge gaps in communication? We formulate and test hypotheses to provide insight into these questions. Because instance-level knowledge is unavailable, we propose to enrich the locally extracted information with profiling models that rely on background knowledge in Wikidata. We describe and implement two profiling machines based on state-of-the-art neural models. We evaluate their intrinsic behavior and their impact on the task of determining the identity of NIL entities.

VU Research Portal

Person attribute extraction from the textual parts of web pages

Author: Nagy T. István
Publication date: 01/01/2012

We present a web mining system that clusters persons sharing the same name and also extracts bibliographical information about them. The input of our system is the result of web search engine queries in English or in Hungarian. For evaluation on English, our system (RGAI) participated in the third Web People Search Task challenge [1]. Our approach differs from the others chiefly in that we focus on the raw textual parts of the web pages instead of the structured parts, we group similar attribute classes together, and we explicitly handle their interdependencies. The RGAI system achieved top results on the person attribute extraction subtask and average results on the person clustering subtask. Following the shared task annotation principles, we also manually constructed a Hungarian person disambiguation corpus and adapted our system from English to Hungarian. We present experimental results on this corpus as well.

Crossref

University of Szeged

Person Attribute Extraction from the Textual Parts of Web Pages

Publication venue: University of Szeged
Publication date: 01/01/2012

Crossref