3,529 research outputs found

Name Disambiguation from link data in a collaboration graph

Author: Al Hasan Mohammad
Saha Tanay Kumar
Zhang Baichuan
Publication venue: Office of the Vice Chancellor for Research
Publication date: 17/04/2015

Poster abstract. The entity disambiguation task partitions the records belonging to multiple persons so that each partition contains the records of a single person. Existing solutions to this task use either biographical attributes or auxiliary features collected from external sources, such as Wikipedia. However, in many scenarios such auxiliary features are unavailable or costly to obtain. Moreover, collecting biographical or external data carries a risk of privacy violation. In this work, we propose a method for solving the entity disambiguation task using link information obtained from a collaboration network. Our method does not intrude on privacy, as it uses only the time-stamped graph topology of an anonymized network. Experimental results on two real-life academic collaboration networks show that the proposed method has satisfactory performance.

IUPUIScholarWorks

Name Disambiguation from link data in a collaboration graph using temporal and topological features

Author: Hasan Mohammad Al
Saha Tanay Kumar
Zhang Baichuan
Publication date: 01/12/2015

In a social community, multiple persons may share the same name, phone number, or other identifying attributes. This, along with other phenomena such as name abbreviation, name misspelling, and human error, leads to records of multiple persons being erroneously aggregated under a single reference. Such mistakes degrade the performance of document retrieval, web search, and database integration and, more importantly, cause improper attribution of credit (or blame). The entity disambiguation task partitions the records belonging to multiple persons so that each partition contains the records of a single person. Existing solutions to this task use either biographical attributes or auxiliary features collected from external sources, such as Wikipedia. However, in many scenarios such auxiliary features are unavailable or costly to obtain. Moreover, collecting biographical or external data carries a risk of privacy violation. In this work, we propose a method for solving the entity disambiguation task using link information obtained from a collaboration network. Our method does not intrude on privacy, as it uses only the time-stamped graph topology of an anonymized network. Experimental results on two real-life academic collaboration networks show that the proposed method has satisfactory performance.

Comment: The short version of this paper has been accepted to ASONAM 201

arXiv.org e-Print Archive

IUPUIScholarWorks

A graph-based disambiguation approach for construction of an expert repository from public online sources

Author: Buelens S
De Turck Filip
Hristoskova Anna
Putman M
Tourwé T
Tsiporkova E
Publication date: 01/01/2013

Ghent University Academic Bibliography

Identifying Geographic Clusters: A Network Analytic Approach

Author: Catini Roberto
Karamshuk Dmytro
Penner Orion
Riccaboni Massimo
Publication venue: 'Elsevier BV'
Publication date: 01/01/2015

In recent years there has been growing interest in the role of networks and clusters in the global economy. Despite being a popular research topic in economics, sociology, and urban studies, geographical clustering of human activity has often been studied by means of predetermined geographical units such as administrative divisions and metropolitan areas. This approach is intrinsically time-invariant and does not allow one to differentiate between activities. Our goal in this paper is to present a new methodology for identifying clusters that can be applied to different empirical settings. We use a graph approach based on k-shell decomposition to analyze world biomedical research clusters based on PubMed scientific publications. We identify research institutions and locate their activities in geographical clusters. Leading areas of scientific production and their top-performing research institutions are consistently identified at different geographic scales.
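The abstract above names k-shell decomposition without describing it. As a rough illustration only, the following Python sketch shows the textbook peeling procedure on an undirected graph. The function name `k_shell_decomposition` and the toy edge list are assumptions made for this example; how the paper builds its graph from PubMed data is not stated in the abstract and is not reproduced here.

```python
from collections import defaultdict


def k_shell_decomposition(edges):
    """Return {node: shell index} for an undirected graph given as edge pairs.

    Textbook peeling: for k = 1, 2, ... repeatedly remove every node whose
    remaining degree is at most k; nodes removed in round k form the k-shell.
    (Hypothetical helper, not taken from the paper.)
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj[u].add(v)
        adj[v].add(u)

    degree = {node: len(nbrs) for node, nbrs in adj.items()}
    remaining = set(adj)
    shell = {}
    k = 0
    while remaining:
        k += 1
        peel = [n for n in remaining if degree[n] <= k]
        while peel:
            node = peel.pop()
            if node not in remaining:
                continue  # already removed in this round
            remaining.discard(node)
            shell[node] = k
            for nbr in adj[node]:
                if nbr in remaining:
                    degree[nbr] -= 1
                    if degree[nbr] <= k:
                        peel.append(nbr)
    return shell


# Toy graph (assumed data): a triangle a-b-c with a pendant node d attached to a.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("a", "d")]
shells = k_shell_decomposition(toy_edges)
```

In the toy graph, the pendant node is peeled in the first round and the triangle in the second, so densely interconnected nodes receive higher shell indices than peripheral ones.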

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Munich RePEc Personal Archive

Crossref

Archivio della ricerca della Scuola IMT Alti Studi Lucca

King's Research Portal

IMT Institutional Repository

Bayesian Non-Exhaustive Classification A Case Study: Online Name Disambiguation using Temporal Record Streams

Author: Bunescu R.
Chen P.-Y.
Davis A.
de Carvalho A. P.
Dundar M.
Lee D. D.
Michaud D. J.
Sethuraman J.
Zhang B.
Publication date: 01/09/2016

The name entity disambiguation task aims to partition the records of multiple real-life persons so that each partition contains records pertaining to a single person. Most existing solutions for this task operate in batch mode, where all records to be disambiguated are initially available to the algorithm. However, more realistic settings require that name disambiguation be performed in an online fashion and that records of new ambiguous entities with no preexisting records be identified. In this work, we propose a Bayesian non-exhaustive classification framework for solving the online name disambiguation task. Our proposed method uses a Dirichlet process prior with a Normal × Normal × Inverse Wishart data model, which enables identification of new ambiguous entities that have no records in the training data. For online classification, we use a one-sweep Gibbs sampler, which is very efficient and effective. As a case study, we consider bibliographic data in a temporal stream format and disambiguate authors by partitioning their papers into homogeneous groups. Our experimental results demonstrate that the proposed method is better than existing methods for performing the online name disambiguation task.

Comment: to appear in CIKM 201
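To make the role of the Dirichlet process prior more concrete, the sketch below computes only the prior part of the assignment probabilities under the standard Chinese restaurant process view: a new record joins an existing group with probability proportional to that group's size and opens a new group with probability proportional to a concentration parameter. The function name `crp_prior`, the parameter `alpha`, and the example counts are assumptions for illustration. The data likelihood from the Normal × Normal × Inverse Wishart model and the one-sweep Gibbs sampler are not shown, so this is not the authors' implementation.

```python
def crp_prior(cluster_sizes, alpha):
    """Prior assignment probabilities under a Chinese restaurant process.

    cluster_sizes: number of records already assigned to each known entity.
    alpha: concentration parameter (assumed value, not given in the abstract).
    Returns (probabilities for existing clusters, probability of a new cluster).
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    total = sum(cluster_sizes) + alpha
    existing = [size / total for size in cluster_sizes]
    new_cluster = alpha / total
    return existing, new_cluster


# Hypothetical example: two known authors with 3 and 1 papers, alpha = 1.
existing_probs, new_prob = crp_prior([3, 1], alpha=1.0)
```

Because the probability of a new group is never zero, a classifier built on this prior can assign a record to an entity absent from the training data, which is the property the abstract highlights.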

arXiv.org e-Print Archive

Crossref

IUPUIScholarWorks