The name entity disambiguation task aims to partition the records of multiple
real-life persons so that each partition contains records pertaining to a
unique person. Most of the existing solutions for this task operate in a batch
mode, where all records to be disambiguated are initially available to the
algorithm. However, more realistic settings require that the name
disambiguation task be performed in an online fashion and that the method be
able to identify records of new ambiguous entities that have no preexisting
records. In this work, we propose a Bayesian non-exhaustive classification
framework for solving the online name disambiguation task. Our proposed method uses
a Dirichlet process prior with a Normal × Normal × Inverse Wishart data model
which enables identification of new ambiguous entities who have no records in
the training data. For online classification, we use a one-sweep Gibbs sampler,
which is efficient and effective. As a case study, we consider
bibliographic data in a temporal stream format and disambiguate authors by
partitioning their papers into homogeneous groups. Our experimental results
demonstrate that the proposed method outperforms existing methods on the
online name disambiguation task.

Comment: to appear in CIKM 201
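
As a rough illustration of why a Dirichlet process prior allows records of previously unseen entities to be recognized, the sketch below computes the standard Chinese restaurant process prior for the next record: each existing class receives mass proportional to its current size, and a new class receives mass proportional to a concentration parameter. The function name `crp_prior`, the `labels` list, and the value of `alpha` are hypothetical choices made for this example. The sketch covers only the prior; it does not include the Normal, Normal, Inverse Wishart data model or the one-sweep Gibbs sampler described in the abstract.

```python
from collections import Counter


def crp_prior(labels, alpha):
    """Chinese restaurant process prior for the next record.

    labels: class labels of the records seen so far (hypothetical input).
    alpha:  concentration parameter; larger values favour new classes.
    Returns a dict mapping each existing label, plus the key "new",
    to its prior probability.
    """
    counts = Counter(labels)
    n = len(labels)
    # Existing class j gets n_j / (n + alpha).
    probs = {label: count / (n + alpha) for label, count in counts.items()}
    # A previously unseen class gets alpha / (n + alpha).
    probs["new"] = alpha / (n + alpha)
    return probs


# Four records already assigned: three to one author, one to another.
probs = crp_prior(["a1", "a1", "a1", "a2"], alpha=1.0)
```

In a full classifier, these prior weights would be multiplied by the likelihood of the incoming record under each class (and under the prior predictive for a new class) before normalizing; that step is omitted here.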