Author disambiguation using multi-aspect similarity indicators

A Somers; B Cassiman; DW Aksnes; Edwin Horlings; FJ Damerau; G Pasterkamp; HF Moed; HH Do; IS Kang; J Huang; J Nicolaisen; J Raffo; J Whittaker; L Leydesdorff; L Leydesdorff; L Tang; N Onodera; P Healey; Peter van den Besselaar; R Wagner-Döbler; T Bates; Thomas Gurney; TJ Phelan; V Yank; VD Blondel; VI Levenshtein; Y Matsuo

Publication date: 1 January 2011
Publisher: Springer Netherlands

Abstract

Accurate bibliometric analysis depends on correctly linking individuals to their corpus of work, with an optimal balance between precision and recall. We have developed an algorithm that performs this disambiguation task with very high recall and precision. The method addresses the problem of records being discarded because of null data fields, and the resulting effect on recall, precision and F-measure. We implement a dynamic approach that calculates similarity from all available data fields. We also account for differences in author contribution and for the age difference between publications, both of which have meaningful effects on overall similarity measurements and result in significantly higher recall and precision of returned records. Results are presented for a test dataset of heterogeneous catalysis publications. They show high average F-measure scores and substantial improvements over previous and stand-alone techniques.
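
The idea of computing similarity from whichever data fields are available, rather than discarding records with null fields, can be illustrated with a minimal sketch. This is not the authors' implementation: the field names, the use of Jaccard overlap per field, the unweighted average, and the function names `dynamic_similarity` and `f_measure` are all assumptions made for illustration. The abstract does not specify how author contribution or publication age difference enter the calculation, so they are left out here.

```python
from typing import Iterable, Mapping, Optional, Sequence


def jaccard(a: set, b: set) -> float:
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dynamic_similarity(
    rec_a: Mapping[str, Optional[Iterable[str]]],
    rec_b: Mapping[str, Optional[Iterable[str]]],
    fields: Sequence[str],
) -> Optional[float]:
    """Average per-field similarity over fields populated in BOTH records.

    Assumption: a field that is missing or empty in either record is
    skipped, so the pair is still scored on its remaining fields.
    Returns None when the records share no populated field.
    """
    scores = []
    for field in fields:
        value_a, value_b = rec_a.get(field), rec_b.get(field)
        if not value_a or not value_b:
            continue  # null field: skip it, do not discard the record pair
        scores.append(jaccard(set(value_a), set(value_b)))
    return sum(scores) / len(scores) if scores else None


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical records: the address field is null in the first one.
paper_1 = {"coauthors": ["x", "y"], "keywords": ["k1", "k2"], "address": None}
paper_2 = {"coauthors": ["y", "z"], "keywords": ["k1", "k2"], "address": ["Delft"]}
score = dynamic_similarity(paper_1, paper_2, ["coauthors", "keywords", "address"])
```

In this sketch the pair is scored on co-authors and keywords only, because the address is null in one record. Dividing by the number of fields actually compared keeps scores comparable between record pairs with different amounts of missing data, which is one plausible reading of "dynamic" in the abstract.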