464 research outputs found

Effective Unsupervised Author Disambiguation with Relative Frequencies

Author: Backes Tobias
Publication venue: Association for Computing Machinery (ACM)
Publication date: 10/08/2018

This work addresses the problem of author name homonymy in the Web of Science. Aiming for an efficient, simple and straightforward solution, we introduce a novel probabilistic similarity measure for author name disambiguation based on feature overlap. Using the researcher-ID available for a subset of the Web of Science, we evaluate the application of this measure in the context of agglomeratively clustering author mentions. We focus on a concise evaluation that shows clearly for which problem setups and at which time during the clustering process our approach works best. In contrast to most other works in this field, we are sceptical towards the performance of author name disambiguation methods in general and compare our approach to the trivial single-cluster baseline. Our results are presented separately for each correct clustering size because, when all cases are treated together, the trivial baseline and more sophisticated approaches are hardly distinguishable in terms of evaluation results. Our model shows state-of-the-art performance for all correct clustering sizes without any discriminative training and with tuning only one convergence parameter. Comment: Proceedings of JCDL 201
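The agglomerative clustering of author mentions by feature overlap described in this abstract can be illustrated with a minimal sketch. The abstract does not give the paper's probabilistic similarity measure, so plain Jaccard overlap is swapped in here; the function names, the feature strings and the stopping threshold are all assumptions made for illustration, not the author's method.

```python
def jaccard(a, b):
    """Jaccard overlap of two feature sets (a stand-in for the paper's measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def agglomerate(mentions, threshold):
    """Greedily merge the two most similar clusters until no pair reaches the threshold.

    mentions  -- list of feature sets, one per author mention (hypothetical input format)
    threshold -- assumed stopping parameter, playing the role of a single tuned parameter
    """
    # Each cluster is (set of mention indices, union of their features).
    clusters = [({i}, set(features)) for i, features in enumerate(mentions)]
    while len(clusters) > 1:
        best, pair = -1.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                score = jaccard(clusters[i][1], clusters[j][1])
                if score > best:
                    best, pair = score, (i, j)
        if best < threshold:
            break
        i, j = pair
        merged = (clusters[i][0] | clusters[j][0], clusters[i][1] | clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return [sorted(ids) for ids, _ in clusters]


mentions = [
    {"coauthor:smith", "venue:jcdl"},
    {"coauthor:smith", "venue:jcdl", "kw:clustering"},
    {"coauthor:lee", "venue:nature"},
]
result = agglomerate(mentions, threshold=0.5)
```

With a threshold of 0, every mention ends up in one cluster, which corresponds to the trivial single-cluster baseline the abstract mentions.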

arXiv.org e-Print Archive

Crossref

Author disambiguation using multi-aspect similarity indicators

Author: A Somers
B Cassiman
DW Aksnes
Edwin Horlings
FJ Damerau
G Pasterkamp
HF Moed
HH Do
IS Kang
J Huang
J Nicolaisen
J Raffo
J Whittaker
L Leydesdorff
L Tang
N Onodera
P Healey
Peter van den Besselaar
R Wagner-Döbler
T Bates
Thomas Gurney
TJ Phelan
V Yank
VD Blondel
VI Levenshtein
Y Matsuo
Publication venue: Springer Netherlands
Publication date: 01/01/2011

Key to accurate bibliometric analyses is the ability to correctly link individuals to their corpus of work, with an optimal balance between precision and recall. We have developed an algorithm that performs this disambiguation task with very high recall and precision. The method addresses the issue of records being discarded due to null data fields and the resulting effect on recall, precision and F-measure results. We have implemented a dynamic approach to similarity calculations based on all available data fields. We have also included differences in author contribution and age difference between publications, both of which have meaningful effects on overall similarity measurements, resulting in significantly higher recall and precision of returned records. The results are presented from a test dataset of heterogeneous catalysis publications. Results demonstrate significantly high average F-measure scores and substantial improvements on previous and stand-alone techniques.
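One way to read the "dynamic approach to similarity calculations based on all available data fields" is to average per-field similarity over only those fields that are filled in for both records, so that a record with a null field is not discarded. The sketch below assumes that reading; the field names, the record format and the use of Jaccard overlap are hypothetical and are not taken from the paper.

```python
def jaccard(a, b):
    """Jaccard overlap of two sets of values."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def record_similarity(r1, r2):
    """Mean per-field overlap, computed only over fields present in both records.

    Records are dicts mapping a field name to a list of values, or to None
    when the field is missing (an assumed format for this illustration).
    """
    shared = [field for field in r1 if r1.get(field) and r2.get(field)]
    if not shared:
        return 0.0  # nothing comparable; the pair is scored, not dropped
    total = sum(jaccard(set(r1[f]), set(r2[f])) for f in shared)
    return total / len(shared)


r1 = {"coauthors": ["a", "b"], "keywords": ["x"], "address": None}
r2 = {"coauthors": ["a"], "keywords": ["x"], "address": ["vu"]}
score = record_similarity(r1, r2)
```

Here the null `address` field is skipped, and the score is the mean of the co-author and keyword overlaps.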

Crossref

VU Research Portal

Springer - Publisher Connector

PubMed Central

KNAW Repository

Combining machine learning and human judgment in author disambiguation

Author: Jianling Cui
Qinghua Zheng
Yanan Qian
Yunhua Hu
Zaiqing Nie
Publication date: 03/04/2020

Author disambiguation in digital libraries becomes increasingly difficult as the number of publications, and consequently the number of ambiguous author names, keeps growing. Fully automatic author disambiguation cannot give satisfactory results because signals are lacking in many cases. Furthermore, human judgment on the basis of automatic algorithms is also not suitable, because the automatically disambiguated results are often mixed and not understandable for humans. In this paper, we propose a Labeling Oriented Author Disambiguation approach, called LOAD, to combine machine learning and human judgment in author disambiguation. LOAD exploits a framework which consists of high precision clustering, high recall clustering, and top dissimilar clusters selection and ranking. In the framework, supervised learning algorithms are used to train the similarity functions between publications, and a clustering algorithm is then applied to generate clusters. To validate the effectiveness and efficiency of the proposed LOAD approach, comprehensive experiments are conducted. Compared with conventional author disambiguation algorithms, LOAD yields much more accurate results to assist human labeling. Further experiments show that the LOAD approach can save labeling time dramatically.

CiteSeerX

Researchers’ publication patterns and their use for author disambiguation

Author: CT Zhang
DW Aksnes
G Lewison
GA Barnett
HA Zuckerman
J Wang
JE Hirsch
JR Cole
KW Boyack
L Egghe
M Enserink
M Levin
M Schreiber
N Aswani
NR Smalheiser
P Jensen
RG Cota
RK Merton
S Wooding
T Gurney
V Larivière
VI Torvik
Y Gingras
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

In recent years, the need for advanced bibliometric indicators on individual researchers and research groups has increased, and such indicators require author disambiguation. Using the complete population of university professors and researchers in the Canadian province of Québec (N=13,479), their papers, and the papers authored by their homonyms, this paper provides evidence of regularities in researchers’ publication patterns. It shows how these patterns can be used to automatically assign papers to individuals and remove papers authored by their homonyms. Two types of patterns were found: 1) at the individual researchers’ level and 2) at the level of disciplines. On the whole, these patterns allow the construction of an algorithm that provides assignation information on at least one paper for 11,105 (82.4%) out of all 13,479 researchers, with a very low percentage of false positives (3.2%).

CiteSeerX

Crossref

Dépôt Institutionnel Numérique

Identifying experts through a framework for knowledge extraction from public online sources

Author: Buelens Simon
De Turck Filip
Hristoskova Anna
Putman Mattias
Tourwé Tom
Tsiporkova Elena
Publication venue: Ghent University, Department of Information technology
Publication date: 01/01/2012

Ghent University Academic Bibliography

A graph-based disambiguation approach for construction of an expert repository from public online sources

Author: Buelens S
De Turck Filip
Hristoskova Anna
Putman M
Tourwé T
Tsiporkova E
Publication date: 01/01/2013

Ghent University Academic Bibliography

The Impact of Name-Matching and Blocking on Author Disambiguation

Author: Davidson Ian
De Carvalho Ana Paula
Galvez Carmen
Ioannidis Yannis E.
Kardes Hakan
Kurien Biji T.
Publication venue: New York
Publication date: 01/01/2018

In this work, we address the problem of blocking in the context of author name disambiguation. We describe a framework that formalizes different ways of name-matching to determine which names could potentially refer to the same author. We focus on name variations that follow from specifying a name with different completeness (i.e. full first name or only initial). We extend this framework with a simple way to define traditional, new and custom blocking schemes. Then, we evaluate different old and new schemes in the Web of Science. In this context we define and compare a new type of blocking scheme. Based on these results, we discuss the question of whether name-matching can be used in blocking evaluation as a replacement for annotated author identifiers. Finally, we argue that blocking can have a strong impact on the application and evaluation of author disambiguation.
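The idea of matching names written with different completeness, and of grouping mentions into blocks before disambiguation, can be sketched as follows. This is only an illustration under assumptions: the "Surname, Given" input format, the prefix rule in `could_match` and the surname-plus-first-initial scheme are hypothetical choices, not the framework defined in the paper.

```python
from collections import defaultdict


def parse(name):
    """Split 'Surname, Given' into lowercase (surname, given), dropping a trailing dot."""
    surname, _, given = name.partition(",")
    return surname.strip().lower(), given.strip().lower().rstrip(".")


def could_match(a, b):
    """True if two names may refer to the same author.

    Assumed rule: same surname, and one given name is a prefix of the other,
    so an initial is compatible with any full first name starting with it.
    """
    surname_a, given_a = parse(a)
    surname_b, given_b = parse(b)
    return surname_a == surname_b and (
        given_a.startswith(given_b) or given_b.startswith(given_a)
    )


def block_by_surname_initial(names):
    """Group names by (surname, first initial), a traditional blocking scheme."""
    blocks = defaultdict(list)
    for name in names:
        surname, given = parse(name)
        blocks[(surname, given[:1])].append(name)
    return dict(blocks)


names = ["Backes, Tobias", "Backes, T.", "Backes, Thomas", "Gurney, T."]
blocks = block_by_surname_initial(names)
```

In this example the coarse scheme places "Backes, Tobias" and "Backes, Thomas" in the same block even though `could_match` treats them as incompatible, which shows how the choice of blocking scheme affects which mentions are compared later.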

Crossref

ZENODO

SSOAR - Social Science Open Access Repository

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY