How reliable are unsupervised author disambiguation algorithms in the assessment of research organization performance?
The paper examines the extent of bias in the performance rankings of research
organisations when the assessments are based on unsupervised author-name
disambiguation algorithms. It compares the outcomes of a research performance
evaluation exercise of Italian universities, in which the unsupervised approach
of Caron and van Eck (2014) is used to derive the universities' research staff,
with those of a benchmark that uses the supervised algorithm of D'Angelo,
Giuffrida, and Abramo (2011), which makes use of input data. The methodology
developed could be replicated for comparative analyses in other frameworks of
national or international interest, giving practitioners a precise measure of
the extent of distortions inherent in any evaluation exercise that uses
unsupervised algorithms. This could in turn be useful in informing
policy-makers' decisions on whether to invest in building national research
staff databases, instead of settling for the unsupervised approaches with
their measurement biases.
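To make the idea of ranking distortion concrete, the sketch below compares two rankings of the same institutions and reports how far each one moves. It is only an illustration: the abstract does not say which distortion measure the paper uses, so the mean absolute rank shift, the function names, the institution labels, and the scores are all assumptions.

```python
def rank_positions(scores):
    """Map each institution to its rank (1 = best) by descending score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: pos for pos, name in enumerate(ordered, start=1)}


def rank_shifts(benchmark_scores, test_scores):
    """Rank under the tested method minus rank under the benchmark."""
    bench = rank_positions(benchmark_scores)
    test = rank_positions(test_scores)
    return {name: test[name] - bench[name] for name in bench}


# Hypothetical performance scores, not data from the paper.
benchmark = {"Univ A": 1.30, "Univ B": 1.10, "Univ C": 0.90, "Univ D": 0.70}
unsupervised = {"Univ A": 1.25, "Univ B": 0.85, "Univ C": 0.95, "Univ D": 0.70}

shifts = rank_shifts(benchmark, unsupervised)
mean_abs_shift = sum(abs(s) for s in shifts.values()) / len(shifts)
print(shifts)          # positive value = institution drops in the ranking
print(mean_abs_shift)
```

A rank correlation coefficient would be an equally reasonable summary; the per-institution shifts are shown here because they expose which organisations gain or lose from the choice of algorithm.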
Author disambiguation in PubMed: Evidence on the precision and recall of Author-ity among NIH-funded scientists
We examined the usefulness (precision) and completeness (recall) of the Author-ity author disambiguation for PubMed articles by associating articles with scientists funded by the National Institutes of Health (NIH). In doing so, we exploited established unique identifiers, the Principal Investigator (PI) IDs that the NIH assigns to funded scientists. Analyzing a set of 36,987 NIH scientists who received their first R01 grant between 1985 and 2009, we identified 355,921 articles appearing in PubMed that would allow us to evaluate the precision and recall of the Author-ity disambiguation. We found that Author-ity identified the NIH scientists with 99.51% precision across the articles, with a corresponding recall of 99.64%. Moreover, precision and recall appeared stable across common and uncommon last names, across ethnic backgrounds, and across levels of scientist productivity.
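As a reminder of how the two measures are defined, the sketch below computes precision and recall for a single scientist by comparing the articles a disambiguation algorithm assigns to that person with a gold-standard set. It uses the standard set-based definitions and is not the study's evaluation pipeline; the function name and the article IDs are hypothetical.

```python
def precision_recall(predicted, actual):
    """Set-based precision and recall for one author's article set."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    # Precision: share of assigned articles that truly belong to the author.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of the author's true articles that were assigned.
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical article IDs, not data from the study.
assigned_by_algorithm = {"a1", "a2", "a3", "a4"}
gold_standard = {"a1", "a2", "a3", "a5", "a6"}

p, r = precision_recall(assigned_by_algorithm, gold_standard)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case three of the four assigned articles are correct and three of the five true articles are found, so precision exceeds recall.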