Multireader, Multicase Receiver Operating Characteristic Analysis: An Empirical Comparison of Five Methods

Abstract

Rationale and Objectives. Several statistical methods have been developed for analyzing multireader, multicase (MRMC) receiver operating characteristic (ROC) studies. The objective of this article is to increase awareness of these methods and to determine whether their results are concordant for published datasets. Materials and Methods. Data from three previously published studies were reanalyzed using five MRMC methods. For each method, the 95% confidence intervals (CIs) for the mean of the readers’ ROC areas for each diagnostic test, the P value for the comparison of the diagnostic tests’ mean accuracies, and the 95% CIs for the mean difference in ROC areas of the diagnostic tests were reported. Results. Important differences in P values and CIs were seen when using parametric versus nonparametric estimates of accuracy, and there were the expected differences for random-reader versus fixed-reader models. Controlling for these differences, the Dorfman-Berbaum-Metz (DBM), Obuchowski-Rockette, Beiden-Wagner-Campbell, and Song’s multivariate Wilcoxon-Mann-Whitney (WMW) methods gave almost identical results for the fixed-reader model. For the random-reader model, the DBM, Obuchowski-Rockette, and Beiden-Wagner-Campbell methods yielded approximately the same inferences, but the CIs for the Beiden-Wagner-Campbell method tended to be broader. Ishwaran’s hierarchical ROC method sometimes yielded significance not found with the other methods. Song’s modification of DBM’s jackknifing algorithm sometimes led to different conclusions than the original DBM algorithm.
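As an illustration of the quantities compared in the abstract, the following is a minimal sketch of a nonparametric (Wilcoxon-Mann-Whitney) estimate of each reader's ROC area, averaged over readers for each diagnostic test. It is not any of the five MRMC methods and computes no CIs or P values. The function names and the rating data are hypothetical and are not taken from the reanalyzed studies.

```python
from statistics import mean


def wmw_auc(diseased, nondiseased):
    """Nonparametric ROC area: the fraction of (diseased, nondiseased)
    pairs in which the diseased case has the higher rating, with tied
    ratings counted as one half."""
    total = 0.0
    for x in diseased:
        for y in nondiseased:
            if x > y:
                total += 1.0
            elif x == y:
                total += 0.5
    return total / (len(diseased) * len(nondiseased))


def mean_reader_auc(ratings_by_reader):
    """Average the per-reader ROC areas for one diagnostic test.
    Each element is a (diseased_ratings, nondiseased_ratings) pair."""
    return mean(wmw_auc(d, n) for d, n in ratings_by_reader)


# Hypothetical 5-point confidence ratings: two readers, two tests.
test_a = [([4, 5, 3, 5], [1, 2, 3, 2]), ([3, 4, 4, 5], [2, 2, 1, 3])]
test_b = [([3, 4, 2, 5], [1, 3, 3, 2]), ([3, 3, 4, 5], [2, 3, 1, 3])]

# Point estimate of the mean difference in ROC areas between the tests.
difference = mean_reader_auc(test_a) - mean_reader_auc(test_b)
print(mean_reader_auc(test_a), mean_reader_auc(test_b), difference)
```

The MRMC methods named above differ mainly in how they attach a variance to this mean difference (for example, by treating readers as fixed or random), which this sketch does not attempt.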
