
    Correlation, prediction and ranking of evaluation metrics in information retrieval

    41st European Conference on Information Retrieval, ECIR (2019: Cologne, Germany)

    Given limited time and space, IR studies often report few evaluation metrics, which must be carefully selected. To inform such selection, we first quantify correlation between 23 popular IR metrics on 8 TREC test collections. Next, we investigate prediction of unreported metrics: given 1–3 metrics, we assess the best predictors for 10 others. We show that accurate prediction of MAP, P@10, and RBP can be achieved using 2–3 other metrics. We further explore whether high-cost evaluation measures can be predicted using low-cost measures. We show RBP(p = 0.95) at cutoff depth 1000 can be accurately predicted given measures computed at depth 30. Lastly, we present a novel model for ranking evaluation metrics based on covariance, enabling selection of a set of metrics that are most informative and distinctive. A greedy-forward approach is guaranteed to yield sub-modular results, while an iterative-backward method is empirically found to achieve the best results.

    © Springer Nature Switzerland AG 2019.

    Acknowledgements. This work was made possible by NPRP grant #NPRP 7-1313-1-245 from the Qatar National Research Fund (a member of the Qatar Foundation). The statements made herein are solely the responsibility of the authors.
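As a rough illustration of the first two steps in the abstract (measuring correlation between metrics, then asking which reported metric best predicts an unreported one), the sketch below computes Pearson correlation over per-system scores and picks the single most correlated metric. It is a minimal sketch under stated assumptions, not the authors' method: the abstract does not say which correlation measure or prediction model the paper uses, the helper names `pearson` and `best_predictor` are hypothetical, and the scores are made-up numbers, not results from the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def best_predictor(scores, target):
    """Return the metric whose scores correlate most strongly (in absolute
    value) with the target metric. Hypothetical helper for illustration."""
    candidates = [m for m in scores if m != target]
    return max(candidates, key=lambda m: abs(pearson(scores[m], scores[target])))


# Made-up per-system scores (one value per retrieval system); assumption only.
scores = {
    "MAP": [0.10, 0.20, 0.30, 0.40, 0.50],
    "P@10": [0.15, 0.22, 0.33, 0.41, 0.52],
    "RBP": [0.50, 0.10, 0.40, 0.20, 0.30],
}

print(best_predictor(scores, "MAP"))
```

With these toy numbers, P@10 tracks MAP almost linearly while RBP does not, so P@10 is selected. A rank correlation such as Kendall's tau could be substituted for `pearson` without changing the structure.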