32,257 research outputs found

    Computationally Efficient Confidence Intervals for Cross-validated Area Under the ROC Curve Estimates

    In binary classification problems, the area under the ROC curve (AUC) is an effective means of measuring the performance of your model. Most often, cross-validation is also used, in order to assess how the results will generalize to an independent data set. In order to evaluate the quality of an estimate for cross-validated AUC, we must obtain an estimate for its variance. For massive data sets, the process of generating a single performance estimate can be computationally expensive. Additionally, when using a complex prediction method, calculating the cross-validated AUC on even a relatively small data set can still require a large amount of computation time. Thus, when the cost of obtaining a single estimate of cross-validated AUC is significant, the bootstrap, as a means of variance estimation, can be computationally intractable. As an alternative to the bootstrap, we demonstrate a computationally efficient influence curve based approach to obtaining a variance estimate for cross-validated AUC.
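To make the contrast with the bootstrap concrete, the sketch below computes an AUC and a closed-form variance estimate in a single pass over the data, with no resampling. It is not the method of the paper listed above: it assumes a plain (non-cross-validated) AUC and uses the closely related structural-components (DeLong-style) estimator. The function name `auc_with_variance` and the toy scores are hypothetical.

```python
import math


def auc_with_variance(pos, neg):
    """Return (AUC, variance estimate) for scores of positives and negatives.

    Assumption: plain, non-cross-validated AUC with a DeLong-style
    structural-components variance. Needs at least two scores per class.
    """
    m, n = len(pos), len(neg)

    def psi(p, q):
        # 1 if the positive outranks the negative, 1/2 on ties, else 0.
        return 1.0 if p > q else (0.5 if p == q else 0.0)

    # Per-observation components: how well each case is separated
    # from the opposite class.
    v10 = [sum(psi(p, q) for q in neg) / n for p in pos]
    v01 = [sum(psi(p, q) for p in pos) / m for q in neg]

    auc = sum(v10) / m
    s10 = sum((v - auc) ** 2 for v in v10) / (m - 1)
    s01 = sum((v - auc) ** 2 for v in v01) / (n - 1)
    return auc, s10 / m + s01 / n


# Hypothetical toy scores (higher means "more likely positive").
auc, var = auc_with_variance([0.9, 0.8, 0.4], [0.7, 0.3, 0.2, 0.1])
half_width = 1.96 * math.sqrt(var)  # normal-approximation 95% interval
print(f"AUC = {auc:.3f}, 95% CI half-width = {half_width:.3f}")
```

A bootstrap would instead recompute the AUC (and, in the cross-validated case, refit the model) on every resample, which is the cost the abstract describes as potentially intractable.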

    Confidence Bands for ROC Curves: Methods and an Empirical Study

    In this paper we study techniques for generating and evaluating confidence bands on ROC curves. ROC curve evaluation is rapidly becoming a commonly used evaluation metric in machine learning, although evaluating ROC curves has thus far been limited to studying the area under the curve (AUC) or generation of one-dimensional confidence intervals by freezing one variable: the false-positive rate, or threshold on the classification scoring function. Researchers in the medical field have long been using ROC curves and have many well-studied methods for analyzing such curves, including generating confidence intervals as well as simultaneous confidence bands. In this paper we introduce these techniques to the machine learning community and show their empirical fitness on the Covertype data set, a standard machine learning benchmark from the UCI repository. We show how some of these methods work remarkably well, others are too loose, and that existing machine learning methods for generation of 1-dimensional confidence intervals do not translate well to generation of simultaneous bands; their bands are too tight. NYU, Stern School of Business, IOMS Department, Center for Digital Economy Researc

    Confidence Bands for ROC Curves

    In this paper we study techniques for generating and evaluating confidence bands on ROC curves. ROC curve evaluation is rapidly becoming a commonly used evaluation metric in machine learning, although evaluating ROC curves has thus far been limited to studying the area under the curve (AUC) or generation of one-dimensional confidence intervals by freezing one variable: the false-positive rate, or threshold on the classification scoring function. Researchers in the medical field have long been using ROC curves and have many well-studied methods for analyzing such curves, including generating confidence intervals as well as simultaneous confidence bands. In this paper we introduce these techniques to the machine learning community and show their empirical fitness on the Covertype data set, a standard machine learning benchmark from the UCI repository. We show how some of these methods work remarkably well, others are too loose, and that existing machine learning methods for generation of 1-dimensional confidence intervals do not translate well to generation of simultaneous bands; their bands are too tight. Information Systems Working Papers Serie

    Parameters behind "nonparametric" statistics: Kendall's tau, Somers' D and median differences

    So-called "nonparametric" statistical methods are often in fact based on population parameters, which can be estimated (with confidence limits) using the corresponding sample statistics. This article reviews the uses of three such parameters, namely Kendall's tau, Somers' D and the Hodges-Lehmann median difference. Confidence intervals for these are demonstrated using the somersd package. It is argued that confidence limits for these parameters, and their differences, are more informative than the traditional practice of reporting only p-values. These three parameters are also important in defining other tests and parameters, such as the Wilcoxon test, the area under the receiver operating characteristic (ROC) curve, Harrell's C, and the Theil median slope. Copyright 2002 by Stata Corporation. Keywords: confidence intervals, Gehan test, Harrell's C, Hodges-Lehmann median difference, Kendall's tau, nonparametric methods, rank correlation, rank-sum test, ROC area, Somers' D, Theil median slope, Wilcoxon test

    Measuring the Discriminative Power of Rating Systems

    Assessing the discriminative power of rating systems is an important question to banks and to regulators. In this article we analyze the Cumulative Accuracy Profile (CAP) and the Receiver Operating Characteristic (ROC), which are both commonly used in practice. We give a test-theoretic interpretation for the concavity of the CAP and the ROC curve and demonstrate how this observation can be used for more efficiently exploiting the informational contents of accounting ratios. Furthermore, we show that two popular summary statistics of these concepts, namely the Accuracy Ratio and the area under the ROC curve, contain the same information, and we analyze the statistical properties of these measures. We show in detail how to identify accounting ratios with high discriminative power, how to calculate confidence intervals for the area below the ROC curve, and how to test if two rating models validated on the same data set are different. All concepts are illustrated by applications to real data. Keywords: Validation, Rating Models, Credit Analysis
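The claim that the Accuracy Ratio and the area under the ROC curve carry the same information is usually expressed through the linear relation AR = 2 * AUC - 1. The sketch below illustrates that relation on made-up scores; the function names and the data are hypothetical and are not taken from the article.

```python
from itertools import product


def auc_pairwise(scores_default, scores_non_default):
    """Share of (defaulter, non-defaulter) pairs in which the defaulter
    has the higher risk score; ties count as one half."""
    total = 0.0
    for d, n in product(scores_default, scores_non_default):
        if d > n:
            total += 1.0
        elif d == n:
            total += 0.5
    return total / (len(scores_default) * len(scores_non_default))


def accuracy_ratio(auc):
    """Accuracy Ratio from AUC via the linear relation AR = 2 * AUC - 1."""
    return 2.0 * auc - 1.0


# Hypothetical risk scores (higher means riskier).
defaulters = [0.9, 0.8, 0.4]
non_defaulters = [0.7, 0.3, 0.2, 0.1]

auc = auc_pairwise(defaulters, non_defaulters)
print(f"AUC = {auc:.4f}, AR = {accuracy_ratio(auc):.4f}")
```

Because the mapping is linear and invertible, ranking rating models by one measure ranks them identically by the other, and a confidence interval for the AUC converts directly into one for the AR.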

    Evaluation of Area under the Constant Shape Bi-Weibull ROC Curve

    The Receiver Operating Characteristic (ROC) curve generated under the assumption of a constant shape Bi-Weibull distribution is studied. In the context of ROC curve analysis, it is assumed that biomarker values from controls and cases follow some specific distribution and the accuracy is evaluated by using the ROC model developed from that specified distribution. This article assumes that the biomarker values from the two groups follow Weibull distributions with equal shape parameter and different scale parameters. The ROC model, area under the ROC curve (AUC), and asymptotic and bootstrap confidence intervals for the AUC are derived. Theoretical results are validated by simulation studies.

    Semi-Parametric Inference for the Partial Area Under the ROC Curve

    Diagnostic tests are central in the field of modern medicine. One of the main factors for interpreting a diagnostic test is the discriminatory accuracy. For a continuous-scale diagnostic test, the area under the receiver operating characteristic (ROC) curve, AUC, is a useful one-number summary index for the diagnostic accuracy of the test. When only a particular region of the ROC curve is of interest, the partial AUC (pAUC) is a more appropriate index for the diagnostic accuracy. In this thesis, we develop seven confidence intervals for the pAUC under the semi-parametric models for the diseased and non-diseased populations by using the normal approximation, bootstrap and empirical likelihood methods. In addition, we conduct simulation studies to compare the finite sample performance of the proposed confidence intervals for the pAUC. A real example is also used to illustrate the application of the recommended intervals.

    Efficient Calculation of Jackknife Confidence Intervals for Rank Statistics

    An algorithm is presented for calculating concordance-discordance totals in a time of order N log N, where N is the number of observations, using a balanced binary search tree. These totals can be used to calculate jackknife estimates and confidence limits in the same time order for a very wide range of rank statistics, including Kendall's tau, Somers' D, Harrell's c, the area under the receiver operating characteristic (ROC) curve, the Gini coefficient, and the parameters underlying the sign and rank-sum tests. A Stata package is introduced for calculating confidence intervals for these rank statistics using this algorithm, which has been implemented in the Mata compilable matrix programming language supplied with Stata.

    Novel Nonparametric Methods For ROC Curves

    The receiver operating characteristic (ROC) curve is a widely used graphical method for evaluating the discriminating power of a diagnostic test or a statistical model in various areas such as epidemiology, industrial quality control and material testing. One important quantitative measure summarizing the ROC curve is the area under the ROC curve (AUC). The accuracy of two diagnostic tests with right censored data can be compared using the difference of two ROC curves and the difference of two AUCs. Moreover, the difference of two volumes under surfaces (VUS) is investigated to compare two treatments for the discrimination of three-category classification data, extending the ROC curve to the ROC surface in the three-dimensional case. Some scientific progress has been made on ROC curves and related fields over the past decades. In this dissertation, we propose a plug-in empirical likelihood (EL) procedure, combining placement values and inverse probability weighting techniques, to construct stable and precise confidence intervals for the ROC curves, the difference of two ROC curves, the AUCs and the difference of two AUCs with right censoring. We prove that the limiting distribution of the EL ratio is a weighted $\chi^2$ distribution. Furthermore, we introduce a jackknife empirical likelihood (JEL) procedure to explore the difference of two correlated VUSs with complete data. We prove that the limiting distribution of the proposed JEL ratio is a $\chi^2$ distribution, i.e., Wilks' theorem holds. Extensive simulation studies demonstrate that the new methods have better performance than the existing methods in terms of coverage probability of confidence intervals in most cases. Finally, the proposed methods are applied to analyze data sets on Primary Biliary Cirrhosis (PBC), Alzheimer's disease, etc.
