
    ROC curves for regression

    “NOTICE: this is the author’s version of a work that was accepted for publication in Pattern Recognition. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Pattern Recognition Volume 46, Issue 12, December 2013, Pages 3395–3411, DOI: 10.1016/j.patcog.2013.06.014”

    Receiver Operating Characteristic (ROC) analysis is one of the most popular tools for the visual assessment and understanding of classifier performance. In this paper we present a new representation of regression models in the so-called regression ROC (RROC) space. The basic idea is to represent over-estimation against under-estimation. The curves are just drawn by adjusting a shift, a constant that is added (or subtracted) to the predictions, and plays a similar role as a threshold in classification. From here, we develop the notions of optimal operating condition, convexity, dominance, and explore several evaluation metrics that can be shown graphically, such as the area over the RROC curve (AOC). In particular, we show a novel and significant result: the AOC is equivalent to the error variance. We illustrate the application of RROC curves to resource estimation, namely the estimation of software project effort.

    I would like to thank Peter Flach and Nicolas Lachiche for some very useful comments and corrections on earlier versions of this paper, especially the suggestion of drawing normalised curves (dividing x-axis and y-axis by n).

    This work was supported by the MEC/MINECO projects CONSOLIDER-INGENIO CSD2007-00022 and TIN 2010-21062-C02-02, GVA project Prometeo/2008/051, the COST - European Cooperation in the field of Scientific and Technical Research IC0801 AT, and the REFRAME project granted by the European Coordinated Research on Long-term Challenges in Information and Communication Sciences & Technologies ERA-Net (CHIST-ERA), and funded by the respective national research councils and ministries.

    Hernández-Orallo, J. (2013). ROC curves for regression. Pattern Recognition. 46(12):3395-3411. https://doi.org/10.1016/j.patcog.2013.06.014
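
    The shift construction described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes that over-estimation is the sum of positive shifted errors, that under-estimation is the sum of negative shifted errors, and that the curve is unnormalised. The function names and the data are hypothetical. Under these assumptions the area over the curve works out to n² times the population error variance, divided by 2.

```python
from statistics import pvariance

def rroc_points(y_true, y_pred):
    """Return (OVER, UNDER) points of the RROC curve at its breakpoints."""
    errors = [p - t for p, t in zip(y_pred, y_true)]
    # A shift s is added to every prediction; the curve bends exactly where
    # a shifted error crosses zero, i.e. at s = -e_i.
    shifts = sorted(-e for e in errors)
    points = []
    for s in shifts:
        over = sum(max(e + s, 0.0) for e in errors)   # total over-estimation (>= 0)
        under = sum(min(e + s, 0.0) for e in errors)  # total under-estimation (<= 0)
        points.append((over, under))
    return points

def area_over_curve(points):
    """Trapezoidal area between the curve and the UNDER = 0 axis."""
    aoc = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        aoc += (x1 - x0) * (-(y0 + y1) / 2.0)
    return aoc

y_true = [1.0, 2.0, 3.0, 4.0]   # made-up targets
y_pred = [1.5, 1.0, 3.5, 6.0]   # made-up predictions
errors = [p - t for p, t in zip(y_pred, y_true)]
aoc = area_over_curve(rroc_points(y_true, y_pred))
scaled_variance = len(errors) ** 2 * pvariance(errors) / 2
print(aoc, scaled_variance)
```

    Because both coordinates are linear in the shift between breakpoints, the trapezoid rule is exact here, so the two printed values should agree up to floating-point rounding.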

    Estimation and Comparison of Receiver Operating Characteristic Curves

    The receiver operating characteristic (ROC) curve displays the capacity of a marker or diagnostic test to discriminate between two groups of subjects, cases versus controls. We present a comprehensive suite of Stata commands for performing ROC analysis. Non-parametric, semiparametric and parametric estimators are calculated. Comparisons between curves are based on the area or partial area under the ROC curve. Alternatively, pointwise comparisons between ROC curves or inverse ROC curves can be made. Options to adjust these analyses for covariates, and to perform ROC regression, are described in a companion article. We use a unified framework by representing the ROC curve as the distribution of the marker in cases after standardizing it to the control reference distribution.
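
    One way to read the "standardizing to the control reference distribution" idea is through placement values: each case's marker is replaced by the fraction of controls at or above it, and the ROC curve is the cumulative distribution of those fractions. The sketch below is in Python rather than Stata and does not reproduce the commands from the article. The function names and the data are hypothetical, and the use of ">=" for the control survival function is an assumption.

```python
def placement_values(cases, controls):
    """Fraction of controls with a marker value >= each case's value."""
    n = len(controls)
    return [sum(c >= y for c in controls) / n for y in cases]

def roc_at(t, pvs):
    """Empirical ROC(t): share of cases whose placement value is <= t."""
    return sum(pv <= t for pv in pvs) / len(pvs)

def auc_from_placement(pvs):
    """With no ties, AUC equals one minus the mean placement value."""
    return 1.0 - sum(pvs) / len(pvs)

controls = [1.0, 2.0, 3.0, 4.0]   # made-up control markers
cases = [2.5, 3.5, 5.0]           # made-up case markers
pvs = placement_values(cases, controls)
print(pvs, roc_at(0.25, pvs), auc_from_placement(pvs))
```

    Here `roc_at(0.25, pvs)` is the true positive rate at a false positive rate of 0.25, and the AUC matches the share of case-control pairs in which the case has the larger marker.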

    A Linear Regression Framework for Receiver Operating Characteristic (ROC) Curve Analysis

    In the field of medical diagnostic testing, the receiver operating characteristic (ROC) curve has long been used as a standard statistical tool to assess the accuracy of tests that yield continuous results. Although previous research in this area focused mostly on estimating the ROC curve, recently it has been recognized that the accuracy of a given test may fluctuate depending on certain factors, which motivates modelling covariate effects on the ROC curve. Comparing the corresponding ROC curves between two or more tests is a special case of covariate effect modelling. In this manuscript, we introduce a linear regression framework to model covariate effects on the ROC curve. We assume the ROC curve takes a specific parametric form for each covariate level and that the covariate effect is reflected in the parameters of the curves. The new method provides a unified approach for ROC curve analysis and is intuitive and easy to apply. Two real data sets are used to illustrate the new approach.

    ROC curve regression analysis: the use of ordinal regression models for diagnostic test assessment.

    Diagnostic tests commonly are characterized by their true positive (sensitivity) and true negative (specificity) classification rates, which rely on a single decision threshold to classify a test result as positive. A more complete description of test accuracy is given by the receiver operating characteristic (ROC) curve, a graph of the false positive and true positive rates obtained as the decision threshold is varied. A generalized regression methodology, which uses a class of ordinal regression models to estimate smoothed ROC curves, has been described. Data from a multi-institutional study comparing the accuracy of magnetic resonance (MR) imaging with computed tomography (CT) in detecting liver metastases, which are ideally suited for ROC regression analysis, are described. The general regression model is introduced and an estimate for the area under the ROC curve and its standard error using parameters of the ordinal regression model is given. An analysis of the liver data that highlights the utility of the methodology in parsimoniously adjusting comparisons for covariates is presented.
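
    The definition of the ROC curve given in the abstract, false positive rate against true positive rate as the threshold varies, can be illustrated with a short sketch. It computes the empirical (unsmoothed) curve for ordinal ratings, not the ordinal-regression estimate the article describes. The function names and the ratings are hypothetical.

```python
def roc_points(ratings, diseased):
    """Empirical (FPR, TPR) points, calling a result positive if rating >= threshold."""
    pos = sum(diseased)
    neg = len(diseased) - pos
    points = [(0.0, 0.0)]
    # Lower the threshold step by step so the curve runs from (0, 0) to (1, 1).
    for t in sorted(set(ratings), reverse=True):
        tp = sum(1 for r, d in zip(ratings, diseased) if r >= t and d == 1)
        fp = sum(1 for r, d in zip(ratings, diseased) if r >= t and d == 0)
        points.append((fp / neg, tp / pos))
    return points

def trapezoid_auc(points):
    """Area under the piecewise-linear curve through the points."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

ratings = [1, 2, 3, 3, 4, 5]    # made-up 5-point confidence ratings
diseased = [0, 0, 0, 1, 1, 1]   # made-up reference standard (1 = diseased)
curve = roc_points(ratings, diseased)
print(curve, trapezoid_auc(curve))
```

    Each distinct rating contributes one operating point, and the trapezoidal area counts a tied diseased/non-diseased pair as half a correct ranking.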

    Frequent Use of Fresh Frozen Plasma Is a Risk Factor for Venous Thrombosis in Extremely Low Birth Weight Infants: A Matched Case-control Study

    Percutaneously inserted central catheters (PICCs) are often used in neonatal medicine. Venous thrombosis (VT) is one of the complications associated with PICC use. According to some reports, fresh frozen plasma (FFP) may be a risk factor for VT. The purpose of this study was to determine whether FFP use is associated with VT in extremely low birth weight infants (ELBWIs). We performed a matched case-control study on risk factors for VT in ELBWIs born over a period of 5 years in the neonatal intensive care unit of a tertiary hospital. Controls were infants from the unit matched for gestational age and birth weight. We performed univariate analyses and created receiver operating characteristic (ROC) curves for the cut-off values of continuous parameters such as FFP. We also conducted multivariate conditional logistic regression analysis and calculated adjusted odds ratios and their 95% confidence intervals. Thirteen VT cases and 34 matched controls were examined. Using an ROC curve, FFP by day 5 > 50 mL/kg was selected as the cut-off value. In multivariate conditional logistic regression analysis, FFP by day 5 > 50 mL/kg exhibited an adjusted odds ratio of 5.88 (95% confidence interval: 1.12-41.81, p = 0.036). FFP by day 5 > 50 mL/kg may be a risk factor for VT in ELBWIs.
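
    The abstract does not say which criterion was used to read the cut-off from the ROC curve. One common choice is the Youden index (sensitivity + specificity - 1), and the sketch below assumes that criterion purely for illustration. The function name and the numbers are made up and are not the study's data.

```python
def youden_cutoff(values, is_case):
    """Return (cut-off, J) maximising Youden's J, with 'value > cut-off' as positive."""
    n_case = sum(is_case)
    n_ctrl = len(is_case) - n_case
    best_cut, best_j = None, -1.0
    for cut in sorted(set(values)):
        tp = sum(1 for v, c in zip(values, is_case) if c and v > cut)
        fp = sum(1 for v, c in zip(values, is_case) if not c and v > cut)
        j = tp / n_case - fp / n_ctrl   # sensitivity - (1 - specificity)
        if j > best_j:                  # keep the lowest cut-off among ties
            best_cut, best_j = cut, j
    return best_cut, best_j

# Made-up cumulative doses (mL/kg) and case indicators, for illustration only.
dose = [10, 20, 30, 40, 60, 70, 80, 55]
case = [0, 0, 0, 0, 1, 1, 1, 0]
print(youden_cutoff(dose, case))
```

    Other criteria, such as the point closest to the top-left corner of the ROC plot, can select a different cut-off on the same data.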

    Semiparametric Receiver Operating Characteristic Analysis to Evaluate Biomarkers for Disease

    The receiver operating characteristic (ROC) curve is a popular method for characterizing the accuracy of diagnostic tests when test results are not binary. Various methodologies for estimating and comparing ROC curves have been developed. One approach, due to Pepe, uses a parametric regression model with the baseline function specified up to a finite-dimensional parameter. In this article we extend the regression models by allowing arbitrary nonparametric baseline functions. We also provide asymptotic distribution theory and procedures for making statistical inference. We illustrate our approach with a dataset from a prostate cancer biomarker study. Simulation studies suggest that the extra flexibility inherent in the semiparametric method is gained with little loss in statistical efficiency.

    Semiparametric Estimation of Time-Dependent ROC Curves for Longitudinal Marker Data

    One approach to evaluating the strength of association between a longitudinal marker process and a key clinical event time is through predictive regression methods such as a time-dependent covariate hazard model. For example, a time-varying covariate Cox model specifies the instantaneous risk of the event as a function of the time-varying marker and additional covariates. In this manuscript we explore a second complementary approach which characterizes the distribution of the marker as a function of both the measurement time and the ultimate event time. Our goal is to flexibly extend the standard diagnostic accuracy concepts of sensitivity and specificity to explicitly recognize both the timing of the marker measurement and the timing of disease. The accuracy of a longitudinal marker can be fully characterized using time-dependent receiver operating characteristic (ROC) curves. We detail a semiparametric estimation method for time-dependent ROC curves that adopts a regression quantile approach for longitudinal data introduced by Heagerty and Pepe (1999). We extend the work of Heagerty and Pepe (1999) by developing asymptotic distribution theory for the ROC estimators where the distributional shape for the marker is allowed to depend on covariates. To illustrate our method, we analyze pulmonary function measurements among cystic fibrosis subjects to assemble a case-control study and estimate ROC curves that assess how well the pulmonary function measurement can distinguish subjects that progress to death from subjects that remain alive. Comparing the results from our semiparametric analysis to a fully parametric method discussed by Etzioni and Pepe (1999) suggests that the ability to relax distributional assumptions may be important in practice.

    Injury Severity Score (ISS) vs. ICD-derived Injury Severity Score (ICISS) in a patient population treated in a designated Hong Kong trauma centre

    Trauma and Injury Severity Score (TRISS) has been the benchmark of mortality risk in trauma centers for over 30 years. TRISS utilizes the Injury Severity Score (ISS) as an index of anatomical injury. This study investigated the efficacy of a new type of index of anatomical injury called the ICD-derived Injury Severity Score (ICISS) compared to the ISS using a logistic regression analysis and a global chi-square test of the areas under the Receiver Operator Characteristic (ROC) curves. We found that the empirically derived ICISS performed as well as the consensus-derived ISS, with no statistical differences between their respective areas under the ROC curves.