In two-group discriminant analysis, the Neyman-Pearson Lemma establishes that the receiver operating characteristic (ROC) curve for an arbitrary linear function lies everywhere on or below the ROC curve for the true likelihood ratio. The weighted area between these two curves can be used as a risk function for finding good discriminant functions. The weight function corresponds to the objective of the analysis, for example to minimise the expected cost of misclassification, or to maximise the area under the ROC curve. The resulting discriminant functions can be estimated by iteratively reweighted logistic regression. We investigate some asymptotic properties in the 'near-logistic' setting, where we assume the covariates have been chosen such that a linear function gives a reasonable, but not necessarily exact, approximation to the true log likelihood ratio. Some examples are discussed, including a study of medical diagnosis in breast cytology.
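The ROC dominance property stated above can be illustrated numerically. The following is a minimal sketch under assumptions that are not taken from the paper: a single covariate, class 0 distributed as N(0, 1), class 1 as N(0.5, 2^2), and a uniform weight function, so that the weighted area between the two curves reduces to the difference in area under the ROC curve (AUC). The function names `log_likelihood_ratio` and `auc` are hypothetical helpers introduced only for this illustration.

```python
import math
import random
from bisect import bisect_left


def log_likelihood_ratio(x):
    """True log likelihood ratio for N(0.5, 2^2) (class 1) vs N(0, 1) (class 0).

    The shared constant -0.5 * log(2 * pi) cancels and is omitted.
    """
    log_f1 = -math.log(2.0) - (x - 0.5) ** 2 / 8.0
    log_f0 = -(x ** 2) / 2.0
    return log_f1 - log_f0


def auc(pos_scores, neg_scores):
    """Empirical AUC: fraction of (class 1, class 0) pairs ranked correctly.

    Ties are ignored, which is harmless for continuous scores.
    """
    neg_sorted = sorted(neg_scores)
    wins = sum(bisect_left(neg_sorted, s) for s in pos_scores)
    return wins / (len(pos_scores) * len(neg_scores))


# Simulated data (assumed distributions, fixed seed for reproducibility).
rng = random.Random(0)
n = 5000
x0 = [rng.gauss(0.0, 1.0) for _ in range(n)]  # class 0
x1 = [rng.gauss(0.5, 2.0) for _ in range(n)]  # class 1

# Linear discriminant: with one covariate, any increasing linear
# function a + b*x has the same ROC curve as x itself.
auc_linear = auc(x1, x0)

# Optimal discriminant: the true log likelihood ratio, quadratic in x here.
auc_lr = auc(
    [log_likelihood_ratio(x) for x in x1],
    [log_likelihood_ratio(x) for x in x0],
)

# With a uniform weight, the area between the two ROC curves is the AUC gap.
gap = auc_lr - auc_linear
print(f"AUC linear = {auc_linear:.3f}, AUC true LR = {auc_lr:.3f}, gap = {gap:.3f}")
```

Because the two classes have unequal variances in this sketch, the true log likelihood ratio is quadratic rather than linear, so the linear score is not optimal and the gap is positive. Other weight functions, such as one concentrated at the operating point implied by misclassification costs, would weight the vertical distance between the curves differently and are not shown here.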