Several papers have appeared criticizing the kappa coefficient because of its tendency to fluctuate with sample base rates. The importance of these criticisms is difficult to evaluate because they are presented with regard to a highly specific model of diagnostic decision making. In this article, diagnostic decision making is viewed as a special case of signal detection theory. Each diagnostic process is characterized by a function that relates the probability of a case receiving a positive diagnosis to the severity or salience of symptoms. The shape of this diagnosability curve greatly affects the value of kappa obtained in a study of interrater reliability, how that value changes in response to variation in base rates, and how closely it corresponds to the validity of diagnostic decisions. The common practice of evaluating a diagnostic procedure on the basis of the magnitude of the kappa coefficient observed in a reliability study, when criterion diagnoses for comparison are unavailable, is questionable. New methods for measuring interrater agreement are necessary, and possible directions for research in this area are discussed.

The kappa coefficient (Cohen, 1960) is generally regarded as the statistic of choice for measuring agreement on ratings made on a nominal scale. It is relatively easy to calculate, can be ap
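As a concrete illustration of the base-rate sensitivity discussed above, the following sketch computes Cohen's kappa, defined as (p_o − p_e)/(1 − p_e) where p_o is observed agreement and p_e is chance agreement from the marginals, for two hypothetical raters of fixed, identical accuracy applied to populations with different base rates. The chosen sensitivity and specificity (.90 each) and the assumption that the raters err independently are illustrative assumptions, not part of the article's model.

```python
import numpy as np

def cohen_kappa(table):
    """Cohen's (1960) kappa for a square agreement table of counts."""
    table = np.asarray(table, dtype=float)
    n = table.sum()
    p_obs = np.trace(table) / n  # observed proportion of agreement
    # chance agreement: sum over categories of (row marginal * column marginal)
    p_exp = (table.sum(axis=0) * table.sum(axis=1)).sum() / n**2
    return (p_obs - p_exp) / (1.0 - p_exp)

def expected_table(base_rate, sens=0.90, spec=0.90, n=1000):
    """Expected 2x2 agreement table for two raters with identical accuracy
    (illustrative values) who make independent errors, at a given base rate."""
    pos, neg = base_rate * n, (1 - base_rate) * n
    return np.array([
        [pos * sens * sens + neg * (1 - spec) * (1 - spec),   # both rate +
         pos * sens * (1 - sens) + neg * (1 - spec) * spec],  # A +, B -
        [pos * (1 - sens) * sens + neg * spec * (1 - spec),   # A -, B +
         pos * (1 - sens) * (1 - sens) + neg * spec * spec],  # both rate -
    ])

for rate in (0.50, 0.10, 0.02):
    print(f"base rate {rate:.2f}: kappa = {cohen_kappa(expected_table(rate)):.3f}")
```

Under these assumptions, observed agreement is 82% at every base rate, yet kappa falls from about .64 at a 50% base rate to roughly .39 at 10% and .12 at 2%. This is precisely the fluctuation with sample base rates that the criticisms cited above address.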