The selection of the best classification algorithm for a given dataset is a
very widespread problem, occurring each time one has to choose a classifier to
solve a real-world problem. It is also a complex task with many important
methodological decisions to make. Among those, one of the most crucial is the
choice of an appropriate measure in order to properly assess the classification
performance and rank the algorithms. In this article, we focus on this specific
task. We present the most popular measures and compare their behavior through
discrimination plots. We then discuss their properties from a more theoretical
perspective. It turns out that several of them are equivalent for classifier
comparison purposes. Furthermore, they can also lead to interpretation problems.
Among the numerous measures proposed over the years, it appears that the
classical overall success rate and marginal rates are the most suitable for
the classifier comparison task.
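
As a purely illustrative sketch, the two kinds of measures singled out above can be computed from a confusion matrix as follows. The function names and the example counts are hypothetical, and we assume here that rows index true classes, columns index predicted classes, and that "marginal rates" refers to the per-class ratios of correct predictions to row totals (true positive rate) and to column totals (positive predictive value); the article itself should be consulted for the exact definitions.

```python
def overall_success_rate(cm):
    """Proportion of correctly classified instances (diagonal / total)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def marginal_rates(cm):
    """Per-class rates relative to the row and column marginals.

    Assumption: rows are true classes, columns are predicted classes.
    Returns (tpr, ppv), one value per class in each list.
    """
    k = len(cm)
    row_totals = [sum(cm[i]) for i in range(k)]
    col_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    tpr = [cm[i][i] / row_totals[i] for i in range(k)]  # correct / true count
    ppv = [cm[j][j] / col_totals[j] for j in range(k)]  # correct / predicted count
    return tpr, ppv


# Hypothetical two-class confusion matrix (made-up counts, for illustration only).
cm = [[40, 10],
      [5, 45]]

osr = overall_success_rate(cm)
tpr, ppv = marginal_rates(cm)
```

With these made-up counts, 85 of the 100 instances lie on the diagonal, so the overall success rate is 0.85, while the marginal rates show how that figure is distributed across the two classes.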