How Complex is your classification problem? A survey on measuring classification complexity
Extracting characteristics from the training datasets of classification
problems has proven effective in a number of meta-analyses. Among them,
measures of classification complexity can estimate the difficulty in separating
the data points into their expected classes. Descriptors of the spatial
distribution of the data and estimates of the shape and size of the decision
boundary are among the existing measures for this characterization. This
information can support the formulation of new data-driven pre-processing and
pattern recognition techniques, which can in turn be focused on challenging
characteristics of the problems. This paper surveys and analyzes measures which
can be extracted from the training datasets in order to characterize the
complexity of the respective classification problems. Their use in recent
literature is also reviewed and discussed, which points to opportunities
for future work in the area. Finally, the paper describes an R package
named Extended Complexity Library (ECoL), which implements a set of complexity
measures and is publicly available.
Comment: Survey pape
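As a rough illustration of what a measure "extracted from the training dataset" can look like, the sketch below computes the fraction of points whose nearest neighbor belongs to a different class, a simple neighborhood-based indicator of class overlap. This is an assumption-laden example, not the ECoL implementation: the function name `nn_disagreement_rate`, the Euclidean distance, and the toy datasets are all hypothetical choices made here for illustration.

```python
import math


def nn_disagreement_rate(points, labels):
    """Fraction of points whose nearest other point has a different label.

    Hypothetical helper for illustration; values near 0 suggest well-separated
    classes, values near 1 suggest heavily interleaved classes.
    """
    if len(points) != len(labels) or len(points) < 2:
        raise ValueError("need at least two points, with one label per point")

    mismatches = 0
    for i, p in enumerate(points):
        best_j, best_d = None, math.inf
        for j, q in enumerate(points):
            if i == j:
                continue  # leave the point itself out
            d = math.dist(p, q)  # Euclidean distance (assumed metric)
            if d < best_d:
                best_j, best_d = j, d
        if labels[best_j] != labels[i]:
            mismatches += 1
    return mismatches / len(points)


# Toy data (made up): two tight, distant clusters vs. alternating classes.
separable = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9)]
separable_labels = ["a", "a", "b", "b"]

interleaved = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
interleaved_labels = ["a", "b", "a", "b"]

print(nn_disagreement_rate(separable, separable_labels))
print(nn_disagreement_rate(interleaved, interleaved_labels))
```

The quadratic loop is kept for readability; a larger dataset would likely call for a spatial index or a vectorized distance computation.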