Binary credal classification under sparsity constraints.

Abstract

Binary classification is a well known problem in statistics. Besides classical methods, several techniques such as the naive credal classifier (for categorical data) and imprecise logistic regression (for continuous data) have been proposed to handle sparse data. However, a convincing approach to the classification problem in high dimensional problems (i.e., when the number of attributes is larger than the number of observations) is yet to be explored in the context of imprecise probability. In this article, we propose a sensitivity analysis based on penalised logistic regression scheme that works as binary classifier for high dimensional cases. We use an approach based on a set of likelihood functions (i.e. an imprecise likelihood, if you like), that assigns a set of weights to the attributes, to ensure a robust selection of the important attributes, whilst training the model at the same time, all in one fell swoop. We do a sensitivity analysis on the weights of the penalty term resulting in a set of sparse constraints which helps to identify imprecision in the dataset

    Similar works