Reliable Probabilistic Classification with Neural Networks
Venn Prediction (VP) is a new machine learning framework for producing
well-calibrated probabilistic predictions. In particular it provides
well-calibrated lower and upper bounds for the conditional probability of an
example belonging to each possible class of the problem at hand. This paper
proposes five VP methods based on Neural Networks (NNs), which are among the
most widely used machine learning techniques. The proposed methods are
evaluated experimentally on four benchmark datasets and the obtained results
demonstrate the empirical well-calibratedness of their outputs and their
superiority over the outputs of the traditional NN classifier.
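
To make the idea of lower and upper probability bounds concrete, here is a minimal sketch of an inductive Venn predictor. It is not one of the five methods proposed in the paper. It assumes a simple taxonomy in which each example's category is the label predicted by some already trained classifier. The function name `venn_bounds` and the toy data are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence, Tuple


def venn_bounds(
    calibration: Sequence[Tuple[Hashable, Hashable]],
    test_category: Hashable,
    labels: Sequence[Hashable],
) -> Dict[Hashable, Tuple[float, float]]:
    """Return (lower, upper) probability bounds for each label.

    `calibration` holds (category, true_label) pairs, where the category is
    assumed to be the label predicted by an underlying classifier.
    """
    # Count the true labels of calibration examples in the test example's category.
    counts = Counter(label for cat, label in calibration if cat == test_category)
    total = sum(counts.values()) + 1  # +1 for the test example itself

    bounds = {}
    for k in labels:
        # Try every hypothetical label y for the test example and record the
        # resulting empirical frequency of class k in the category.
        freqs = [(counts[k] + (1 if y == k else 0)) / total for y in labels]
        bounds[k] = (min(freqs), max(freqs))
    return bounds


# Toy calibration set: (predicted label, true label).
calibration = [("pos", "pos"), ("pos", "pos"), ("pos", "neg"), ("neg", "neg")]
bounds = venn_bounds(calibration, test_category="pos", labels=["pos", "neg"])
print(bounds)
```

The gap between the lower and upper bound shrinks as the category containing the test example grows, so larger calibration sets give tighter probability intervals.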
Confidence in Predictions from Random Tree Ensembles
Obtaining an indication of confidence in predictions is desirable for many data mining applications. Predictions complemented with confidence levels can indicate the certainty or extent of reliability that may be associated with the prediction. This can be useful in varied application contexts where model outputs form the basis for potentially costly decisions, and in general across risk-sensitive applications. The conformal prediction framework presents a novel approach for obtaining valid confidence measures associated with predictions from machine learning algorithms. Confidence levels are obtained from the underlying algorithm, using a non-conformity measure which indicates how 'atypical' a given example set is. The non-conformity measure is key to determining the usefulness and efficiency of the approach. This paper considers inductive conformal prediction in the context of random tree ensembles like random forests, which have been noted to perform favorably across problems. Focusing on classification tasks, and considering realistic data contexts including class imbalance, we develop non-conformity measures for assessing the confidence of predicted class labels from random forests. We examine the performance of these measures on multiple datasets. Results demonstrate the usefulness and validity of the measures and their relative differences, and highlight the effectiveness of conformal prediction random forests for obtaining predictions with associated confidence.
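
As a rough illustration of how inductive conformal prediction turns non-conformity scores into confidence, the sketch below computes a p-value for each candidate label from a held-out calibration set. The non-conformity measure used here (one minus the estimated class probability, which for a random forest could be the fraction of trees voting for the class) is an assumption chosen for simplicity and is not necessarily one of the measures developed in the paper. All function names and numbers are hypothetical.

```python
from typing import Dict, Hashable, Sequence, Set


def nonconformity(probs: Dict[Hashable, float], label: Hashable) -> float:
    """Assumed measure: one minus the estimated probability of `label`."""
    return 1.0 - probs.get(label, 0.0)


def icp_p_values(
    calibration_scores: Sequence[float],
    test_probs: Dict[Hashable, float],
) -> Dict[Hashable, float]:
    """Compute a conformal p-value for every candidate label of one test example.

    `calibration_scores` are non-conformity scores of held-out calibration
    examples, each computed with its true label.
    """
    n = len(calibration_scores)
    p_values = {}
    for label in test_probs:
        alpha = nonconformity(test_probs, label)
        # Count calibration examples at least as non-conforming as the test example.
        at_least = sum(1 for a in calibration_scores if a >= alpha)
        p_values[label] = (at_least + 1) / (n + 1)
    return p_values


def prediction_set(p_values: Dict[Hashable, float], significance: float) -> Set[Hashable]:
    """Keep every label whose p-value exceeds the significance level."""
    return {label for label, p in p_values.items() if p > significance}


# Toy calibration scores and class probability estimates for one test example.
calibration_scores = [0.1, 0.2, 0.3, 0.6]
p_values = icp_p_values(calibration_scores, {"a": 0.9, "b": 0.1})
print(p_values)
print(prediction_set(p_values, significance=0.25))
```

A lower significance level yields larger prediction sets. Under the exchangeability assumption, the true label is excluded from the set with probability at most the chosen significance level.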