Hedging predictions in machine learning
Recent advances in machine learning make it possible to design efficient
prediction algorithms for data sets with huge numbers of parameters. This paper
describes a new technique for "hedging" the predictions output by many such
algorithms, including support vector machines, kernel ridge regression, kernel
nearest neighbours, and by many other state-of-the-art methods. The hedged
predictions for the labels of new objects include quantitative measures of
their own accuracy and reliability. These measures are provably valid under the
assumption of randomness, traditional in machine learning: the objects and
their labels are assumed to be generated independently from the same
probability distribution. In particular, it becomes possible to control (up to
statistical fluctuations) the number of erroneous predictions by selecting a
suitable confidence level. Validity being achieved automatically, the remaining
goal of hedged prediction is efficiency: taking full account of the new
objects' features and other available information to produce as accurate
predictions as possible. This can be done successfully using the powerful
machinery of modern machine learning.
Comment: 24 pages; 9 figures; 2 tables; a version of this paper (with discussion and rejoinder) is to appear in "The Computer Journal".
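The hedging mechanism the abstract describes can be illustrated with a minimal inductive (split) conformal classifier: a sketch under simplifying assumptions (1-D toy data, a nearest-neighbour nonconformity score), not the paper's exact algorithm. Calibration scores yield a p-value for each candidate label of a new object, and labels whose p-value exceeds the significance level form the hedged prediction set.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two well-separated 1-D classes, split into training and calibration halves.
x = np.concatenate([rng.normal(0, 1, 100), rng.normal(6, 1, 100)])
y = np.array([0] * 100 + [1] * 100)
idx = rng.permutation(200)
train, cal = idx[:100], idx[100:]

def nonconformity(xi, yi):
    """How strange (xi, yi) looks: distance to the nearest training point with the same label."""
    same = x[train][y[train] == yi]
    return np.min(np.abs(same - xi))

# Scores of the calibration examples with their true labels.
cal_scores = np.array([nonconformity(x[i], y[i]) for i in cal])

def p_value(x_new, label):
    """Fraction of calibration examples at least as strange as (x_new, label)."""
    s = nonconformity(x_new, label)
    return (np.sum(cal_scores >= s) + 1) / (len(cal_scores) + 1)

eps = 0.05  # significance level: at most ~5% errors in the long run
x_new = 0.2
pred_set = [lab for lab in (0, 1) if p_value(x_new, lab) > eps]
```

Under the randomness assumption, the long-run frequency of prediction sets missing the true label is at most `eps`, whatever the underlying model.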
Multi Split Conformal Prediction
Split conformal prediction is a computationally efficient method for
performing distribution-free predictive inference in regression. It involves,
however, a one-time random split of the data, and the result depends on the
particular split. To address this problem, we propose multi split conformal
prediction, a simple method based on Markov's inequality to aggregate single
split conformal prediction intervals across multiple splits.
Comment: 12 pages, 1 figure, 2 tables.
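The split-dependence problem the abstract addresses is easy to demonstrate. Below is a sketch of plain split conformal regression on toy data (the illustrative "model" is a least-squares slope through the origin, an assumption of this example); running it with two different random splits produces two different intervals, which is what the multi-split aggregation is designed to remedy.

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy regression data: y = 2x + unit Gaussian noise.
x = rng.uniform(0, 10, 200)
y = 2 * x + rng.normal(0, 1, 200)

def split_conformal_interval(x, y, x_new, alpha=0.1, seed=0):
    """Plain split conformal: fit on one half, calibrate residuals on the other."""
    r = np.random.default_rng(seed)
    idx = r.permutation(len(x))
    tr, ca = idx[:100], idx[100:]
    # Illustrative model: least-squares slope through the origin.
    beta = np.sum(x[tr] * y[tr]) / np.sum(x[tr] ** 2)
    resid = np.abs(y[ca] - beta * x[ca])
    # (1 - alpha) conformal quantile of the calibration residuals.
    k = int(np.ceil((len(ca) + 1) * (1 - alpha)))
    q = np.sort(resid)[min(k, len(ca)) - 1]
    return beta * x_new - q, beta * x_new + q

# The interval at the same test point depends on the random split:
lo1, hi1 = split_conformal_interval(x, y, 5.0, seed=0)
lo2, hi2 = split_conformal_interval(x, y, 5.0, seed=1)
```

Each single-split interval is marginally valid on its own; the paper's contribution is aggregating many of them (via Markov's inequality) so the result no longer hinges on one arbitrary split.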
Assessment of Stroke Risk Based on Morphological Ultrasound Image Analysis With Conformal Prediction
Non-invasive ultrasound imaging of carotid plaques allows for the development of plaque image analysis in order to assess the risk of stroke. In our work, we provide reliable confidence measures for the assessment of stroke risk, using the Conformal Prediction framework. This framework provides a way for assigning valid confidence measures to predictions of classical machine learning algorithms. We conduct experiments on a dataset that contains morphological features derived from ultrasound images of atherosclerotic carotid plaques, and we evaluate the results of four different Conformal Predictors (CPs). The four CPs are based on Artificial Neural Networks (ANNs), Support Vector Machines (SVMs), Naive Bayes classification (NBC), and k-Nearest Neighbours (k-NN). The results given by all CPs demonstrate the reliability and usefulness of the obtained confidence measures on the problem of stroke risk assessment.
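The per-object confidence measures mentioned here are conventionally summarized from conformal p-values as two numbers: credibility (the largest p-value, how typical the object looks under its best label) and confidence (one minus the second-largest, how clearly that label beats the runner-up). A small sketch of this standard summary:

```python
import numpy as np

def confidence_and_credibility(p_values):
    """Summarize a test object's per-label conformal p-values.
    credibility = largest p-value; confidence = 1 - second-largest."""
    p = np.sort(np.asarray(p_values, dtype=float))[::-1]
    return 1.0 - p[1], p[0]

# Example: three candidate labels; the first clearly dominates.
conf, cred = confidence_and_credibility([0.62, 0.03, 0.01])
```

Low credibility flags an object unlike the training data; low confidence flags genuine ambiguity between labels, regardless of which underlying classifier (ANN, SVM, NBC, or k-NN) produced the p-values.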
Confidence Calibration for Systems with Cascaded Predictive Modules
Existing conformal prediction algorithms estimate prediction intervals at
target confidence levels to characterize the performance of a regression model
on new test samples. However, considering an autonomous system consisting of
multiple modules, prediction intervals constructed for individual modules fall
short of accommodating uncertainty propagation over different modules and thus
cannot provide reliable predictions on system behavior. We address this
limitation and present novel solutions based on conformal prediction to provide
prediction intervals calibrated for a predictive system consisting of cascaded
modules (e.g., an upstream feature extraction module and a downstream
regression module). Our key idea is to leverage module-level validation data to
characterize the system-level error distribution without direct access to
end-to-end validation data. We provide theoretical justification and empirical
experimental results to demonstrate the effectiveness of proposed solutions. In
comparison to prediction intervals calibrated for individual modules, our
solutions generate improved intervals with more accurate performance guarantees
for system predictions, which are demonstrated on both synthetic systems and
real-world systems performing overlap prediction for indoor navigation using
the Matterport3D dataset.
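The key idea, characterizing the system-level error distribution from module-level validation data alone, can be caricatured in a few lines. This is only a crude sketch of the spirit of the approach (it combines independently drawn module-level errors, and the independence and additive-error structure are assumptions of this example), not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(2)
# Module-level validation errors; no end-to-end validation pairs are assumed.
upstream_err = rng.normal(0, 0.5, 300)    # e.g. feature-extraction error proxy
downstream_err = rng.normal(0, 1.0, 300)  # e.g. regression residuals

# Approximate the system-level error distribution by summing independently
# resampled module-level errors (independence is an assumption here).
combined = rng.choice(upstream_err, 1000) + rng.choice(downstream_err, 1000)
q = np.quantile(np.abs(combined), 0.9)  # system-level 90% error bound
# A system-level interval at input x would then be [pred(x) - q, pred(x) + q].
```

Calibrating on either module's errors alone would understate the spread of `combined`, which is why per-module intervals fall short for system predictions.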
Least Ambiguous Set-Valued Classifiers with Bounded Error Levels
In most classification tasks there are observations that are ambiguous and
therefore difficult to correctly label. Set-valued classifiers output sets of
plausible labels rather than a single label, thereby giving a more appropriate
and informative treatment to the labeling of ambiguous instances. We introduce
a framework for multiclass set-valued classification, where the classifiers
guarantee user-defined levels of coverage or confidence (the probability that
the true label is contained in the set) while minimizing the ambiguity (the
expected size of the output). We first derive oracle classifiers assuming the
true distribution to be known. We show that the oracle classifiers are obtained
from level sets of the functions that define the conditional probability of
each class. Then we develop estimators with good asymptotic and finite sample
properties. The proposed estimators build on existing single-label classifiers.
The optimal classifier can sometimes output the empty set, but we provide two
solutions to fix this issue that are suitable for various practical needs.
Comment: Final version to be published in the Journal of the American Statistical Association at https://www.tandfonline.com/doi/abs/10.1080/01621459.2017.1395341?journalCode=uasa2
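The oracle construction from level sets of the conditional class probabilities can be sketched directly for a toy case where the true distribution is known (two unit-variance Gaussian classes with equal priors; the means and threshold below are assumptions of this example): keep every label whose conditional probability exceeds a threshold t, so lowering t enlarges the sets and raises coverage at the price of ambiguity.

```python
from math import exp

def cond_prob(x, mu0=0.0, mu1=3.0):
    """P(label | x) for two unit-variance 1-D Gaussian classes, equal priors."""
    d0 = exp(-0.5 * (x - mu0) ** 2)
    d1 = exp(-0.5 * (x - mu1) ** 2)
    return d0 / (d0 + d1), d1 / (d0 + d1)

def oracle_set(x, t):
    """Level-set rule: the output set keeps each label with P(label | x) >= t."""
    p0, p1 = cond_prob(x)
    return [lab for lab, p in ((0, p0), (1, p1)) if p >= t]
```

Near a class mean the set is a singleton; midway between the means both probabilities equal 0.5 and the set contains both labels, which is exactly the informative treatment of ambiguous instances the abstract describes.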
Using nondeterministic learners to alert on coffee rust disease
Motivated by an agriculture case study, we discuss how to learn functions able to predict whether the value of a continuous target variable will be greater than a given threshold. In the application studied, the aim was to alert on high incidences of coffee rust, the main coffee crop disease in the world. The objective is to use chemical prevention of the disease only when necessary, in order to obtain healthier, higher-quality products and reductions in costs and environmental impact. In this context, the costs of misclassifications are not symmetrical: false negative predictions may lead to the loss of coffee crops. The baseline approach for this problem is to learn a regressor from the variables that record the factors affecting the appearance and growth of the disease. However, the number of errors is too high to obtain a reliable alarm system. The approaches explored here try to learn hypotheses whose predictions are allowed to return intervals rather than single points. Thus, in addition to alarms and non-alarms, these predictors identify situations with uncertain classification, which we call warnings. We present three different implementations: one based on regression and two based on classifiers. These methods are compared using a framework where the costs of false negatives are higher than those of false positives, and both are higher than the cost of warning predictions.
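The decision rule that turns interval predictions into the three outcomes above has a direct rendering (a sketch; the function name and threshold convention are this example's assumptions, not the paper's notation): alarm when the whole interval sits above the incidence threshold, no alarm when it sits wholly below, and a warning when the interval straddles it.

```python
def alert_status(lo, hi, threshold):
    """Map an interval prediction [lo, hi] for disease incidence to an action.
    Straddling the threshold signals uncertain classification: a warning."""
    if lo > threshold:
        return "alarm"
    if hi < threshold:
        return "no alarm"
    return "warning"
```

Because a false negative (a missed outbreak) costs more than a false positive, and a warning costs less than either error, widening the intervals trades raw accuracy for cheaper warnings in exactly the asymmetric-cost framework the abstract describes.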