Combining Prediction Intervals on Multi-Source Non-Disclosed Regression Datasets
Conformal Prediction is a framework that produces prediction intervals based
on the output from a machine learning algorithm. In this paper we explore the
case when training data is made up of multiple parts available in different
sources that cannot be pooled. We consider the regression case and propose
a method where a conformal predictor is trained on each data source
independently, and where the prediction intervals are then combined into a
single interval. We call the approach Non-Disclosed Conformal Prediction
(NDCP), and we evaluate it on a regression dataset from the UCI machine
learning repository using support vector regression as the underlying machine
learning algorithm, with varying numbers and sizes of data sources. The results
show that the proposed method produces conservatively valid prediction
intervals, and while we cannot retain the same efficiency as when all data is
used, efficiency is improved through the proposed approach as compared to
predicting using a single arbitrarily chosen source.
Comment: Accepted to 8th Symposium on Conformal and Probabilistic Prediction
with Applications, Golden Sands, Bulgaria, 201
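
The workflow described in the abstract (one conformal predictor per source, with only the intervals shared and combined) can be illustrated with a minimal sketch. Several things here are assumptions and not taken from the abstract: the combination rule (median of the lower and of the upper endpoints), the use of split (inductive) conformal prediction, the synthetic data, and all function names. A simple least-squares line stands in for support vector regression so that the example needs only the Python standard library.

```python
import math
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in for SVR)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def train_source(xs, ys, alpha=0.1):
    """Split conformal predictor fitted on a single source's data only."""
    half = len(xs) // 2
    a, b = fit_line(xs[:half], ys[:half])
    # Nonconformity scores: absolute residuals on the calibration half.
    scores = sorted(abs(y - (a + b * x)) for x, y in zip(xs[half:], ys[half:]))
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))  # rank of the conformal quantile
    q = scores[k - 1] if k <= n else math.inf
    return lambda x: (a + b * x - q, a + b * x + q)


def combine(intervals):
    """Assumed aggregation rule: median of lower and of upper endpoints."""
    lows, highs = zip(*intervals)
    return statistics.median(lows), statistics.median(highs)


def make_source(rng, n):
    """Hypothetical synthetic source: y = 2x + 1 plus Gaussian noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0, 1) for x in xs]
    return xs, ys


rng = random.Random(0)
# Each source trains its own predictor; raw data never leaves the source.
predictors = [train_source(*make_source(rng, 200)) for _ in range(3)]

test_xs, test_ys = make_source(rng, 500)
combined = [combine([p(x) for p in predictors]) for x in test_xs]
coverage = sum(lo <= y <= hi for (lo, hi), y in zip(combined, test_ys)) / len(test_ys)
print(f"empirical coverage of combined intervals: {coverage:.3f}")
```

Only the interval endpoints cross source boundaries in this sketch, which mirrors the non-disclosed setting. The median rule is one plausible choice for illustration; it is not claimed here to be the paper's rule or to carry its validity guarantee.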