Guaranteed Coverage Prediction Intervals with Gaussian Process Regression
Gaussian Process Regression (GPR) is a popular regression method which, unlike most Machine Learning techniques, provides uncertainty estimates for its predictions. These uncertainty estimates, however, rest on the assumption that the model is well-specified, an assumption that is violated in most practical applications, since the required knowledge is rarely available. As a result, the produced uncertainty estimates can be very misleading; for example, the prediction intervals (PIs) produced for the 95\% confidence level may cover much less than 95\% of the true labels. To address this issue, this paper introduces an extension of GPR based on a Machine Learning framework called Conformal Prediction (CP). This extension guarantees the production of PIs with the required coverage even when the model is completely misspecified. The proposed approach combines the advantages of GPR with the valid coverage guarantee of CP, and the reported experimental results demonstrate its superiority over existing methods.
Comment: 12 pages. This work has been submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
Conformal off-policy prediction
Off-policy evaluation is critical in a number of applications where new policies need to be evaluated offline before online deployment. Most existing methods focus on the expected return, define the target parameter through averaging, and provide only a point estimator. In this paper, we develop a novel procedure to produce reliable interval estimators for a target policy's return starting from any initial state. Our proposal accounts for the variability of the return around its expectation, focuses on the individual effect, and offers valid uncertainty quantification. Our main idea lies in designing a pseudo policy that generates subsamples as if they were sampled from the target policy, so that existing conformal prediction algorithms are applicable to prediction interval construction. Our methods are justified theoretically and validated on synthetic data and real data from short-video platforms.
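The pseudo-policy idea can be sketched in a toy contextual-bandit setting: when the behavior policy is known and the target policy is deterministic, keeping only the logged transitions whose action agrees with the target policy yields a subsample distributed as if the target policy had generated it, after which an ordinary split conformal interval on the retained returns applies. Everything here (the synthetic environment, the uniform behavior policy, the rejection step) is an illustrative assumption, not the paper's full procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

# Logged data: state, action from a uniform behavior policy, observed return.
n = 5000
states = rng.uniform(0, 1, n)
actions = rng.integers(0, 2, n)
returns = states * (actions == 1) + 0.1 * rng.standard_normal(n)

# A deterministic target policy we want to evaluate offline.
target = lambda s: (s > 0.5).astype(int)

# Pseudo-policy subsampling (sketch): retain transitions whose logged action
# matches the target policy, so retained returns look like target-policy draws.
keep = actions == target(states)
sub = returns[keep]

# Split conformal on the subsample: calibration quantiles give a return interval.
alpha = 0.1
cal, test = sub[: len(sub) // 2], sub[len(sub) // 2 :]
lo = np.quantile(cal, alpha / 2)
hi = np.quantile(cal, 1 - alpha / 2)
cov = np.mean((test >= lo) & (test <= hi))
print(f"interval [{lo:.2f}, {hi:.2f}], coverage {cov:.2f}")
```

The interval quantifies the variability of the return itself (the individual effect), not just the uncertainty of its mean, which is exactly the distinction the abstract draws against point-estimator methods.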
A review of probabilistic forecasting and prediction with machine learning
Predictions and forecasts of machine learning models should take the form of probability distributions, aiming to increase the quantity of information communicated to end users. Although applications of probabilistic prediction and forecasting with machine learning models in academia and industry are becoming more frequent, related concepts and methods have not been formalized and structured under a holistic view of the entire field. Here, we review the topic of predictive uncertainty estimation with machine learning algorithms, as well as the related metrics (consistent scoring functions and proper scoring rules) for assessing probabilistic predictions. The review covers a time period spanning from the introduction of early statistical approaches (linear regression and time series models, based on Bayesian statistics or quantile regression) to recent machine learning algorithms (including generalized additive models for location, scale and shape, random forests, boosting and deep learning algorithms) that are more flexible by nature. This review of the field's progress expedites our understanding of how to develop new algorithms tailored to users' needs, since the latest advancements are based on a few fundamental concepts applied to more complex algorithms. We conclude by classifying the material and discussing challenges that are becoming a hot topic of research.
Comment: 83 pages, 5 figures.
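One of the consistent scoring functions mentioned above is easy to make concrete: the pinball (quantile) loss, which is minimized in expectation by the true conditional quantile and therefore rewards honest quantile forecasts. The sketch below checks this on simulated standard-normal data; the sample and the comparison value are illustrative assumptions.

```python
import numpy as np

def pinball_loss(y, q_pred, tau):
    """Pinball loss: a consistent scoring function for the tau-quantile.

    Under-predictions are penalized with weight tau, over-predictions
    with weight (1 - tau), so the expected loss is minimized at the
    true tau-quantile of y.
    """
    diff = y - q_pred
    return np.mean(np.maximum(tau * diff, (tau - 1) * diff))

rng = np.random.default_rng(2)
y = rng.standard_normal(100_000)
tau = 0.9
true_q = 1.2816  # approximate 0.9-quantile of N(0, 1)

# Forecasting the true quantile scores better than a miscalibrated constant.
loss_true = pinball_loss(y, true_q, tau)
loss_wrong = pinball_loss(y, 0.0, tau)
print(f"loss at true quantile: {loss_true:.4f}, at 0.0: {loss_wrong:.4f}")
```

Averaging pinball losses over a grid of tau values approximates the CRPS, one of the proper scoring rules the review covers for full predictive distributions.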
Conformal Sensitivity Analysis for Individual Treatment Effects
Estimating an individual treatment effect (ITE) is essential to personalized decision making. However, existing methods for estimating the ITE often rely on unconfoundedness, an assumption that is fundamentally untestable with observed data. To assess the robustness of individual-level causal conclusions to violations of unconfoundedness, this paper proposes a method for sensitivity analysis of the ITE, a way to estimate a range of the ITE under unobserved confounding. The method we develop quantifies unmeasured confounding through a marginal sensitivity model [Ros2002, Tan2006] and adapts the framework of conformal inference to estimate an ITE interval at a given confounding strength. In particular, we formulate this sensitivity analysis problem as a conformal inference problem under distribution shift, and we extend existing methods of covariate-shifted conformal inference to this more general setting. The result is a predictive interval with guaranteed nominal coverage of the ITE, a method that provides coverage with distribution-free and nonasymptotic guarantees. We evaluate the method on synthetic data and illustrate its application in an observational study.
Comment: Journal of the American Statistical Association.
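The covariate-shifted conformal inference that the paper extends can be sketched in its simplest weighted form: calibration scores are reweighted by the likelihood ratio between the test and calibration covariate distributions, and an extra point mass at infinity carries the test point's own weight. In this toy example the regression model and the likelihood ratio are known by construction; the paper's sensitivity-analysis setting replaces the exact ratio with bounds from the marginal sensitivity model.

```python
import numpy as np

def weighted_quantile(scores, weights, q):
    """q-th quantile of the discrete distribution putting `weights` on `scores`."""
    order = np.argsort(scores)
    s, w = scores[order], weights[order]
    cdf = np.cumsum(w) / w.sum()
    return s[np.searchsorted(cdf, q)]

rng = np.random.default_rng(3)

# Calibration data under the source covariate distribution N(0, 1).
x_cal = rng.normal(0, 1, 2000)
y_cal = 2 * x_cal + rng.standard_normal(2000)
scores = np.abs(y_cal - 2 * x_cal)  # residual scores for a (known) model

# Covariate shift: the test covariate comes from N(1, 1).
x_new = rng.normal(1, 1)
# Likelihood ratio dP_test/dP_cal, known analytically in this toy setup.
lr = lambda x: np.exp(-0.5 * (x - 1) ** 2) / np.exp(-0.5 * x**2)

# Weighted conformal: augment with an infinite score carrying the test weight.
alpha = 0.1
s_aug = np.append(scores, np.inf)
w_aug = np.append(lr(x_cal), lr(x_new))
qhat = weighted_quantile(s_aug, w_aug, 1 - alpha)
interval = (2 * x_new - qhat, 2 * x_new + qhat)
print(f"90% interval at x={x_new:.2f}: [{interval[0]:.2f}, {interval[1]:.2f}]")
```

The augmented infinite score is the standard conservative device that makes the weighted quantile valid for the as-yet-unseen test outcome; the paper's contribution is handling the more general shifts induced by unobserved confounding.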