Uncertainty Intervals for Prediction Errors in Time Series Forecasting
Inference for prediction errors is critical in time series forecasting
pipelines. However, providing statistically meaningful uncertainty intervals
for prediction errors remains relatively under-explored. Practitioners often
resort to forward cross-validation (FCV) to obtain point estimates and
construct confidence intervals based on the Central Limit Theorem (CLT). The
naive version assumes independence, an assumption that temporal correlation
usually invalidates. These approaches lack statistical interpretation and
theoretical justification even under stationarity.
This paper systematically investigates uncertainty intervals for prediction
errors in time series forecasting. We first distinguish two key inferential
targets: the stochastic test error over near future data points, and the
expected test error as the expectation of the former. The stochastic test error
is often more relevant in applications needing to quantify uncertainty over
individual time series instances. To construct prediction intervals for the
stochastic test error, we propose the quantile-based forward cross-validation
(QFCV) method. Under an ergodicity assumption, QFCV intervals have
asymptotically valid coverage and are shorter than marginal empirical
quantiles. In addition, we illustrate why naive CLT-based FCV intervals
fail to achieve valid coverage, even with certain corrections. For
non-stationary time series, we further provide rolling intervals by combining
QFCV with adaptive conformal prediction to give time-average coverage
guarantees. Overall, we advocate the use of QFCV procedures and demonstrate
their coverage and efficiency through simulations and real data examples.
Comment: 35 pages, 17 figures
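The forward-split-plus-quantiles idea can be sketched in a few lines. This is a hypothetical simplification, not the paper's QFCV procedure: it collects forecast errors from successive forward splits and uses their empirical quantiles as an interval for the error on the next window. The function names and the naive last-value forecaster are illustrative assumptions.

```python
import numpy as np

def forward_error_quantiles(y, fit_predict, window, horizon, alpha=0.1):
    """Collect forecast errors from successive forward splits and return
    empirical quantiles of those errors as a (1 - alpha) interval for the
    error on the next horizon. A simplified stand-in for QFCV."""
    errors = []
    for start in range(len(y) - window - horizon + 1):
        train = y[start:start + window]
        test = y[start + window:start + window + horizon]
        pred = fit_predict(train, horizon)
        errors.append(np.mean((test - pred) ** 2))  # squared forecast error
    lo, hi = np.quantile(errors, [alpha / 2, 1 - alpha / 2])
    return lo, hi

# Example: random-walk-like series, naive last-value forecast.
rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(size=500)) * 0.1
naive = lambda train, h: np.repeat(train[-1], h)
lo, hi = forward_error_quantiles(y, naive, window=50, horizon=5)
```

Note that plain marginal quantiles like these ignore the conditioning on recent history that makes the paper's intervals shorter; they only illustrate the forward-split bookkeeping.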
The out-of-sample R²: estimation and inference
Out-of-sample prediction is the acid test of predictive models, yet an
independent test dataset is often not available for assessment of the
prediction error. For this reason, out-of-sample performance is commonly
estimated using data splitting algorithms such as cross-validation or the
bootstrap. For quantitative outcomes, the ratio of variance explained to total
variance can be summarized by the coefficient of determination, or in-sample
R², which is easy to interpret and to compare across different outcome
variables. As opposed to the in-sample R², the out-of-sample R² has not
been well defined, and the variability of the out-of-sample R² has been
largely ignored. Usually only its point estimate is reported, hampering formal
comparison of the predictability of different outcome variables. Here we
explicitly define the out-of-sample R² as a comparison of two predictive
models, provide an unbiased estimator, and exploit recent theoretical advances
on the uncertainty of data splitting estimates to provide a standard error for
the R². The performance of the estimators for the R² and its standard
error is investigated in a simulation study. We demonstrate our new method by
constructing confidence intervals and comparing models for prediction of
quantitative phenotypes based on gene expression data.
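The definition of the out-of-sample R² as a comparison of two predictive models can be sketched as one minus the ratio of the candidate model's squared prediction error to that of a baseline (e.g. the training-set mean), evaluated on held-out data. This is a minimal illustration of the quantity, not the paper's estimator or its standard error; the no-intercept least-squares fit is an assumption for brevity.

```python
import numpy as np

def out_of_sample_r2(y_true, y_pred, y_pred_baseline):
    """Out-of-sample R²: one minus the ratio of the candidate model's
    squared prediction error to that of a baseline model."""
    sse_model = np.sum((y_true - y_pred) ** 2)
    sse_base = np.sum((y_true - y_pred_baseline) ** 2)
    return 1.0 - sse_model / sse_base

# Example with a held-out split: the baseline predicts the training mean.
rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = 2.0 * x + rng.normal(size=200)
x_tr, x_te, y_tr, y_te = x[:150], x[150:], y[:150], y[150:]
slope = np.sum(x_tr * y_tr) / np.sum(x_tr ** 2)  # least squares, no intercept
r2 = out_of_sample_r2(y_te, slope * x_te, np.full_like(y_te, y_tr.mean()))
```

A single split like this yields only a point estimate; the abstract's contribution is precisely the unbiased estimator and standard error that such a point estimate lacks.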
Calibration and improved prediction of computer models by universal Kriging
This paper addresses the use of experimental data for calibrating a computer
model and improving its predictions of the underlying physical system. A global
statistical approach is proposed in which the bias between the computer model
and the physical system is modeled as a realization of a Gaussian process. The
application of classical statistical inference to this statistical model yields
a rigorous method for calibrating the computer model and for adding to its
predictions a statistical correction based on experimental data. This
statistical correction can substantially improve the calibrated computer model
for predicting the physical system under new experimental conditions. Furthermore,
a quantification of the uncertainty of this prediction is provided. Physical
expertise on the calibration parameters can also be taken into account in a
Bayesian framework. Finally, the method is applied to the thermal-hydraulic
code FLICA 4 in a single-phase friction model framework, where it
significantly improves the code's predictions.
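The bias-correction idea can be illustrated with a toy sketch. This is not the paper's universal-Kriging method: it fits a Gaussian-process posterior mean (RBF kernel, fixed hyperparameters) to the residuals between experimental data and a computer model, then adds the predicted bias to the model output. The toy "computer model" and "truth" functions are invented for the example.

```python
import numpy as np

def rbf_kernel(a, b, length=1.0):
    """Squared-exponential kernel between two 1-D input arrays."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / length ** 2)

def gp_bias_correction(x_exp, y_exp, model, x_new, noise=1e-2):
    """Model the discrepancy between experiments and the computer model
    as a zero-mean GP; return bias-corrected predictions at x_new."""
    resid = y_exp - model(x_exp)                       # observed bias
    K = rbf_kernel(x_exp, x_exp) + noise * np.eye(len(x_exp))
    alpha = np.linalg.solve(K, resid)
    bias = rbf_kernel(x_new, x_exp) @ alpha            # GP posterior mean
    return model(x_new) + bias

# Toy computer model that is biased low; experiments reveal the offset.
model = lambda x: np.sin(x)
truth = lambda x: np.sin(x) + 0.3
x_exp = np.linspace(0, 3, 10)
x_new = np.linspace(0, 3, 50)
y_corr = gp_bias_correction(x_exp, truth(x_exp), model, x_new)
```

In the abstract's setting the GP hyperparameters and the calibration parameters of the computer model would be estimated jointly from the experimental data, rather than fixed as here.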