2 research outputs found
Integrating Uncertainty Awareness into Conformalized Quantile Regression
Conformalized Quantile Regression (CQR) is a recently proposed method for
constructing prediction intervals for a response Y given covariates X,
without making distributional assumptions. However, existing constructions of
CQR can be ineffective for problems where the quantile regressors perform
better in certain parts of the feature space than others. The reason is that
the prediction intervals of CQR do not distinguish between two forms of
uncertainty: first, the variability of the conditional distribution of Y
given X (i.e., aleatoric uncertainty), and second, our uncertainty in
estimating this conditional distribution (i.e., epistemic uncertainty). This
can lead to intervals that are overly narrow in regions where epistemic
uncertainty is high. To address this, we propose a new variant of the CQR
methodology, Uncertainty-Aware CQR (UACQR), that explicitly separates these two
sources of uncertainty to adjust quantile regressors differentially across the
feature space. Compared to CQR, our methods enjoy the same distribution-free
theoretical coverage guarantees, while demonstrating in our experiments
stronger conditional coverage properties in simulated settings and real-world
data sets alike.
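
For orientation, the baseline that the abstract builds on can be sketched in a few lines. The block below is a minimal illustration of the calibration step of standard split CQR, not of UACQR, whose construction the abstract does not spell out. It assumes that lower and upper quantile predictions on a held-out calibration set are already available from some fitted quantile regressors; the function names `cqr_offset` and `cqr_interval` are hypothetical and chosen only for this example.

```python
import math


def cqr_offset(lo_cal, hi_cal, y_cal, alpha):
    """Conformal offset for split CQR at miscoverage level alpha.

    lo_cal, hi_cal: lower/upper quantile predictions on the calibration set
    y_cal: observed responses on the calibration set
    """
    # Conformity score: positive when y falls outside [lo, hi],
    # negative when y lies inside the predicted band.
    scores = sorted(
        max(lo - y, y - hi) for lo, hi, y in zip(lo_cal, hi_cal, y_cal)
    )
    n = len(scores)
    # Rank of the finite-sample-corrected (1 - alpha) empirical quantile.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: the interval is unbounded.
        return math.inf
    return scores[k - 1]


def cqr_interval(lo, hi, q):
    """Widen (or shrink, if q < 0) a predicted band by the same offset q on both sides."""
    return lo - q, hi + q
```

Note that the same offset is applied at every point of the feature space, which is the behavior the abstract identifies as a limitation when the quantile regressors are more accurate in some regions than in others.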
Beyond Ensemble Averages: Leveraging Climate Model Ensembles for Subseasonal Forecasting
Producing high-quality forecasts of key climate variables such as temperature
and precipitation on subseasonal time scales has long been a gap in operational
forecasting. Recent studies have shown promising results using machine learning
(ML) models to advance subseasonal forecasting (SSF), but several open
questions remain. First, several past approaches use the average of an ensemble
of physics-based forecasts as an input feature of these models. However,
ensemble forecasts contain information that can aid prediction beyond only the
ensemble mean. Second, past methods have focused on average performance,
whereas forecasts of extreme events are far more important for planning and
mitigation purposes. Third, climate forecasts correspond to a spatially-varying
collection of forecasts, and different methods account for spatial variability
in the response differently. Trade-offs between different approaches may be
mitigated with model stacking. This paper describes the application of a
variety of ML methods used to predict monthly average precipitation and
two-meter temperature using physics-based predictions (ensemble forecasts) and
observational data such as relative humidity, pressure at sea level, or
geopotential height, two weeks in advance for the whole continental United
States. Regression, quantile regression, and tercile classification tasks using
linear models, random forests, convolutional neural networks, and stacked
models are considered. The proposed models outperform common baselines such as
historical averages (or quantiles) and ensemble averages (or quantiles). This
paper further includes an investigation of feature importance, trade-offs
between using the full ensemble or only the ensemble average, and different
modes of accounting for spatial variability.
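
Two of the task types named in this abstract, quantile regression and tercile classification, can be made concrete with a short sketch. The block below shows the pinball (quantile) loss that quantile regression typically minimizes, and one plausible way to turn a value into a tercile class using cut points from a historical record. It is an illustration under stated assumptions, not the paper's pipeline: the function names `pinball_loss` and `tercile_label` are hypothetical, and the cut points rely on the default method of Python's `statistics.quantiles`.

```python
import statistics


def pinball_loss(y_true, y_pred, tau):
    """Pinball loss of a predicted tau-quantile y_pred against the observation y_true."""
    diff = y_true - y_pred
    # Under-prediction is weighted by tau, over-prediction by (1 - tau).
    return tau * diff if diff >= 0 else (tau - 1) * diff


def tercile_label(value, history):
    """Return 0 (below normal), 1 (near normal), or 2 (above normal).

    history: past observations used as the climatological reference (assumed input).
    """
    # Two cut points that split the historical record into three groups.
    lower, upper = statistics.quantiles(history, n=3)
    if value < lower:
        return 0
    if value <= upper:
        return 1
    return 2
```

A historical-quantile baseline of the kind mentioned in the abstract would predict the climatological tercile or quantile directly; a learned model is then judged by whether it achieves lower pinball loss or better tercile accuracy than that baseline.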