11 research outputs found
A comparison of learning rate selection methods in generalized Bayesian inference
Generalized Bayes posterior distributions are formed by putting a fractional
power on the likelihood before combining with the prior via Bayes's formula.
This fractional power, which is often viewed as a remedy for potential model
misspecification bias, is called the learning rate, and a number of data-driven
learning rate selection methods have been proposed in the recent literature.
Each of these proposals has a different focus, a different target they aim to
achieve, which makes them difficult to compare. In this paper, we provide a
direct head-to-head comparison of these learning rate selection methods in
various misspecified model scenarios, in terms of several relevant metrics, in
particular, coverage probability of the generalized Bayes credible regions. In
some examples all the methods perform well, while in others the
misspecification is too severe to be overcome, but we find that the so-called
generalized posterior calibration algorithm tends to outperform the others in
terms of credible region coverage probability.
Comment: 22 pages, 2 figures, 4 tables
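To make the role of the learning rate concrete, here is a minimal sketch in a setting chosen purely for illustration (it is not taken from the paper): a normal model with known variance and a conjugate normal prior, where raising the likelihood to a power `eta` has a closed form. The function name and all parameter names are hypothetical.

```python
import math

def generalized_normal_posterior(data, sigma2, prior_mean, prior_var, eta):
    """Generalized Bayes posterior for a normal mean with known variance.

    The likelihood is raised to the power eta (the learning rate) before
    being combined with a N(prior_mean, prior_var) prior, so the data
    contribute eta * n / sigma2 to the posterior precision.
    Illustrative assumption: conjugate normal setting, not the paper's examples.
    """
    n = len(data)
    precision = 1.0 / prior_var + eta * n / sigma2
    mean = (prior_mean / prior_var + eta * sum(data) / sigma2) / precision
    return mean, 1.0 / precision

def credible_interval(mean, var, z=1.96):
    """Approximate central 95% credible interval for a normal posterior."""
    half_width = z * math.sqrt(var)
    return mean - half_width, mean + half_width

data = [1.0, 2.0, 3.0]
mean_full, var_full = generalized_normal_posterior(data, 1.0, 0.0, 100.0, eta=1.0)
mean_half, var_half = generalized_normal_posterior(data, 1.0, 0.0, 100.0, eta=0.5)
print(credible_interval(mean_full, var_full))
print(credible_interval(mean_half, var_half))
```

With `eta = 1` this is the ordinary Bayes posterior. Taking `eta < 1` down-weights the data and widens the credible interval, which is the lever that data-driven learning rate selection methods tune.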
Model-free generalized fiducial inference
Motivated by the need for the development of safe and reliable methods for
uncertainty quantification in machine learning, I propose and develop ideas for
a model-free statistical framework for imprecise probabilistic prediction
inference. This framework facilitates uncertainty quantification in the form of
prediction sets that offer finite sample control of type 1 errors, a property
shared with conformal prediction sets, but this new approach also offers more
versatile tools for imprecise probabilistic reasoning. Furthermore, I propose
and consider the theoretical and empirical properties of a precise
probabilistic approximation to the model-free imprecise framework.
Approximating a belief/plausibility measure pair by an [optimal in some sense]
probability measure in the credal set is a critical resolution needed for the
broader adoption of imprecise probabilistic approaches to inference in
statistical and machine learning communities. More generally, the statistical
and machine learning literatures have not settled how to properly quantify
uncertainty, in that there is no generally accepted standard of
accountability for stated uncertainties. The research I present in this
manuscript is aimed at motivating a framework for statistical inference with
reliability and accountability as the guiding principles.
Strong validity, consonance, and conformal prediction
Valid prediction of future observations is an important and challenging
problem. The two general approaches for quantifying uncertainty about the
future value employ prediction regions and predictive distributions,
respectively, with the latter usually considered to be more informative because
it performs other prediction-related tasks. Standard notions of validity focus
on the former, i.e., coverage probability bounds for prediction regions, but a
notion of validity relevant to the other prediction-related tasks performed by
the latter is lacking. In this paper, we present a new notion, strong
prediction validity, relevant to these more general prediction tasks. We show
that strong validity is connected to more familiar notions of coherence, and
argue that imprecise probability considerations are required in order to
achieve it. We go on to show that strong prediction validity can be achieved by
interpreting the conformal prediction output as the contour function of a
consonant plausibility function. We also offer an alternative characterization,
based on a new nonparametric inferential model construction, wherein the
appearance of consonance is more natural, and prove strong prediction validity.
Comment: 34 pages, 3 figures, 2 tables. Comments welcome at
https://www.researchers.one/article/2020-01-1
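The idea of reading conformal output as a plausibility contour can be sketched as follows. This is an illustrative toy rather than the paper's construction: the nonconformity score (absolute deviation from the augmented sample mean) and all function names are assumptions made for the example.

```python
def conformal_contour(data, y):
    """Conformal p-value for candidate y, read as a plausibility contour.

    Assumed nonconformity score: absolute deviation from the mean of the
    augmented sample (observed data plus the candidate value).
    """
    augmented = list(data) + [y]
    center = sum(augmented) / len(augmented)
    scores = [abs(v - center) for v in augmented]
    # Fraction of scores at least as large as the candidate's own score.
    return sum(1 for s in scores if s >= scores[-1]) / len(augmented)

def plausibility(data, candidates):
    """Consonant plausibility of a set: the largest contour value over it.

    The supremum is approximated by a maximum over a finite candidate list.
    """
    return max(conformal_contour(data, y) for y in candidates)

def prediction_set(data, grid, alpha):
    """Grid points whose contour value exceeds alpha."""
    return [y for y in grid if conformal_contour(data, y) > alpha]

data = [1.0, 2.0, 3.0, 4.0, 5.0]
grid = [i / 2 for i in range(-10, 31)]
print(prediction_set(data, grid, alpha=0.2))
```

The prediction set is the upper level set of the contour, while `plausibility` extends the same contour to arbitrary assertions about the future value by taking a supremum, which is what consonance means here.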
Valid model-free spatial prediction
Predicting the response at an unobserved location is a fundamental problem in
spatial statistics. Given the difficulty in modeling spatial dependence,
especially in non-stationary cases, model-based prediction intervals are at
risk of misspecification bias that can negatively affect their validity. Here
we present a new approach for model-free spatial prediction based on the
conformal prediction machinery. Our key observation is that spatial data can
be treated as exactly or approximately exchangeable in a wide range of
settings. For example, when the spatial locations are deterministic, we prove
that the response values are, in a certain sense, locally approximately
exchangeable for a broad class of spatial processes, and we develop a local
spatial conformal prediction algorithm that yields valid prediction intervals
without model assumptions. Numerical examples with both real and simulated data
confirm that the proposed conformal prediction intervals are valid and
generally more efficient than existing model-based procedures across a range of
non-stationary and non-Gaussian settings.
Comment: 30 pages, 9 figures, 3 tables. Comments welcome at
https://www.researchers.one/article/2020-06-1
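A rough sketch of the "local" idea, under loud assumptions: it is not the paper's algorithm. It simply restricts a basic conformal procedure to the k nearest observed locations, on the premise that nearby responses are approximately exchangeable. The nonconformity score, the neighbour rule, and every name below are hypothetical.

```python
import math

def conformal_pvalue(values, y):
    """Conformal p-value of candidate y against a list of responses.

    Assumed score: absolute deviation from the augmented sample mean.
    """
    augmented = list(values) + [y]
    center = sum(augmented) / len(augmented)
    scores = [abs(v - center) for v in augmented]
    return sum(1 for s in scores if s >= scores[-1]) / len(augmented)

def local_conformal_interval(locations, responses, new_location, k, alpha, grid):
    """Prediction interval built from the k nearest neighbours only."""
    # Rank observed sites by Euclidean distance to the new location.
    order = sorted(range(len(locations)),
                   key=lambda i: math.dist(locations[i], new_location))
    neighbours = [responses[i] for i in order[:k]]
    # Keep candidate values that the local conformal test does not reject.
    accepted = [y for y in grid if conformal_pvalue(neighbours, y) > alpha]
    return (min(accepted), max(accepted)) if accepted else None

locations = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
responses = [1.0, 1.1, 0.9, 5.0, 5.2, 4.8]
grid = [i / 10 for i in range(0, 61)]
interval = local_conformal_interval(locations, responses, (1, 0.5),
                                    k=3, alpha=0.3, grid=grid)
print(interval)
```

Because only the three nearby sites enter the calculation, the distant cluster with responses near 5 has no influence on the interval at the new location.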
Coverage Probability Fails to Ensure Reliable Inference
Satellite conjunction analysis is the assessment of collision risk during a
close encounter between a satellite and another object in orbit. A
counterintuitive phenomenon has emerged in the conjunction analysis literature,
namely, probability dilution, in which lower quality data paradoxically appear
to reduce the risk of collision. We show that probability dilution is a symptom
of a fundamental deficiency in probabilistic representations of statistical
inference, in which there are propositions that will consistently be assigned a
high degree of belief, regardless of whether or not they are true. We call this
deficiency false confidence. In satellite conjunction analysis, it results in a
severe and persistent underestimate of collision risk exposure.
We introduce the Martin--Liu validity criterion as a benchmark by which to
identify statistical methods that are free from false confidence. Such
inferences will necessarily be non-probabilistic. In satellite conjunction
analysis, we show that uncertainty ellipsoids satisfy the validity criterion.
Performing collision avoidance maneuvers based on ellipsoid overlap will ensure
that collision risk is capped at the user-specified level. Further, this
investigation into satellite conjunction analysis provides a template for
recognizing and resolving false confidence issues as they occur in other
problems of statistical inference.
Comment: 18 pages, 3 figures
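Probability dilution can be illustrated with a deliberately simplified one-dimensional toy, which is an assumption of this sketch and not the setup analyzed in the paper. The true miss distance is given a normal distribution centered at the estimate, and a collision is declared when its absolute value falls below a combined radius. All names and numbers are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def collision_probability(estimated_miss, radius, sigma):
    """P(|X| < radius) for X ~ N(estimated_miss, sigma^2).

    sigma stands in for data quality: larger sigma means noisier tracking.
    """
    upper = (radius - estimated_miss) / sigma
    lower = (-radius - estimated_miss) / sigma
    return normal_cdf(upper) - normal_cdf(lower)

# The estimate (0.5) lies inside the collision radius (1.0) throughout;
# only the assumed noise level changes.
for sigma in (0.1, 1.0, 10.0, 100.0):
    print(sigma, collision_probability(0.5, 1.0, sigma))
```

As `sigma` grows the computed collision probability shrinks toward zero even though the estimate still points to a collision, which is the paradox the abstract describes: worse data appear to lower the risk.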
Gibbs posterior concentration rates under sub-exponential type losses
Bayesian posterior distributions are widely used for inference, but their
dependence on a statistical model creates some challenges. In particular, there
may be lots of nuisance parameters that require prior distributions and
posterior computations, plus a potentially serious risk of model
misspecification bias. Gibbs posterior distributions, on the other hand, offer
direct, principled, probabilistic inference on quantities of interest through a
loss function, not a model-based likelihood. Here we provide simple sufficient
conditions for establishing Gibbs posterior concentration rates when the loss
function is of a sub-exponential type. We apply these general results in a
range of practically relevant examples, including mean regression, quantile
regression, and sparse high-dimensional classification. We also apply these
techniques in an important problem in medical statistics, namely, estimation of
a personalized minimum clinically important difference.
Comment: 60 pages, 1 figure
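As a minimal sketch of a Gibbs posterior, assume a flat prior on a grid and the absolute-error loss, whose risk minimizer is the median. The density is then proportional to exp(-omega * n * R_n(theta)), where R_n is the empirical risk and omega is a learning rate. The grid, the value of omega, and the function names are illustrative assumptions, not settings from the paper.

```python
import math

def empirical_risk(data, theta):
    """Average absolute-error loss; its minimizer is the sample median."""
    return sum(abs(x - theta) for x in data) / len(data)

def gibbs_posterior(data, grid, omega):
    """Gibbs posterior weights on a grid under a flat prior (assumed).

    Weights are proportional to exp(-omega * n * R_n(theta)); the minimum
    risk is subtracted first for numerical stability.
    """
    n = len(data)
    risks = [empirical_risk(data, theta) for theta in grid]
    min_risk = min(risks)
    weights = [math.exp(-omega * n * (r - min_risk)) for r in risks]
    total = sum(weights)
    return [w / total for w in weights]

data = [1.0, 2.0, 3.0, 4.0, 100.0]
grid = [i / 100 for i in range(0, 1001)]
weights = gibbs_posterior(data, grid, omega=1.0)
mode = grid[weights.index(max(weights))]
print(mode)
```

No likelihood for the data is specified anywhere: the loss function alone links the data to the quantity of interest, so the outlier at 100 does not need to be modeled.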