11 research outputs found

    A comparison of learning rate selection methods in generalized Bayesian inference

    Generalized Bayes posterior distributions are formed by raising the likelihood to a fractional power before combining it with the prior via Bayes's formula. This fractional power, often viewed as a remedy for potential model misspecification bias, is called the learning rate, and a number of data-driven learning rate selection methods have been proposed in the recent literature. Each proposal has a different focus and a different target it aims to achieve, which makes them difficult to compare. In this paper, we provide a direct head-to-head comparison of these learning rate selection methods in various misspecified model scenarios, in terms of several relevant metrics, in particular, the coverage probability of the generalized Bayes credible regions. In some examples all the methods perform well, while in others the misspecification is too severe to be overcome, but we find that the so-called generalized posterior calibration algorithm tends to outperform the others in terms of credible region coverage probability.
    Comment: 22 pages, 2 figures, 4 tables
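
    To make the role of the learning rate concrete, here is a minimal sketch of a generalized posterior in a conjugate normal-mean model; the model, prior, and function names are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def generalized_posterior_normal(x, sigma, mu0, tau0, eta):
    """Generalized Bayes posterior for a normal mean with known variance.

    The likelihood is raised to the power eta (the learning rate) before
    combining with a N(mu0, tau0^2) prior; eta = 1 recovers standard Bayes.
    Illustrative sketch, not the paper's example.
    """
    n = len(x)
    prior_prec = 1.0 / tau0**2
    # Tempering the likelihood by eta is equivalent to shrinking the
    # effective sample size from n to eta * n.
    like_prec = eta * n / sigma**2
    post_prec = prior_prec + like_prec
    post_mean = (prior_prec * mu0 + like_prec * np.mean(x)) / post_prec
    return post_mean, np.sqrt(1.0 / post_prec)

x = np.random.default_rng(0).normal(1.0, 2.0, size=50)
for eta in (1.0, 0.5, 0.1):
    m, s = generalized_posterior_normal(x, sigma=2.0, mu0=0.0, tau0=1.0, eta=eta)
    print(f"eta={eta}: posterior mean {m:.3f}, sd {s:.3f}")
```

    Shrinking eta widens the posterior, which is how tempering guards against overconfidence under misspecification; the methods compared in the paper aim to choose eta in a data-driven way.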

    Model-free generalized fiducial inference

    Motivated by the need for safe and reliable methods for uncertainty quantification in machine learning, I propose and develop ideas for a model-free statistical framework for imprecise probabilistic prediction inference. This framework facilitates uncertainty quantification in the form of prediction sets that offer finite-sample control of type-1 errors, a property shared with conformal prediction sets, but this new approach also offers more versatile tools for imprecise probabilistic reasoning. Furthermore, I propose and consider the theoretical and empirical properties of a precise probabilistic approximation to the model-free imprecise framework. Approximating a belief/plausibility measure pair by a probability measure in the credal set that is optimal in some sense is a critical step toward the broader adoption of imprecise probabilistic approaches to inference in the statistical and machine learning communities. More generally, it remains largely unsettled in the statistical and machine learning literatures how to properly quantify uncertainty, in that there is no generally accepted standard of accountability for stated uncertainties. The research I present in this manuscript aims to motivate a framework for statistical inference with reliability and accountability as the guiding principles.
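
    For context, the finite-sample type-1 error control mentioned above is the same guarantee carried by a split conformal prediction set; the following sketch illustrates that baseline guarantee only, and is not the paper's imprecise framework. The featureless mean predictor and the score are assumptions for illustration.

```python
import numpy as np

def split_conformal_interval(y_train, y_calib, alpha=0.1):
    """Split conformal prediction interval for an exchangeable scalar response.

    Guarantees P(Y_new in interval) >= 1 - alpha in finite samples, the same
    type-1 error control the abstract attributes to the proposed framework.
    """
    mu = np.mean(y_train)                    # point predictor fit on one split
    scores = np.abs(y_calib - mu)            # nonconformity scores on the other
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))  # conservative finite-sample rank
    q = np.sort(scores)[min(k, n) - 1]
    return mu - q, mu + q

rng = np.random.default_rng(1)
y = rng.standard_t(df=3, size=200)           # heavy tails; no model assumed
lo, hi = split_conformal_interval(y[:100], y[100:], alpha=0.1)
print(f"90% prediction interval: [{lo:.2f}, {hi:.2f}]")
```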

    Strong validity, consonance, and conformal prediction

    Valid prediction of future observations is an important and challenging problem. The two general approaches for quantifying uncertainty about the future value employ prediction regions and predictive distributions, respectively, with the latter usually considered more informative because it can perform other prediction-related tasks. Standard notions of validity focus on the former, i.e., coverage probability bounds for prediction regions, but a notion of validity relevant to the other prediction-related tasks performed by the latter is lacking. In this paper, we present a new notion, strong prediction validity, relevant to these more general prediction tasks. We show that strong validity is connected to more familiar notions of coherence, and argue that imprecise probability considerations are required in order to achieve it. We go on to show that strong prediction validity can be achieved by interpreting the conformal prediction output as the contour function of a consonant plausibility function. We also offer an alternative characterization, based on a new nonparametric inferential model construction in which the appearance of consonance is more natural, and prove strong prediction validity.
    Comment: 34 pages, 3 figures, 2 tables. Comments welcome at https://www.researchers.one/article/2020-01-1
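
    A rough sketch of the consonance idea: treat the conformal p-value of each candidate value as a plausibility contour, so the plausibility of any assertion is the supremum of the contour over it. The nonconformity score below (distance to the augmented-sample mean) is an illustrative choice, not the paper's construction.

```python
import numpy as np

def conformal_contour(y_obs, y_grid):
    """Conformal transducer evaluated on a grid of candidate values.

    pl(y) is the conformal p-value of candidate y given the observed sample;
    consonance means Pl(A) = sup over y in A of pl(y) for any assertion A.
    """
    def pl(y):
        aug = np.append(y_obs, y)             # augment the sample with candidate y
        scores = np.abs(aug - np.mean(aug))   # illustrative nonconformity score
        return np.mean(scores >= scores[-1])  # fraction at least as nonconforming
    return np.array([pl(y) for y in y_grid])

y_obs = np.random.default_rng(2).normal(size=30)
grid = np.linspace(-4.0, 4.0, 201)
contour = conformal_contour(y_obs, grid)
# Plausibility of the assertion {Y_new > 2} is the sup of the contour over it:
print("Pl(Y_new > 2) =", contour[grid > 2].max())
```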

    Valid model-free spatial prediction

    Predicting the response at an unobserved location is a fundamental problem in spatial statistics. Given the difficulty in modeling spatial dependence, especially in non-stationary cases, model-based prediction intervals are at risk of misspecification bias that can negatively affect their validity. Here we present a new approach for model-free spatial prediction based on the conformal prediction machinery. Our key observation is that spatial data can be treated as exactly or approximately exchangeable in a wide range of settings. For example, when the spatial locations are deterministic, we prove that the response values are, in a certain sense, locally approximately exchangeable for a broad class of spatial processes, and we develop a local spatial conformal prediction algorithm that yields valid prediction intervals without model assumptions. Numerical examples with both real and simulated data confirm that the proposed conformal prediction intervals are valid and generally more efficient than existing model-based procedures across a range of non-stationary and non-Gaussian settings.
    Comment: 30 pages, 9 figures, 3 tables. Comments welcome at https://www.researchers.one/article/2020-06-1
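
    As a rough illustration of the local exchangeability idea, here is a sketch of a local spatial conformal interval that treats the k nearest neighbors of a target site as approximately exchangeable; the neighborhood size, score, and data-generating process are assumptions for illustration, not the paper's algorithm.

```python
import numpy as np

def local_spatial_conformal(locs, y, s0, k=50, alpha=0.1):
    """Local spatial conformal prediction interval at an unobserved site s0.

    Treats the k nearest neighbors of s0 as approximately exchangeable and
    forms a conformal-style interval around their local mean.
    """
    d = np.linalg.norm(locs - s0, axis=1)
    nbr = np.argsort(d)[:k]                  # k nearest observed sites
    y_loc = y[nbr]
    mu = np.mean(y_loc)
    scores = np.sort(np.abs(y_loc - mu))     # local nonconformity scores
    j = int(np.ceil((k + 1) * (1 - alpha)))  # finite-sample quantile rank
    q = scores[min(j, k) - 1]
    return mu - q, mu + q

rng = np.random.default_rng(3)
locs = rng.uniform(0, 1, size=(500, 2))
y = np.sin(4 * locs[:, 0]) + rng.normal(scale=0.3, size=500)  # non-stationary mean
lo, hi = local_spatial_conformal(locs, y, s0=np.array([0.5, 0.5]))
print(f"90% local prediction interval: [{lo:.2f}, {hi:.2f}]")
```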

    Coverage Probability Fails to Ensure Reliable Inference

    Satellite conjunction analysis is the assessment of collision risk during a close encounter between a satellite and another object in orbit. A counterintuitive phenomenon has emerged in the conjunction analysis literature, namely, probability dilution, in which lower quality data paradoxically appear to reduce the risk of collision. We show that probability dilution is a symptom of a fundamental deficiency in probabilistic representations of statistical inference, in which there are propositions that will consistently be assigned a high degree of belief, regardless of whether or not they are true. We call this deficiency false confidence. In satellite conjunction analysis, it results in a severe and persistent underestimate of collision risk exposure. We introduce the Martin-Liu validity criterion as a benchmark by which to identify statistical methods that are free from false confidence. Such inferences will necessarily be non-probabilistic. In satellite conjunction analysis, we show that uncertainty ellipsoids satisfy the validity criterion. Performing collision avoidance maneuvers based on ellipsoid overlap will ensure that collision risk is capped at the user-specified level. Further, this investigation into satellite conjunction analysis provides a template for recognizing and resolving false confidence issues as they occur in other problems of statistical inference.
    Comment: 18 pages, 3 figures
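
    Probability dilution is easy to reproduce in a toy one-dimensional version of the problem: as measurement noise grows, the computed collision probability shrinks toward zero regardless of the true trajectory. A minimal sketch, with assumed Gaussian errors and illustrative numbers:

```python
import numpy as np
from scipy.stats import norm

def collision_probability(d_hat, sigma, R):
    """Epistemic probability that the (1-D) miss distance is below R,
    given an estimate d_hat with Gaussian measurement error sigma."""
    return norm.cdf(R, loc=d_hat, scale=sigma) - norm.cdf(-R, loc=d_hat, scale=sigma)

d_true, R = 0.5, 1.0          # the objects WILL pass within the collision radius
rng = np.random.default_rng(4)
for sigma in (0.1, 1.0, 10.0, 100.0):
    d_hat = d_true + rng.normal(scale=sigma)
    p = collision_probability(d_hat, sigma, R)
    print(f"sigma={sigma:>6}: computed collision probability {p:.4f}")
# As sigma grows, the computed probability shrinks toward 0 even though the
# true trajectory is a collision: lower quality data "dilutes" the risk.
```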

    Gibbs posterior concentration rates under sub-exponential type losses

    Bayesian posterior distributions are widely used for inference, but their dependence on a statistical model creates some challenges. In particular, there may be many nuisance parameters that require prior distributions and posterior computations, plus a potentially serious risk of model misspecification bias. Gibbs posterior distributions, on the other hand, offer direct, principled, probabilistic inference on quantities of interest through a loss function rather than a model-based likelihood. Here we provide simple sufficient conditions for establishing Gibbs posterior concentration rates when the loss function is of sub-exponential type. We apply these general results in a range of practically relevant examples, including mean regression, quantile regression, and sparse high-dimensional classification. We also apply these techniques to an important problem in medical statistics, namely, estimation of a personalized minimum clinically important difference.
    Comment: 60 pages, 1 figure
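
    Illustratively, a Gibbs posterior replaces the log-likelihood with a scaled negative empirical risk, so that pi_n(theta) is proportional to exp(-omega * n * R_n(theta)) * pi(theta). The sketch below evaluates this on a grid for a quantile via the check loss, with a flat prior and an arbitrary learning rate omega; it is an assumption-laden toy, not the paper's construction.

```python
import numpy as np

def gibbs_posterior_quantile(x, tau, grid, omega=1.0):
    """Gibbs posterior density for the tau-th quantile, evaluated on a grid.

    Uses the check loss l_theta(x) = (x - theta)(tau - 1{x < theta}) in place
    of a likelihood: pi_n(theta) proportional to exp(-omega * sum_i l_theta(x_i))
    under a flat prior. The learning rate omega is an illustrative choice.
    """
    def check_loss(theta):
        u = x - theta
        return np.sum(u * (tau - (u < 0)))
    log_post = np.array([-omega * check_loss(t) for t in grid])
    log_post -= log_post.max()             # stabilize before exponentiating
    dens = np.exp(log_post)
    return dens / np.trapz(dens, grid)     # normalize numerically on the grid

x = np.random.default_rng(5).exponential(size=200)
grid = np.linspace(0.2, 1.5, 400)
dens = gibbs_posterior_quantile(x, tau=0.5, grid=grid)
print("Gibbs posterior mode for the median:", grid[np.argmax(dens)])
```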