7 research outputs found

    Bayesian model averaging with fixed and flexible priors: theory, concepts, and calibration experiments for rainfall-runoff modeling

    This paper introduces, for the first time, the concept of Bayesian Model Averaging (BMA) with multiple prior structures for rainfall-runoff modeling applications. The original BMA model proposed by Raftery et al. (2005) assumes that the prior probability density function (pdf) is adequately described by a mixture of Gamma and Gaussian distributions. Here we discuss the advantages of using BMA with fixed and flexible prior distributions. Uniform, Binomial, Binomial-Beta, Benchmark, and Global Empirical Bayes priors, along with Informative Prior Inclusion and Combined Prior Probabilities, were applied to calibrate daily streamflow records of a coastal plain watershed in the South-East USA. Various specifications of Zellner's g prior, including Hyper, Fixed, and Empirical Bayes Local (EBL) g priors, were also employed to account for the sensitivity of BMA and to derive the conditional pdf of each constituent ensemble member. These priors were examined using the simulation results of conceptual and semi-distributed rainfall-runoff models. The hydrologic simulations were first coupled with a new sensitivity analysis model and a parameter uncertainty algorithm to assess the sensitivity and uncertainty associated with each model. BMA was then used to combine the posterior pdfs of the constituent hydrological models. The analysis suggests that BMA based on combined fixed and flexible priors provides a coherent mechanism and promising results for calculating a weighted posterior probability compared to individual model calibration. Furthermore, the Uniform and Informative Prior Inclusion priors yielded significantly lower predictive error, whereas more uncertainty resulted from a fixed g prior (i.e., EBL).
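    The BMA predictive density described above is a mixture of the members' conditional pdfs weighted by posterior model probabilities, and the BMA point forecast is the weight-averaged member prediction. A minimal sketch of that mixture form, assuming Gaussian member pdfs (the member means, weights, and spreads below are made-up illustration values, not the paper's calibrated ones):

```python
import numpy as np

def bma_mean(member_preds, weights):
    # BMA point forecast: posterior-weight-averaged member predictions
    w = np.asarray(weights, float)
    w = w / w.sum()  # normalize the posterior model weights
    return float(np.asarray(member_preds, float) @ w)

def bma_predictive_pdf(y, member_preds, weights, member_sds):
    # BMA predictive density: a mixture of the members' conditional
    # Gaussian pdfs, weighted by the posterior model weights
    w = np.asarray(weights, float)
    w = w / w.sum()
    dens = 0.0
    for mu, wi, s in zip(member_preds, w, member_sds):
        dens += wi * np.exp(-0.5 * ((y - mu) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))
    return dens
```

    Because the weights are normalized, the mixture integrates to one and the BMA mean is simply the weighted mean of the ensemble members.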

    Exact Markov chain Monte Carlo and Bayesian linear regression

    In this work we investigate the use of perfect sampling methods within the context of Bayesian linear regression. We focus on inference problems related to the marginal posterior model probabilities. Model averaged inference for the response and Bayesian variable selection are considered. Perfect sampling is an alternate form of Markov chain Monte Carlo that generates exact sample points from the posterior of interest. This approach removes the need for burn-in assessment faced by traditional MCMC methods. For model averaged inference, we find the monotone Gibbs coupling from the past (CFTP) algorithm is the preferred choice. This requires the predictor matrix be orthogonal, preventing variable selection, but allowing model averaging for prediction of the response. Exploring choices of priors for the parameters in the Bayesian linear model, we investigate sufficiency for monotonicity assuming Gaussian errors. We discover that a number of other sufficient conditions exist, besides an orthogonal predictor matrix, for the construction of a monotone Gibbs Markov chain. Requiring an orthogonal predictor matrix, we investigate new methods of orthogonalizing the original predictor matrix. We find that a new method using the modified Gram-Schmidt orthogonalization procedure performs comparably with existing transformation methods, such as generalized principal components. Accounting for the effect of using an orthogonal predictor matrix, we discover that inference using model averaging for in-sample prediction of the response is comparable between the original and orthogonal predictor matrix. The Gibbs sampler is then investigated for sampling when using the original predictor matrix and the orthogonal predictor matrix. We find that a hybrid method, using a standard Gibbs sampler on the orthogonal space in conjunction with the monotone CFTP Gibbs sampler, provides the fastest computation and convergence to the posterior distribution. 
We conclude that the hybrid approach should be used when the monotone Gibbs CFTP sampler becomes impractical due to large backwards coupling times. We demonstrate that large backwards coupling times occur when the sample size is close to the number of predictors, or when hyper-parameter choices increase model competition. The monotone Gibbs CFTP sampler should be taken advantage of when the backwards coupling time is small. For the problem of variable selection we turn to the exact version of the independent Metropolis-Hastings (IMH) algorithm. We reiterate the notion that the exact IMH sampler is redundant, being a needlessly complicated rejection sampler. We then determine that a rejection sampler is feasible for variable selection when the sample size is close to the number of predictors and Zellner's prior is used with a small value for the hyper-parameter c. Finally, we use the example of simulating from the posterior of c conditional on a model to demonstrate how the exact IMH viewpoint clarifies how the rejection sampler can be adapted to improve efficiency.
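    The monotone CFTP idea the abstract relies on can be shown on a toy example (not the Bayesian-regression Gibbs sampler itself): run a top chain and a bottom chain from progressively earlier start times using the same random inputs, and when they coalesce by time zero the common value is an exact draw from the stationary distribution. A minimal sketch, assuming a reflecting random walk on {0, …, n}, whose stationary law is uniform:

```python
import random

def monotone_update(x, u, n):
    # Order-preserving update: for a fixed random input u, a larger
    # state x never maps below a smaller one, so the 0-chain and
    # n-chain sandwich every other trajectory.
    if u < 0.5:
        return max(x - 1, 0)
    return min(x + 1, n)

def cftp_sample(n, rng=random.Random(0)):
    # Propp-Wilson coupling from the past: double the look-back T,
    # REUSING the random inputs for each time -t across doublings,
    # until the top and bottom chains coalesce at time 0.
    T = 1
    us = []  # us[k] is the random input for time -(k+1)
    while True:
        while len(us) < T:
            us.append(rng.random())
        lo, hi = 0, n
        for t in range(T - 1, -1, -1):  # apply u_{-T}, ..., u_{-1}
            lo = monotone_update(lo, us[t], n)
            hi = monotone_update(hi, us[t], n)
        if lo == hi:
            return lo  # exact stationary draw, no burn-in needed
        T *= 2
```

    The backwards coupling time is the T at which the sandwich closes; the abstract's point is that when this T is large, a conventional Gibbs sampler on part of the space becomes the cheaper option.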

    Empirical statistical modelling for crop yields predictions: bayesian and uncertainty approaches

    This thesis explores uncertainty statistics to model agricultural crop yields in a situation where there are neither sampling observations nor historical records. The Bayesian approach to a linear regression model is useful for prediction of crop yield when there are data quantity issues, model structure uncertainty, and a large number of explanatory variables in the regression model. Data quantity issues might occur when a farmer is cultivating a new crop variety, moving to a new farming location, or introducing a new farming technology, where the situation may warrant a change in the current farming practice. The first part of this thesis involved the collection of data from domain experts and the elicitation of probability distributions. Uncertainty statistics, the foundations of uncertainty theory, and the data-gathering procedures are discussed in detail. We proposed a procedure for the estimation of uncertainty distributions. The procedure was then applied to agricultural data to fit uncertainty distributions to five cereal crop yields. A Delphi method was introduced and used to fit uncertainty distributions to multiple experts' data on sesame seed yield. The thesis defined an uncertainty distance and derived a distance for the difference between two uncertainty distributions. We lastly estimated the distance between a hypothesized distribution and an uncertainty normal distribution. Although the applicability of uncertainty statistics is limited to one-sample models, the approach provides a fast way to establish a standard for process parameters. Where no sampling observations exist, or they are very expensive to acquire, the approach provides an opportunity to engage experts and arrive at a model for guiding decision making.
In the second part, we fitted a full dataset obtained from an agricultural survey of small-scale farmers to a linear regression model using direct Markov Chain Monte Carlo (MCMC), Bayesian estimation (with a uniform prior), and the maximum likelihood estimation (MLE) method. The three procedures yielded similar mean estimates, but the credible intervals in the Bayesian estimates were narrower than the confidence intervals in the MLE method. The predictive outcome of the estimated model was then assessed using simulated data for a set of covariates. Furthermore, the dataset was randomly split into two sets. An informative prior was estimated from one half, called the "old data", using the Ordinary Least Squares (OLS) method. Three models were then fitted to the second half, called the "new data": a General Linear Model (GLM) (M1), a Bayesian model with a non-informative prior (M2), and a Bayesian model with an informative prior (M3). A leave-one-out cross-validation (LOOCV) method was used to compare the predictive performance of these models. The Bayesian models showed better predictive performance than M1. M3 (with the expert prior) had moderate average cross-validation (CV) error and CV standard error. The GLM performed worst, with the least average CV error but the highest CV standard error among the models. In model M3, the predictor variables were found to be significant at 95% credible intervals; in contrast, most variables were not significant under models M1 and M2. The model with the informative prior also had narrower credible intervals than the non-informative-prior and GLM models. The results indicated that variability and uncertainty in the data were reasonably reduced by the incorporation of the expert (informative) prior. We lastly investigated the residual plots of these models to assess their prediction performance.
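    The LOOCV comparison above can be sketched generically: refit each candidate model n times, each time holding out one observation, and report the average CV error with its standard error. A minimal illustration, assuming plain least squares as the GLM stand-in and a ridge-style posterior mean as a simple Bayesian-prior stand-in (the thesis's actual M1-M3 models are not reproduced here):

```python
import numpy as np

def loocv(X, y, fit_predict):
    # Leave-one-out CV: for each i, fit on the other n-1 rows and
    # score the squared error on the held-out row
    n = len(y)
    errs = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i
        errs[i] = (y[i] - fit_predict(X[mask], y[mask], X[i:i + 1])) ** 2
    return errs.mean(), errs.std(ddof=1) / np.sqrt(n)  # avg CV error, CV std error

def ols_predict(Xtr, ytr, Xte):
    # plain least squares (a GLM-like baseline)
    beta, *_ = np.linalg.lstsq(Xtr, ytr, rcond=None)
    return float((Xte @ beta)[0])

def ridge_predict(Xtr, ytr, Xte, tau2=1.0):
    # posterior mean of a Bayesian linear model with a N(0, tau2*I)
    # coefficient prior and unit noise variance (the ridge estimate)
    p = Xtr.shape[1]
    beta = np.linalg.solve(Xtr.T @ Xtr + np.eye(p) / tau2, Xtr.T @ ytr)
    return float((Xte @ beta)[0])
```

    Comparing the (average error, standard error) pairs across candidate fitters mirrors the thesis's M1-M3 comparison.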
Bayesian Model Averaging (BMA) was later introduced to address the issue of model structure uncertainty in a single model. BMA allows the computation of a weighted average over possible combinations of predictors. An approximate AIC weight was then proposed for model selection instead of frequentist hypothesis testing (or model comparison within a set of competing candidate models). The method is flexible and easier to interpret than raw AIC or the Bayesian Information Criterion (BIC), which approximates the Bayes factor. Zellner's g-prior was considered appropriate as it has been widely used in linear models; it preserves the correlation structure among predictors in its prior covariance. The method also yields closed-form marginal likelihoods, which leads to huge computational savings by avoiding sampling in the parameter space as in BMA. We lastly determined a single optimal model from all possible combinations of models and computed the log-likelihood of each model.
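    Two pieces of the machinery above have short closed forms. Akaike weights rescale AIC differences into model probabilities, and under Zellner's g-prior each model's marginal likelihood depends only on its size and R-squared. A sketch, assuming one common parameterization of the g-prior marginal likelihood (normalized so the intercept-only model scores 1; the thesis's exact form may differ):

```python
import numpy as np

def aic_weights(aics):
    # Akaike weights: w_i = exp(-Delta_i / 2) / sum_j exp(-Delta_j / 2),
    # where Delta_i = AIC_i - min_j AIC_j
    aics = np.asarray(aics, float)
    delta = aics - aics.min()
    w = np.exp(-0.5 * delta)
    return w / w.sum()

def g_prior_marginal(n, p_k, r2, g):
    # Closed-form marginal likelihood of model M_k (p_k predictors,
    # coefficient of determination r2) under Zellner's g-prior, up to
    # a constant shared by all models; no parameter-space sampling needed
    return (1.0 + g) ** ((n - 1 - p_k) / 2.0) * (1.0 + g * (1.0 - r2)) ** (-(n - 1) / 2.0)
```

    Models with higher R-squared score higher, while each extra predictor pays a (1+g)^(1/2) penalty, which is the trade-off BMA averages over.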

    Bayesian model selection of structural explanatory models: Application to road accident data

    Using the Bayesian approach as the model selection criterion, the main purpose of this study is to establish a practical road accident model that can provide better interpretation and prediction performance. For this purpose we use a structural explanatory model with an autoregressive error term. The model estimation is carried out through Bayesian inference, and the best model is selected based on goodness-of-fit measures. To cross-validate the model estimation, a further prediction analysis was done. As the road safety measure, the number of fatal accidents in Spain during 2000-2011 was employed. The results of the variable selection process show that the factors explaining fatal road accidents are mainly exposure, economic factors, and surveillance and legislative measures. The model selection shows that the impact of economic factors on fatal accidents during the period under study was higher than that of surveillance and legislative measures.

    Model uncertainty and systematic risk in US banking

    This paper uses Bayesian Model Averaging (BMA) to examine the driving factors of equity returns of US Bank Holding Companies. BMA has an advantage over OLS in that it accounts for the considerable uncertainty about the correct set (model) of bank risk factors. We find that, out of a broad set of 12 risk factors, only the market, real estate, and high-minus-low Fama–French factors are reliably related to US bank stock returns over the period 1986–2010. Other factors are relevant only over specific subperiods or for subsets of bank holding companies. We discuss the implications of our findings for empirical banking research.
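    The "reliably related" judgment in studies like this is usually read off posterior inclusion probabilities: each factor's probability of appearing in the true model, averaged over all candidate factor subsets. A toy sketch with simulated data and a BIC approximation to each model's marginal likelihood (a stand-in for the paper's BMA over 12 factors, not its actual method or data):

```python
import itertools
import numpy as np

def inclusion_probs(X, y):
    # Enumerate every subset of the p candidate factors, weight each
    # model by its BIC-approximated log marginal likelihood, and sum
    # the normalized weights of the models containing each factor.
    n, p = X.shape
    logml, subsets = [], []
    for r in range(p + 1):
        for S in itertools.combinations(range(p), r):
            Xs = np.column_stack([np.ones(n)] + [X[:, j] for j in S])
            resid = y - Xs @ np.linalg.lstsq(Xs, y, rcond=None)[0]
            rss = resid @ resid
            k = Xs.shape[1]
            # log marginal likelihood ~ -BIC/2 = -n/2 log(RSS/n) - k/2 log n
            logml.append(-0.5 * n * np.log(rss / n) - 0.5 * k * np.log(n))
            subsets.append(S)
    w = np.exp(np.array(logml) - max(logml))
    w /= w.sum()
    pip = np.zeros(p)
    for wi, S in zip(w, subsets):
        for j in S:
            pip[j] += wi
    return pip
```

    Full enumeration is only feasible for small p (2^12 = 4096 models for the paper's factor set); larger spaces require MCMC model composition instead.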

    The Determinants of Economic Growth in European Regions

    We use Bayesian Model Averaging (BMA) to evaluate the robustness of determinants of economic growth in a new dataset of 255 European regions in the 1995-2005 period. We use three different specifications based on (1) the cross-section of regions, (2) the cross-section of regions with country fixed effects, and (3) the cross-section of regions with a spatial autoregressive (SAR) structure. We investigate the existence of parameter heterogeneity by allowing for interactions of potential explanatory variables with geographical dummies as extra regressors. We find remarkable differences between the determinants of economic growth implied by differences between regions and those within regions of a given country. In the cross-section of regions, we find evidence for conditional convergence with a speed of around two percent. The convergence process between countries is dominated by the catching-up process of regions in Central and Eastern Europe (CEE), whereas convergence within countries is mostly a characteristic of regions in old EU member states. We also find robust evidence of positive growth effects of capital cities and a highly educated workforce, and a negative effect of population density.

    Keywords: model uncertainty, spatial autoregressive model, determinants of economic growth, European regions

    The econometrics of structural change: statistical analysis and forecasting in the context of the South African economy

    Philosophiae Doctor - PhD. One of the assumptions of conventional regression analysis is that the parameters are constant over all observations. It has often been suggested that this may not be a valid assumption to make, particularly if the econometric model is to be used for economic forecasting. Apart from this, it is also found that econometric models in particular are used to investigate the underlying interrelationships of the system under consideration, in order to understand and explain relevant phenomena in structural analysis. The prerequisite of such use of econometrics is that the regression parameters of the model are assumed to be constant over time or across different cross-sectional units.