Computational Efficiency in Bayesian Model and Variable Selection
This paper is concerned with the efficient implementation of Bayesian model averaging (BMA) and Bayesian variable selection when the number of candidate variables and models is large and estimation of posterior model probabilities must be based on a subset of the models. Efficient implementation involves two issues: the efficiency of the MCMC algorithm itself and efficient computation of the quantities needed to obtain a draw from the MCMC algorithm. For the first aspect, it is desirable that the chain moves well and quickly through the model space and draws from regions of high probability. In this context there is a natural trade-off between local moves, which use the current parameter values to propose plausible values for model parameters, and more global transitions, which potentially allow exploration of the distribution of interest in fewer steps, but where each step is more computationally intensive. We assess the convergence properties of simple samplers based on local moves and of some recently proposed algorithms intended to improve on the basic samplers. For the second aspect, efficient computation within the sampler, we focus on the important case of linear models, where the computations essentially reduce to least squares calculations. When the chain makes local moves, adding or dropping a variable, substantial efficiency gains can be made by updating the previous least squares solution.
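To make the local-move idea concrete, the following is a minimal Python sketch, not the paper's implementation: each step proposes adding or dropping one variable and accepts the move with a Metropolis-Hastings step. The function names are invented and the marginal likelihood is approximated by a BIC-style shortcut purely for illustration.

```python
# Minimal sketch of a local-move sampler over variable-inclusion indicators.
# One variable is added or dropped per step; the (approximate) marginal
# likelihood uses a BIC proxy rather than a proper Bayesian calculation.
import numpy as np

rng = np.random.default_rng(0)

def log_marginal_bic(y, X, gamma):
    """BIC-style approximation to the log marginal likelihood of the
    linear model that includes the columns flagged in gamma."""
    n = len(y)
    k = int(gamma.sum())
    Xg = X[:, gamma] if k > 0 else np.ones((n, 1))
    beta, *_ = np.linalg.lstsq(Xg, y, rcond=None)
    rss = np.sum((y - Xg @ beta) ** 2)
    return -0.5 * n * np.log(rss / n) - 0.5 * k * np.log(n)

def local_move_sampler(y, X, n_iter=5000):
    n, p = X.shape
    gamma = np.zeros(p, dtype=bool)            # start from the null model
    logml = log_marginal_bic(y, X, gamma)
    visits = np.zeros(p)                       # posterior inclusion counts
    for _ in range(n_iter):
        j = rng.integers(p)                    # flip one coordinate: add or drop
        proposal = gamma.copy()
        proposal[j] = ~proposal[j]
        logml_prop = log_marginal_bic(y, X, proposal)
        if np.log(rng.random()) < logml_prop - logml:
            gamma, logml = proposal, logml_prop
        visits += gamma
    return visits / n_iter                     # estimated inclusion probabilities

# Toy usage: two relevant predictors out of ten.
X = rng.standard_normal((200, 10))
y = X[:, 0] - 0.5 * X[:, 3] + rng.standard_normal(200)
print(local_move_sampler(y, X).round(2))
```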
An Embarrassment of Riches: Forecasting Using Large Panels
The problem of having to select a small subset of predictors from a large number of useful variables can nowadays be circumvented in forecasting. One possibility is to efficiently and systematically evaluate all predictors and almost all possible models that these predictors in combination can give rise to. The idea of combining forecasts from the various indicator models using Bayesian model averaging is explored and compared to diffusion indexes, another method that uses a large number of predictors to forecast. In addition, forecasts based on the median model are considered.
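As a rough illustration of the combination idea, the sketch below averages forecasts from single-indicator models with weights proportional to an approximate marginal likelihood (a BIC proxy). The setup, the weighting shortcut and the function name are assumptions for illustration, not taken from the paper.

```python
# Hypothetical sketch of BMA over single-indicator models: each candidate
# regresses the target on one predictor, weights are proportional to
# exp(approximate log marginal likelihood), and the combined forecast is
# the weighted average of the model forecasts.
import numpy as np

def bma_combination_forecast(y, X, x_next):
    """y: (n,) target history, X: (n, p) predictors, x_next: (p,) latest
    predictor values used to forecast y at n+1."""
    n, p = X.shape
    log_ml = np.empty(p)
    forecasts = np.empty(p)
    for j in range(p):
        Z = np.column_stack([np.ones(n), X[:, j]])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        rss = np.sum((y - Z @ beta) ** 2)
        log_ml[j] = -0.5 * n * np.log(rss / n) - 0.5 * 2 * np.log(n)  # BIC proxy
        forecasts[j] = beta[0] + beta[1] * x_next[j]
    w = np.exp(log_ml - log_ml.max())
    w /= w.sum()                                # posterior model weights
    return forecasts @ w, w
```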
Computational Efficiency in Bayesian Model and Variable Selection
Large scale Bayesian model averaging and variable selection exercises present, despite the great increase in desktop computing power, considerable computational challenges. Due to the large scale it is impossible to evaluate all possible models, and estimates of posterior probabilities are instead obtained from stochastic (MCMC) schemes designed to converge on the posterior distribution over the model space. While this frees us from the requirement of evaluating all possible models, the computational effort is still substantial and efficient implementation is vital. Efficient implementation is concerned with two issues: the efficiency of the MCMC algorithm itself and efficient computation of the quantities needed to obtain a draw from the MCMC algorithm. We evaluate several different MCMC algorithms and find that relatively simple algorithms with local moves perform competitively, except possibly when the data are highly collinear. For the second aspect, efficient computation within the sampler, we focus on the important case of linear models, where the computations essentially reduce to least squares calculations. Least squares solvers that update a previous model estimate are appealing when the MCMC algorithm makes local moves, and we find that the Cholesky update is both fast and accurate.
Keywords: Bayesian model averaging; Sweep operator; Cholesky decomposition; QR decomposition; Swendsen-Wang algorithm
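The least squares updating referred to above can be illustrated with a bordering update of the Cholesky factor when one column is added to the design matrix. This is a schematic sketch of the idea under standard linear algebra, not the authors' code, and the helper name is invented.

```python
# When a local move adds one variable, the upper-triangular factor R with
# R'R = X'X can be extended to the factor of [X, x_new]'[X, x_new] with a
# single triangular solve instead of refactorising from scratch.
import numpy as np
from scipy.linalg import solve_triangular

def cholesky_add_column(R, X, x_new):
    """R: (k, k) upper-triangular Cholesky factor of X'X.
    Returns the (k+1, k+1) factor for the augmented design [X, x_new]."""
    r = solve_triangular(R, X.T @ x_new, trans='T', lower=False)
    rho = np.sqrt(x_new @ x_new - r @ r)
    k = R.shape[0]
    R_new = np.zeros((k + 1, k + 1))
    R_new[:k, :k] = R
    R_new[:k, k] = r
    R_new[k, k] = rho
    return R_new

# Quick check that the update reproduces the full factorisation.
rng = np.random.default_rng(1)
X = rng.standard_normal((50, 4))
x_new = rng.standard_normal(50)
R = np.linalg.cholesky(X.T @ X).T               # upper-triangular factor
R_new = cholesky_add_column(R, X, x_new)
Xa = np.column_stack([X, x_new])
assert np.allclose(R_new.T @ R_new, Xa.T @ Xa)
```

The updated least squares coefficients then follow from two triangular solves with R_new, which is what makes local add/drop moves cheap.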
An Embarrassment of Riches: Forecasting Using Large Panels
The increasing availability of data and potential predictor variables poses new challenges to forecasters. The task of formulating a single forecasting model that can extract all the relevant information is becoming increasingly difficult in the face of this abundance of data. The two leading approaches to addressing this "embarrassment of riches" are philosophically distinct. One approach builds forecast models based on summaries of the predictor variables, such as principal components; the second is analogous to forecast combination, where the forecasts from a multitude of possible models are averaged. Using several data sets, we compare the performance of the two approaches in the guise of the diffusion index or factor models popularized by Stock and Watson and forecast combination as an application of Bayesian model averaging. We find that none of the methods is uniformly superior and that no method performs better than, or is outperformed by, a simple AR(p) process.
Keywords: Bayesian model averaging; Diffusion indexes; GDP growth rate; Inflation rate
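For the diffusion index side of the comparison, a bare-bones sketch of a Stock and Watson style factor forecast might look as follows. The number of factors, the single-lag structure and the function name are illustrative assumptions, not the specification used in the paper.

```python
# Minimal diffusion index forecast: extract the first few principal
# components of a standardised predictor panel and regress the target on
# its own lag plus the lagged factors.
import numpy as np

def diffusion_index_forecast(y, X, n_factors=3):
    """y: (T,) target, X: (T, N) predictor panel.  One-step-ahead forecast
    of y using period T-1 information."""
    Z = (X - X.mean(0)) / X.std(0)              # standardise the panel
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    F = Z @ Vt[:n_factors].T                    # estimated factors, (T, n_factors)
    # Regress y[t] on a constant, y[t-1] and the factors at t-1.
    W = np.column_stack([np.ones(len(y) - 1), y[:-1], F[:-1]])
    beta, *_ = np.linalg.lstsq(W, y[1:], rcond=None)
    w_last = np.concatenate(([1.0], [y[-1]], F[-1]))
    return w_last @ beta
```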
Forecast Combination and Model Averaging using Predictive Measures
We extend the standard approach to Bayesian forecast combination by forming the weights for the model averaged forecast from the predictive likelihood rather than the standard marginal likelihood. The use of predictive measures of fit offers greater protection against in-sample overfitting and improves forecast performance. For the predictive likelihood, we show analytically that the forecast weights have good large- and small-sample properties. This is confirmed in a simulation study and an application to forecasts of the Swedish inflation rate, where forecast combination using the predictive likelihood outperforms standard Bayesian model averaging using the marginal likelihood.
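A simplified sketch of predictive-likelihood weighting is given below: each model's weight is proportional to the product of its one-step-ahead plug-in Gaussian predictive densities over a hold-out window, rather than to the full-sample marginal likelihood. The plug-in densities, the hold-out length and the function name are assumptions for illustration, not the paper's exact construction.

```python
# Forecast combination weights from out-of-sample predictive densities.
import numpy as np
from scipy.stats import norm

def predictive_likelihood_weights(y, X_list, hold_out=20):
    """y: (T,) target; X_list: list of (T, k_m) regressor matrices, one per
    candidate model.  Returns combination weights summing to one."""
    T = len(y)
    log_pl = np.zeros(len(X_list))
    for m, X in enumerate(X_list):
        for t in range(T - hold_out, T):
            Xt, yt = X[:t], y[:t]                # estimation sample up to t-1
            beta, *_ = np.linalg.lstsq(Xt, yt, rcond=None)
            resid = yt - Xt @ beta
            sigma = np.sqrt(resid @ resid / max(t - X.shape[1], 1))
            log_pl[m] += norm.logpdf(y[t], loc=X[t] @ beta, scale=sigma)
    w = np.exp(log_pl - log_pl.max())
    return w / w.sum()
```

The combined forecast is then the weight-averaged vector of the individual model forecasts, exactly as with marginal-likelihood weights; only the weighting criterion changes.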
A Review of Forecasting Techniques for Large Data Sets
This paper provides a review focused on forecasting with statistical/econometric methods designed for dealing with large data sets.
Keywords: Macroeconomic forecasting; Factor models; Forecast combination; Principal components
A State Space Approach to Extracting the Signal from Uncertain Data
Most macroeconomic data are uncertain: they are estimates rather than perfect measures of underlying economic variables. One symptom of that uncertainty is the propensity of statistical agencies to revise their estimates in the light of new information or methodological advances. This paper sets out an approach for extracting the signal from uncertain data. It describes a two-step estimation procedure in which the history of past revisions is first used to estimate the parameters of a measurement equation describing the official published estimates. These parameters are then imposed in a maximum likelihood estimation of a state space model for the macroeconomic variable.
Keywords: Real-time data analysis; State space models; Data uncertainty; Data revisions
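The two-step procedure could be illustrated along the following lines: the revision history yields an estimate of the measurement-error variance of the early published figures, which is then held fixed while a simple state space model is estimated by maximum likelihood through the Kalman filter. The local-level specification and all names here are illustrative assumptions, not the paper's actual model.

```python
# Step 1: measurement-error variance from revisions (mature minus early
# estimates).  Step 2: hold it fixed and maximise the Kalman-filter
# likelihood of a local-level model for the underlying series.
import numpy as np
from scipy.optimize import minimize_scalar

def measurement_variance(early, mature):
    """Step 1: variance of revisions between first-published and mature data."""
    return np.var(mature - early, ddof=1)

def local_level_loglik(log_q, y, r):
    """Gaussian log likelihood of a local-level model via the Kalman filter,
    with state variance exp(log_q) and fixed measurement variance r."""
    q = np.exp(log_q)
    a, p = y[0], r                               # simple initialisation
    ll = 0.0
    for t in range(1, len(y)):
        p = p + q                                # predict
        f = p + r                                # innovation variance
        v = y[t] - a                             # innovation
        ll += -0.5 * (np.log(2 * np.pi * f) + v * v / f)
        k = p / f                                # update
        a = a + k * v
        p = (1 - k) * p
    return ll

def two_step_estimate(early, mature):
    r = measurement_variance(early, mature)      # step 1
    res = minimize_scalar(lambda lq: -local_level_loglik(lq, early, r),
                          bounds=(-10, 10), method='bounded')
    return r, np.exp(res.x)                      # (measurement var, state var)
```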
Toxic metal enrichment and boating intensity: sediment records of antifoulant copper in shallow lakes of eastern England
Tributyltin (TBT), an aqueous biocide derived from antifouling paint pollution, is known to have impacted coastal marine ecosystems and has been reported in the sediment of the Norfolk and Suffolk Broads, a network of rivers and shallow lakes in eastern England. In the marine environment, the 1987 TBT ban has resulted in expanded use of alternative biocides, raising the question of whether these products too have impacted the Broads ecosystem and freshwaters in general. Here we examine the lake sediment record in the Norfolk and Suffolk Broads for contamination by copper (Cu) (as an active biocide agent) and zinc (Zn) (as a component of booster biocides) to assess their occurrence and potential for causing environmental harm in freshwater ecosystems. We find that, after the introduction of leisure boating, there is a statistically significant difference in Cu enrichment between heavily and lightly boated sites, while no such difference exists prior to this time. At the heavily boated sites the onset of Cu enrichment coincides with a period of rapid increase in leisure boating. Such enrichment is maintained to the present day, with some evidence of continued increase. We conclude that Cu-based antifouling has measurably contaminated lakes exposed to boating, at concentrations high enough to cause ecological harm. Similar findings can be expected in other boated freshwater ecosystems elsewhere in the world.
Human Papillomavirus Genotype Distribution in Czech Women and Men with Diseases Etiologically Linked to HPV
The HPV prevalence and genotype distribution are important for estimating the impact of HPV-based cervical cancer screening and HPV vaccination on the incidence of diseases etiologically linked to HPV. The HPV genotype distribution varies across geographical regions; therefore, we investigated the type-specific HPV prevalence in Czech women and men with anogenital diseases. We analyzed 157 squamous cell carcinoma samples, 695 precancerous lesion samples and 64 cervical, vulvar and anal condylomata acuminata samples. HPV detection and typing were performed by PCR with GP5+/6+ primers, reverse line blot assay and sequencing. HPV types 6 and/or 11 were detected in 84% of condylomata acuminata samples. The prevalence of vaccine and related HPV types in patients with HPV-associated diseases in the Czech Republic is very high. We may assume that the implementation of routine vaccination against HPV would greatly reduce the burden of HPV-associated diseases in the Czech Republic.