Combining predictions from linear models when training and test inputs differ
Methods for combining predictions from different models in a supervised
learning setting must somehow estimate/predict the quality of a model's
predictions at unknown future inputs. Many of these methods (often implicitly)
make the assumption that the test inputs are identical to the training inputs,
which is seldom reasonable. By failing to take into account that prediction
will generally be harder for test inputs that did not occur in the training
set, this leads to the selection of too complex models. Based on a novel,
unbiased expression for KL divergence, we propose XAIC and its special case
FAIC as versions of AIC intended for prediction that use different degrees of
knowledge of the test inputs. Both methods substantially differ from and may
outperform all the known versions of AIC even when the training and test inputs
are iid, and are especially useful for deterministic inputs and under covariate
shift. Our experiments on linear models suggest that if the test and training
inputs differ substantially, then XAIC and FAIC predictively outperform AIC,
BIC and several other methods including Bayesian model averaging. Comment: 12 pages, 2 figures. To appear in Proceedings of the 30th Conference
on Uncertainty in Artificial Intelligence (UAI2014). This version includes
the supplementary material (regularity assumptions, proofs
Catching-up and inflation in Europe: Balassa-Samuelson, Engel's Law and other Culprits
This study analyses the impact of economic catching-up on annual inflation rates in the European Union with a special focus on the new member countries of Central and Eastern Europe. Using an array of estimation methods, we show that the Balassa-Samuelson effect is not an important driver of inflation rates. By contrast, we find that the initial price level and regulated prices strongly affect inflation outcomes in a nonlinear manner and that the extension of Engel's Law may hold during periods of very fast growth. We interpret these results as a sign that price level convergence comes from goods, market and non-market service prices. Furthermore, we find that the Phillips curve flattens with a decline in the inflation rate, that inflation is more persistent and that commodity prices have a stronger effect on inflation in a higher inflation environment. Keywords: European Union, inflation, Balassa-Samuelson, real convergence, catching up, Bayesian model average, non-linearity.
Almost the Best of Three Worlds: Risk, Consistency and Optional Stopping for the Switch Criterion in Nested Model Selection
We study the switch distribution, introduced by Van Erven et al. (2012),
applied to model selection and subsequent estimation. While switching was known
to be strongly consistent, here we show that it achieves minimax optimal
parametric risk rates up to a log log n factor when comparing two nested
exponential families, partially confirming a conjecture by Lauritzen (2012) and
Cavanaugh (2012) that switching behaves asymptotically like the Hannan-Quinn
criterion. Moreover, like Bayes factor model selection but unlike standard
significance testing, when one of the models represents a simple hypothesis,
the switch criterion defines a robust null hypothesis test, meaning that its
Type-I error probability can be bounded irrespective of the stopping rule.
Hence, switching is consistent, insensitive to optional stopping and almost
minimax risk optimal, showing that, Yang's (2005) impossibility result
notwithstanding, it is possible to `almost' combine the strengths of AIC and
Bayes factor model selection. Comment: To appear in Statistica Sinic
Why has China grown so fast? The role of physical and human capital formation
Cross-province growth regressions for China are estimated for the reform period. Two research questions are asked. Can the regressions help us to understand why China as a whole has grown so fast? What types of investment matter for China's growth? We address the problem of model uncertainty by adopting two approaches to model selection to consider a wide range of candidate predictors of growth. Starting from the baseline equation, the growth impact of physical and human capital is examined using panel data techniques. Both forms of capital promote economic growth. "Investment in innovation" and private investment are found to be particularly important. Secondary school enrolment contributes to growth, and higher education enrolment even more so.
Inconsistency of Bayesian Inference for Misspecified Linear Models, and a Proposal for Repairing It
We empirically show that Bayesian inference can be inconsistent under
misspecification in simple linear regression problems, both in a model
averaging/selection and in a Bayesian ridge regression setting. We use the
standard linear model, which assumes homoskedasticity, whereas the data are
heteroskedastic, and observe that the posterior puts its mass on ever more
high-dimensional models as the sample size increases. To remedy the problem, we
equip the likelihood in Bayes' theorem with an exponent called the learning
rate, and we propose the Safe Bayesian method to learn the learning rate from
the data. SafeBayes tends to select small learning rates as soon as the standard
posterior is not `cumulatively concentrated', and its results on our data are
quite encouraging. Comment: 70 pages, 20 figure
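The idea of equipping the likelihood with a learning-rate exponent can be illustrated with a minimal sketch. It assumes a conjugate Gaussian ridge setting with known noise variance, and the function name `tempered_posterior` and the parameters `sigma2` and `tau2` are hypothetical choices made for illustration. It shows only the tempering step for a fixed learning rate `eta`, not the SafeBayes procedure for learning that rate from the data.

```python
import numpy as np

def tempered_posterior(X, y, eta, sigma2=1.0, tau2=1.0):
    """Posterior over regression coefficients with the likelihood raised to the power eta.

    Assumptions (not from the abstract): prior beta ~ N(0, tau2 * I),
    Gaussian noise with known variance sigma2.
    """
    d = X.shape[1]
    # Raising a Gaussian likelihood to the power eta scales its precision by eta.
    precision = eta * (X.T @ X) / sigma2 + np.eye(d) / tau2
    cov = np.linalg.inv(precision)
    mean = cov @ (eta * (X.T @ y) / sigma2)
    return mean, cov

# Synthetic data, purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=50)

mean_std, _ = tempered_posterior(X, y, eta=1.0)   # standard Bayes
mean_safe, _ = tempered_posterior(X, y, eta=0.1)  # down-weighted likelihood
```

With `eta=1.0` this reduces to the ordinary ridge posterior. A smaller `eta` makes the data count for less relative to the prior, so the posterior mean is pulled toward zero.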
Is God in the Details? A Reexamination of the Role of Religion in Economic Growth
Barro and McCleary (2003) is a key research contribution in the new literature exploring the macroeconomic effects of religious beliefs. This paper represents an effort to evaluate the strength of their claims. We evaluate their results in terms of replicability and robustness. Overall, their analysis generally meets the standard of statistical replicability, though not perfectly. On the other hand, we do not find that their results are robust to changes in their baseline statistical specification. When model averaging methods are employed to integrate information across alternative statistical specifications, little evidence survives that religious variables help to predict cross-country income differences. Keywords: Economic Growth, Religion, Model Uncertainty
Online Learning of k-CNF Boolean Functions
This paper revisits the problem of learning a k-CNF Boolean function from
examples in the context of online learning under the logarithmic loss. In doing
so, we give a Bayesian interpretation to one of Valiant's celebrated PAC
learning algorithms, which we then build upon to derive two efficient, online,
probabilistic, supervised learning algorithms for predicting the output of an
unknown k-CNF Boolean function. We analyze the loss of our methods, and show
that the cumulative log-loss can be upper bounded, ignoring logarithmic
factors, by a polynomial function of the size of each example. Comment: 20 LaTeX pages. 2 Algorithms. Some Theorem
Why has China Grown so Fast? The Role of Structural Change
Can others learn from China's remarkable growth rate? We explore some indirect determinants of China's growth success including the degree of openness, institutional change and sectoral change, based on a cross-province dataset. Our methodology is the informal growth regression, which permits the introduction of some explanatory variables that represent the underlying as well as the proximate causes of growth. We first address the problem of model uncertainty by adopting two approaches to model selection, Bayesian Model Averaging and the automated General-to-Specific approach, to consider a wide range of candidate predictors of growth. Then variables flagged as being important by these procedures are used in formulating our models, in which the contribution of factors behind the proximate determinants is examined using panel data system GMM. All three forms of structural change (relative expansion of the trade sector, of the private sector, and of the non-agricultural sector) are found to raise the growth rate. Moreover, structural change in all three dimensions was rapid over the study period. Each change primarily represents an improvement in the efficiency of the economy, moving it towards its production frontier. We conclude that such improvements in productive efficiency have been an important part of the explanation for China's fast growth. Keywords: Economic growth, Structural change, Openness, Institutional change, China
Almost the best of three worlds: Risk, consistency and optional stopping for the switch criterion in nested model selection
We study the switch distribution, introduced by van Erven, Grünwald and De Rooij (2012), applied to model selection and subsequent estimation. While switching was known to be strongly consistent, here we show that it achieves minimax optimal parametric risk rates up to a log log n factor when comparing two nested exponential families, partially confirming a conjecture by Lauritzen (2012) and Cavanaugh (2012) that switching behaves asymptotically like the Hannan-Quinn criterion. Moreover, like Bayes factor model selection, but unlike standard significance testing, when one of the models represents a simple hypothesis, the switch criterion defines a robust null hypothesis test, meaning that its Type-I error probability can be bounded irrespective of the stopping rule. Hence, switching is consistent, insensitive to optional stopping and almost minimax risk optimal, showing that, Yang's (2005) impossibility result notwithstanding, it is possible to `almost' combine the strengths of AIC and Bayes factor model selection.