Bayesian Model Selection in Complex Linear Systems, as Illustrated in Genetic Association Studies
Motivated by examples from genetic association studies, this paper considers
the model selection problem in a general complex linear model system and in a
Bayesian framework. We discuss formulating model selection problems and
incorporating context-dependent a priori information through different
levels of prior specifications. We also derive analytic Bayes factors and their
approximations to facilitate model selection and discuss their theoretical and
computational properties. We demonstrate our Bayesian approach based on an
implemented Markov Chain Monte Carlo (MCMC) algorithm in simulations and a real
data application of mapping tissue-specific eQTLs. Our novel results on Bayes
factors provide a general framework to perform efficient model comparisons in
complex linear model systems.
Inference on Treatment Effects After Selection Amongst High-Dimensional Controls
We propose robust methods for inference on the effect of a treatment variable
on a scalar outcome in the presence of very many controls. Our setting is a
partially linear model with possibly non-Gaussian and heteroscedastic
disturbances. Our analysis allows the number of controls to be much larger than
the sample size. To make informative inference feasible, we require the model
to be approximately sparse; that is, we require that the effect of confounding
factors can be controlled for up to a small approximation error by conditioning
on a relatively small number of controls whose identities are unknown. The
latter condition makes it possible to estimate the treatment effect by
selecting approximately the right set of controls. We develop a novel
estimation and uniformly valid inference method for the treatment effect in
this setting, called the "post-double-selection" method. Our results apply to
Lasso-type methods used for covariate selection as well as to any other model
selection method that is able to find a sparse model with good approximation
properties.
The main attractive feature of our method is that it allows for imperfect
selection of the controls and provides confidence intervals that are valid
uniformly across a large class of models. In contrast, standard post-model
selection estimators fail to provide uniform inference even in simple cases
with a small, fixed number of controls. Thus our method resolves the problem of
uniform inference after model selection for a large, interesting class of
models. We illustrate the use of the developed methods with numerical
simulations and an application to the effect of abortion on crime rates.
Regional Heterogeneity and U.S. Presidential Elections
This paper develops a recursive model of voter turnout and voting outcomes at the U.S. county level to investigate the socioeconomic determinants of recent U.S. presidential elections. It exploits cross-section variations across U.S. counties and investi