Bayesian Model Selection in Complex Linear Systems, as Illustrated in Genetic Association Studies

Author: Carvalho, Dawid, DiCiccio, Dimas, Ding, Flutre, Fridley, Guan, Howie, Johnson, Johnson, Kass, Liang, Mitchell, Peng, Raftery, Rothman, Schwarz, Scott-Boyer, Servin, Stephens, Tibshirani, Tibshirani, Veyrieras, Wakefield, Wen, Wilson, Wu, Xu, Yuan
Publication venue: 'Wiley'
Publication date: 03/09/2013

Motivated by examples from genetic association studies, this paper considers the model selection problem in a general complex linear model system within a Bayesian framework. We discuss how to formulate model selection problems and how to incorporate context-dependent a priori information through different levels of prior specification. We also derive analytic Bayes factors and their approximations to facilitate model selection, and we discuss their theoretical and computational properties. We demonstrate our Bayesian approach, implemented as a Markov Chain Monte Carlo (MCMC) algorithm, in simulations and in a real data application of mapping tissue-specific eQTLs. Our novel results on Bayes factors provide a general framework for performing efficient model comparisons in complex linear model systems.
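To make the idea of comparing linear models with an approximate Bayes factor concrete, here is a minimal sketch. It does not reproduce the analytic Bayes factors derived in the paper; it swaps in the standard BIC (Schwarz) approximation, log BF ≈ -(BIC_1 - BIC_0)/2, for a single-covariate model against an intercept-only model. The function names `bic_linear` and `approx_log_bf` and the simulated data are assumptions made for illustration only.

```python
import numpy as np

def bic_linear(y, X=None):
    """BIC of a Gaussian linear model with intercept (hypothetical helper).

    X is an optional covariate array; X=None means intercept only.
    """
    n = len(y)
    design = np.ones((n, 1)) if X is None else np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ coef) ** 2))
    k = design.shape[1]  # number of regression coefficients
    return n * np.log(rss / n) + k * np.log(n)

def approx_log_bf(y, X):
    """BIC approximation to the log Bayes factor of 'X has an effect' vs. the null."""
    return -0.5 * (bic_linear(y, X) - bic_linear(y))

# Simulated data (an assumption): one covariate with a real effect, one without.
rng = np.random.default_rng(0)
n = 300
x_signal = rng.standard_normal(n)
x_noise = rng.standard_normal(n)
y = 0.5 * x_signal + rng.standard_normal(n)

log_bf_signal = approx_log_bf(y, x_signal)
log_bf_noise = approx_log_bf(y, x_noise)
print(f"log BF (signal covariate): {log_bf_signal:.2f}")
print(f"log BF (noise covariate):  {log_bf_noise:.2f}")
```

A positive log Bayes factor favours the model that includes the covariate. The BIC approximation ignores prior information entirely, which is precisely what the prior specifications discussed in the abstract are meant to supply.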

arXiv.org e-Print Archive

CiteSeerX

Crossref

PubMed Central

Deep Blue Documents at the University of Michigan

Variable Screening for High Dimensional Time Series

Author: Yousuf Kashif
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2018

Variable selection is a widely studied problem in high dimensional statistics, primarily because estimating the precise relationship between the covariates and the response is of great importance in many scientific disciplines. However, most of the theory and methods developed toward this goal for the linear model assume i.i.d. sub-Gaussian covariates and errors. This paper analyzes the theoretical properties of Sure Independence Screening (SIS) (Fan and Lv [J. R. Stat. Soc. Ser. B Stat. Methodol. 70 (2008) 849-911]) for high dimensional linear models with dependent and/or heavy tailed covariates and errors. We also introduce a generalized least squares screening (GLSS) procedure which utilizes the serial correlation present in the data. By utilizing this serial correlation when estimating our marginal effects, GLSS is shown to outperform SIS in many cases. For both procedures we prove sure screening properties, which depend on the moment conditions and the strength of dependence in the error and covariate processes, amongst other factors. We also analyze the combination of these screening procedures with the adaptive Lasso. Dependence is quantified by functional dependence measures (Wu [Proc. Natl. Acad. Sci. USA 102 (2005) 14150-14154]), and the results rely on the use of Nagaev-type and exponential inequalities for dependent random variables. We also conduct simulations to demonstrate the finite sample performance of these procedures, and include a real data application of forecasting the US inflation rate. Comment: Published in the Electronic Journal of Statistics (https://projecteuclid.org/euclid.ejs/1519700498)
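As a rough illustration of the screening step described above, the following sketch ranks covariates by absolute marginal correlation with the response and keeps the top d, which is the basic idea behind SIS. It covers plain SIS only, not the GLSS procedure, and the simulated data are i.i.d. Gaussian rather than a dependent time series. The function name `sis_screen`, the choice d = 20, and the data-generating setup are assumptions made for illustration.

```python
import numpy as np

def sis_screen(X, y, d):
    """Return indices of the d covariates with the largest absolute
    marginal correlation with y (hypothetical helper)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    norms = np.linalg.norm(Xc, axis=0)
    norms[norms == 0] = np.inf  # constant columns get correlation 0
    corr = (Xc.T @ yc) / (norms * np.linalg.norm(yc))
    return np.argsort(-np.abs(corr))[:d]

# Simulated high dimensional data (an assumption): p >> n, three true signals.
rng = np.random.default_rng(0)
n, p = 200, 1000
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [3.0, -2.5, 2.0]
y = X @ beta + rng.standard_normal(n)

selected = sis_screen(X, y, d=20)
print(sorted(selected.tolist()))
```

In practice the retained set would then be passed to a second-stage estimator such as the adaptive Lasso, as the abstract mentions.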

arXiv.org e-Print Archive

Crossref

Extensions of stability selection using subsamples of observations and covariates

Author: Beinrucker Andre, Blanchard Gilles, Dogan Ürün
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 08/08/2015

We introduce extensions of stability selection, a method to stabilise variable selection methods introduced by Meinshausen and Bühlmann (J R Stat Soc 72:417-473, 2010). We propose to apply a base selection method repeatedly to random observation subsamples and covariate subsets under scrutiny, and to select covariates based on their selection frequency. We analyse the effects and benefits of these extensions. Our analysis generalizes the theoretical results of Meinshausen and Bühlmann (J R Stat Soc 72:417-473, 2010) from the case of half-samples to subsamples of arbitrary size. We study theoretically the effect of taking random covariate subsets using a simplified score model. Finally, we validate these extensions in numerical experiments on both synthetic and real datasets, and compare the results in detail to the original stability selection method. Comment: Accepted for publication in Statistics and Computing
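The procedure outlined above (apply a base selector repeatedly to random observation subsamples and random covariate subsets, then keep covariates with a high selection frequency) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the base selector `top_k_by_correlation`, the subsample fractions, the threshold of 0.8, and the simulated data are all assumptions chosen for the example.

```python
import numpy as np

def top_k_by_correlation(X, y, k):
    """Hypothetical base selector: the k columns most correlated with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    score = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) + 1e-12)
    return np.argsort(-score)[:k]

def stability_selection(X, y, n_rounds=200, obs_frac=0.5, cov_frac=0.5,
                        k=5, threshold=0.8, seed=0):
    """Selection frequencies over random row subsamples and column subsets."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    m = max(2, int(obs_frac * n))   # observation subsample size
    q = max(1, int(cov_frac * p))   # covariate subset size
    selected = np.zeros(p)
    drawn = np.zeros(p)
    for _ in range(n_rounds):
        rows = rng.choice(n, size=m, replace=False)
        cols = rng.choice(p, size=q, replace=False)
        picked = top_k_by_correlation(X[np.ix_(rows, cols)], y[rows], min(k, q))
        drawn[cols] += 1
        selected[cols[picked]] += 1
    # Frequency is conditional on the covariate having been in the subset.
    freq = np.divide(selected, drawn, out=np.zeros(p), where=drawn > 0)
    return np.flatnonzero(freq >= threshold), freq

# Simulated data (an assumption): 50 covariates, the first three are relevant.
rng = np.random.default_rng(1)
n, p = 200, 50
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [4.0, -4.0, 4.0]
y = X @ beta + rng.standard_normal(n)

stable, freq = stability_selection(X, y)
print("stable covariates:", stable.tolist())
```

Dividing by the number of times a covariate was drawn, rather than by the number of rounds, is one possible design choice for handling covariate subsets; the paper may define the frequency differently.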

arXiv.org e-Print Archive

HAL Descartes

Hal-Diderot