18,000 research outputs found

    Robust Bayesian Variable Selection for Gene-Environment Interactions

    Get PDF
    Gene-environment (G×E) interactions have important implications to elucidate the etiology of complex diseases beyond the main genetic and environmental effects. Outliers and data contamination in disease phenotypes of G×E studies have been commonly encountered, leading to the development of a broad spectrum of robust penalization methods. Nevertheless, within the Bayesian framework, the issue has not been taken care of in existing studies. We develop a robust Bayesian variable selection method for G×E interaction studies. The proposed Bayesian method can effectively accommodate heavy-tailed errors and outliers in the response variable while conducting variable selection by accounting for structural sparsity. In particular, the spike-and-slab priors have been imposed on both individual and group levels to identify important main and interaction effects. An efïŹcient Gibbs sampler has been developed to facilitate fast computation. The Markov chain Monte Carlo algorithms of the proposed and alternative methods are efïŹciently implemented in C++

    Interaction Analysis of Repeated Measure Data

    Get PDF
    Extensive penalized variable selection methods have been developed in the past two decades for analyzing high dimensional omics data, such as gene expressions, single nucleotide polymorphisms (SNPs), copy number variations (CNVs) and others. However, lipidomics data have been rarely investigated by using high dimensional variable selection methods. This package incorporates our recently developed penalization procedures to conduct interaction analysis for high dimensional lipidomics data with repeated measurements. The core module of this package is developed in C++. The development of this software package and the associated statistical methods have been partially supported by an Innovative Research Award from Johnson Cancer Research Center, Kansas State University

    The Bayesian Regularized Quantile Varying Coefficient Model

    Full text link
    The quantile varying coefficient (VC) model can flexibly capture dynamical patterns of regression coefficients. In addition, due to the quantile check loss function, it is robust against outliers and heavy-tailed distributions of the response variable, and can provide a more comprehensive picture of modeling via exploring the conditional quantiles of the response variable. Although extensive studies have been conducted to examine variable selection for the high-dimensional quantile varying coefficient models, the Bayesian analysis has been rarely developed. The Bayesian regularized quantile varying coefficient model has been proposed to incorporate robustness against data heterogeneity while accommodating the non-linear interactions between the effect modifier and predictors. Selecting important varying coefficients can be achieved through Bayesian variable selection. Incorporating the multivariate spike-and-slab priors further improves performance by inducing exact sparsity. The Gibbs sampler has been derived to conduct efficient posterior inference of the sparse Bayesian quantile VC model through Markov chain Monte Carlo (MCMC). The merit of the proposed model in selection and estimation accuracy over the alternatives has been systematically investigated in simulation under specific quantile levels and multiple heavy-tailed model errors. In the case study, the proposed model leads to identification of biologically sensible markers in a non-linear gene-environment interaction study using the NHS data

    An information-flow-based model with dissipation, saturation and direction for active pathway inference

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Biological systems process the genetic information and environmental signals through pathways. How to map the pathways systematically and efficiently from high-throughput genomic and proteomic data is a challenging open problem. Previous methods design different heuristics but do not describe explicitly the behaviours of the information flow.</p> <p>Results</p> <p>In this study, we propose new concepts of dissipation, saturation and direction to decipher the information flow behaviours in the pathways and thereby infer the biological pathways from a given source to its target. This model takes into account explicitly the common features of the information transmission and provides a general framework to model the biological pathways. It can incorporate different types of bio-molecular interactions to infer the signal transduction pathways and interpret the expression quantitative trait loci (eQTL) associations. The model is formulated as a linear programming problem and thus is solved efficiently. Experiments on the real data of yeast indicate that the reproduced pathways are highly consistent with the current knowledge.</p> <p>Conclusions</p> <p>Our model explicitly treats the biological pathways as information flows with dissipation, saturation and direction. The effective applications suggest that the three new concepts may be valid to describe the organization rules of biological pathways. The deduced linear programming should be a promising tool to infer the various biological pathways from the high-throughput data.</p

    Sparse Group Variable Selection for Gene-Environment Interactions in the Longitudinal Stud

    Get PDF
    Recently, regularized variable selection has emerged as a powerful tool to iden- tify and dissect gene-environment interactions. Nevertheless, in longitudinal studies with high di- mensional genetic factors, regularization methods for G×E interactions have not been systemati- cally developed. In this package, we provide the implementation of sparse group variable selec- tion, based on both the quadratic inference function (QIF) and generalized estimating equa- tion (GEE), to accommodate the bi-level selection for longitudinal G×E studies with high dimen- sional genomic features. Alternative methods conducting only the group or individual level se- lection have also been included. The core modules of the package have been developed in C++
    • 

    corecore