119,304 research outputs found

    Application of an iterative Bayesian variable selection method in a genome-wide association study of rheumatoid arthritis

    Genome-wide association studies usually involve several hundred thousand single-nucleotide polymorphisms (SNPs). Conventional approaches face challenges when there is an enormous number of SNPs but a relatively small number of samples and, in some cases, are not feasible. We introduce here an iterative Bayesian variable selection method that provides a unique tool for association studies with a large number of SNPs (p) but a relatively small sample size (n). We applied this method to the simulated case-control sample provided by the Genetic Analysis Workshop 15 and compared its performance with that of a stepwise variable selection method. We demonstrated that the results of iterative Bayesian variable selection applied when p » n are comparable to those of stepwise variable selection applied when n » p. When n > p, iterative Bayesian variable selection performs better than stepwise variable selection does.

    Application of Bayesian classification with singular value decomposition method in genome-wide association studies

    To analyze multiple single-nucleotide polymorphisms simultaneously when the number of markers is much larger than the number of studied individuals, as is the case in genome-wide association studies (GWAS), we developed the iterative Bayesian variable selection method and successfully applied it to the simulated rheumatoid arthritis data provided by the Genetic Analysis Workshop 15 (GAW15). One drawback of applying our iterative Bayesian variable selection method is the relatively long running time required to evaluate GWAS data. To improve computing speed, we recently developed a Bayesian classification with singular value decomposition (BCSVD) method. We have applied the BCSVD method here to the rheumatoid arthritis data distributed by GAW16 Problem 1 and demonstrated that the BCSVD method works well for analyzing GWAS data.

    Biological and Practical Implications of Genome-Wide Association Study of Schizophrenia Using Bayesian Variable Selection

    Genome-wide association studies (GWAS) have identified over 100 loci associated with schizophrenia. Most of these studies test genetic variants for association one at a time. In this study, we performed a GWAS of the molecular genetics of schizophrenia (MGS) dataset with 5334 subjects using the multivariate Bayesian variable selection (BVS) method Posterior Inference via Model Averaging and Subset Selection (piMASS) and compared our results with the previous univariate analysis of the MGS dataset. We showed that piMASS can improve the power to detect schizophrenia-associated SNPs, potentially leading to new discoveries from existing data without increasing the sample size. We tested SNPs in groups to allow for local additive effects and used a permutation test to determine statistical significance in order to compare our results with the univariate method. The previous univariate analysis of the MGS dataset revealed no genome-wide significant loci. Using the same dataset, we identified a single region that exceeded genome-wide significance. The result was replicated using an independent Swedish Schizophrenia Case–Control Study (SSCCS) dataset. Based on the SZGR 2.0 database, we found 63 SNPs from the best-performing regions that map to 27 genes known to be associated with schizophrenia. Overall, we demonstrated that piMASS could discover association signals that would otherwise require a much larger sample size. An important implication of our study is that reanalyzing published datasets with BVS methods like piMASS might provide more power to discover new risk variants for many diseases without new sample collection, ascertainment, and genotyping.
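
The abstract above mentions using a permutation test to assess significance. As a rough illustration of the general idea only, the sketch below computes a two-sided permutation p-value for a difference in group means. The function name `permutation_p_value`, the choice of test statistic, and the toy genotype counts are assumptions made for this example; the abstract does not describe the statistic or the permutation scheme the study actually used.

```python
import random


def permutation_p_value(cases, controls, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means.

    Hypothetical helper for illustration; not taken from the study.
    """
    rng = random.Random(seed)
    n_cases = len(cases)
    n_controls = len(controls)
    observed = abs(sum(cases) / n_cases - sum(controls) / n_controls)

    pooled = list(cases) + list(controls)
    hits = 0
    for _ in range(n_perm):
        # Shuffling the pooled values breaks any link between value and label.
        rng.shuffle(pooled)
        perm_cases = pooled[:n_cases]
        perm_controls = pooled[n_cases:]
        stat = abs(sum(perm_cases) / n_cases - sum(perm_controls) / n_controls)
        if stat >= observed:
            hits += 1

    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


# Toy minor-allele counts (0, 1, or 2) for one SNP; invented for the example.
toy_cases = [2, 2, 1, 2, 2, 1, 2, 2]
toy_controls = [0, 1, 0, 0, 1, 0, 0, 1]
print(permutation_p_value(toy_cases, toy_controls))
```

The smallest p-value this procedure can report is 1 / (n_perm + 1), so the number of permutations bounds the attainable significance level.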

    Variable Selection with False Discovery Control

    Technological advances that allow routine identification of high-dimensional risk factors have led to high demand for statistical techniques that enable full utilization of these rich sources of information for genome-wide association studies (GWAS). Variable selection for censored outcome data as well as control of false discoveries (i.e., inclusion of irrelevant variables) in the presence of high-dimensional predictors present serious challenges. In the context of survival analysis with high-dimensional covariates, this paper develops a computationally feasible method for building general risk prediction models while controlling false discoveries. We propose a high-dimensional variable selection method that incorporates stability selection to control false discoveries. Comparisons between the proposed method and the commonly used univariate and Lasso approaches for variable selection reveal that the proposed method yields fewer false discoveries. The proposed method is applied to study the associations of 2,339 common single-nucleotide polymorphisms (SNPs) with overall survival among cutaneous melanoma (CM) patients. The results confirm that BRCA2 pathway SNPs are likely to be associated with overall survival, as reported in previous literature. Moreover, we have identified several new Fanconi anemia (FA) pathway SNPs that are likely to modulate survival of CM patients.
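
The abstract above builds on stability selection. A minimal sketch of the resampling idea follows: a selector is run on many random half-samples, and each variable is scored by how often it is chosen. Everything specific here is an assumption for illustration: the function names, the uncensored continuous outcome, and the base selector (top-k absolute marginal correlation), which is swapped in for simplicity and is not the paper's survival-model selector.

```python
import random


def _abs_corr(x, y):
    """Absolute Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def selection_frequencies(X, y, top_k, n_subsamples=100, seed=0):
    """Fraction of half-samples in which each column of X is selected.

    X is a list of rows; the base selector keeps the top_k columns by
    absolute marginal correlation with y (an illustrative stand-in).
    """
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_subsamples):
        rows = rng.sample(range(n), n // 2)  # half-sample without replacement
        y_sub = [y[i] for i in rows]
        scores = [_abs_corr([X[i][j] for i in rows], y_sub) for j in range(p)]
        keep = sorted(range(p), key=lambda j: scores[j], reverse=True)[:top_k]
        for j in keep:
            counts[j] += 1
    return [c / n_subsamples for c in counts]
```

Variables whose selection frequency exceeds a chosen threshold are retained; the threshold is what ties the procedure to a bound on false discoveries, and choosing it is outside the scope of this sketch.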

    A Penalized Multi-trait Mixed Model for Association Mapping in Pedigree-based GWAS

    In genome-wide association studies (GWAS), penalization is an important approach for identifying genetic markers associated with a trait, while the mixed model is successful in accounting for a complicated dependence structure among samples. The penalized linear mixed model is therefore a tool that combines the advantages of the penalization approach and the linear mixed model. In this study, a GWAS with multiple highly correlated traits is analyzed. For a GWAS with multiple quantitative traits that are highly correlated, analyzing the traits marginally inevitably loses some essential information shared among the traits. We propose penalized-MTMM, a penalized multivariate linear mixed model that allows for both within-trait and between-trait variance components simultaneously for multiple traits. The proposed penalized-MTMM estimates variance components using an AI-REML method and conducts variable selection and point estimation simultaneously using group MCP and sparse group MCP. The best linear unbiased predictor (BLUP) is used to obtain predicted values, and the Pearson correlations between predicted values and their corresponding observations are used to evaluate prediction performance. Both the prediction and selection performance of the proposed approach, and its comparison with the uni-trait penalized-LMM, are evaluated through simulation studies. We apply the proposed approach to GWAS data from Genetic Analysis Workshop (GAW) 18.
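
The abstract above relies on group MCP and sparse group MCP penalties. As a building block only, the sketch below shows the scalar minimax concave penalty (MCP) and its thresholding rule for a single coefficient under a unit-scaled quadratic loss. The function names are hypothetical, and the grouped variants and the mixed-model fitting described in the abstract are not implemented here.

```python
def mcp_penalty(beta, lam, gamma):
    """Scalar MCP value; assumes lam >= 0 and gamma > 1."""
    b = abs(beta)
    if b <= gamma * lam:
        # Lasso-like near zero, with the penalty rate tapering off as b grows.
        return lam * b - b * b / (2.0 * gamma)
    # Beyond gamma * lam the penalty is flat, so large effects are not shrunk.
    return 0.5 * gamma * lam * lam


def mcp_threshold(z, lam, gamma):
    """Minimizer over b of 0.5 * (z - b)**2 + mcp_penalty(b, lam, gamma)."""
    if abs(z) > gamma * lam:
        return z  # large inputs pass through unshrunk
    shrunk = max(abs(z) - lam, 0.0)
    if shrunk == 0.0:
        return 0.0  # small inputs are set exactly to zero
    sign = 1.0 if z > 0 else -1.0
    return sign * shrunk / (1.0 - 1.0 / gamma)


for z in (0.5, 2.0, 5.0):
    print(z, mcp_threshold(z, lam=1.0, gamma=3.0))
```

Compared with the Lasso's soft threshold, which shrinks every nonzero estimate by lam, this rule leaves inputs larger than gamma * lam untouched, which is the usual motivation for preferring MCP when some effects are large.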

    Case-control genome-wide association study of rheumatoid arthritis from Genetic Analysis Workshop 16 using penalized orthogonal-components regression-linear discriminant analysis

    Currently, genome-wide association studies (GWAS) are conducted by collecting a massive number of single-nucleotide polymorphisms (SNPs; i.e., large p) for a relatively small number of individuals (i.e., small n), and associations are made between clinical phenotypes and genetic variation one SNP at a time. Univariate association approaches like this ignore the linkage disequilibrium between SNPs in regions of low recombination, which results in low reliability of candidate gene identification. Here we propose to improve the case-control GWAS approach by implementing linear discriminant analysis (LDA) through penalized orthogonal-components regression (POCRE), a newly developed variable selection method for large p, small n data. The proposed POCRE-LDA method was applied to the Genetic Analysis Workshop 16 case-control data for rheumatoid arthritis (RA). In addition to the two regions on chromosomes 6 and 9 previously associated with RA by GWAS, we identified SNPs on chromosomes 10 and 18 as potential candidates for further investigation.

    Multiple SNP Set Analysis for Genome-Wide Association Studies Through Bayesian Latent Variable Selection

    The power of genome-wide association studies (GWAS) for mapping complex traits with single-SNP analysis may be undermined by modest SNP effect sizes, unobserved causal SNPs, correlation among adjacent SNPs, and SNP-SNP interactions. Alternative approaches for testing the association between a single SNP-set and individual phenotypes have been shown to be promising for improving the power of GWAS. We propose a Bayesian latent variable selection (BLVS) method to simultaneously model the joint association mapping between a large number of SNP-sets and complex traits. Compared to single SNP-set analysis, such joint association mapping not only accounts for the correlation among SNP-sets, but is also capable of detecting causal SNP-sets that are marginally uncorrelated with traits. The spike-slab prior assigned to the effects of SNP-sets can greatly reduce the dimension of effective SNP-sets, while speeding up computation. An efficient MCMC algorithm is developed. Simulations demonstrate that BLVS outperforms several competing variable selection methods in some important scenarios.
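
The abstract above credits a spike-slab prior with reducing the number of effective SNP-sets. The sketch below draws effects from one common form of that prior, a point mass at zero mixed with a normal slab, to show why most effects come out exactly zero. The function name and the parameters `inclusion_prob` and `slab_sd` are assumptions for illustration; the abstract does not specify the exact prior form, and the BLVS sampler itself is not reproduced here.

```python
import random


def draw_spike_slab_effects(n_sets, inclusion_prob, slab_sd, seed=0):
    """Draw n_sets effects from a point-mass spike-and-slab prior.

    Each effect is zero with probability 1 - inclusion_prob (the spike)
    and Normal(0, slab_sd**2) otherwise (the slab). Illustrative only.
    """
    rng = random.Random(seed)
    effects = []
    for _ in range(n_sets):
        if rng.random() < inclusion_prob:
            effects.append(rng.gauss(0.0, slab_sd))  # slab: nonzero effect
        else:
            effects.append(0.0)  # spike: effect excluded from the model
    return effects


effects = draw_spike_slab_effects(n_sets=1000, inclusion_prob=0.05, slab_sd=1.0)
active = [i for i, e in enumerate(effects) if e != 0.0]
print(len(active), "of", len(effects), "SNP-set effects are nonzero under this draw")
```

Because excluded effects are exactly zero, a sampler only needs to update the currently active subset at each iteration, which is one plausible reading of the computational saving the abstract mentions.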