A Weighted U Statistic for Genetic Association Analyses of Sequencing Data
With advancements in next-generation sequencing technology, massive amounts
of sequencing data are being generated, offering a great opportunity to
comprehensively investigate the role of rare variants in the genetic etiology
of complex diseases. Nevertheless, this poses a great challenge for the
statistical analysis of high-dimensional sequencing data. The association
analyses based on traditional statistical methods suffer substantial power loss
because of the low frequency of genetic variants and the extremely high
dimensionality of the data. We developed a weighted U statistic, referred to as
WU-SEQ, for the high-dimensional association analysis of sequencing data. Based
on a non-parametric U statistic, WU-SEQ makes no assumption of the underlying
disease model and phenotype distribution, and can be applied to a variety of
phenotypes. Through simulation studies and an empirical study, we showed that
WU-SEQ outperformed a commonly used SKAT method when the underlying assumptions
were violated (e.g., the phenotype followed a heavy-tailed distribution). Even
when the assumptions were satisfied, WU-SEQ still attained comparable
performance to SKAT. Finally, we applied WU-SEQ to sequencing data from the
Dallas Heart Study (DHS) and detected an association between ANGPTL4 and very
low density lipoprotein cholesterol.
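As a rough illustration of the idea behind a weighted U statistic, the sketch below averages, over all pairs of subjects, a phenotype similarity multiplied by a genetic-similarity weight. It is a generic sketch, not the WU-SEQ definition: the function name `weighted_u`, the rank-based phenotype scores, and the cross-product genetic weights are all assumptions made for illustration, since the abstract does not specify the kernel or the weight function.

```python
import numpy as np

def weighted_u(genotypes, phenotype, variant_weights=None):
    """Generic weighted U statistic: mean over pairs i != j of w_ij * h(y_i, y_j).

    Hypothetical illustration only; the kernel h and weights w are assumptions.
    """
    G = np.asarray(genotypes, dtype=float)   # shape (n_subjects, n_variants)
    y = np.asarray(phenotype, dtype=float)   # shape (n_subjects,)
    n = len(y)
    if variant_weights is None:
        variant_weights = np.ones(G.shape[1])

    # Rank-based, zero-centred phenotype scores (ties broken arbitrarily).
    # Using ranks means no distributional assumption is placed on y.
    ranks = np.argsort(np.argsort(y)) + 1.0
    q = ranks - (n + 1) / 2.0
    h = np.outer(q, q)                       # phenotype similarity h(y_i, y_j)

    # Genetic similarity: weighted cross-product of genotype vectors.
    w = (G * variant_weights) @ G.T

    # Average over ordered pairs with i != j (the diagonal is excluded).
    off_diag = ~np.eye(n, dtype=bool)
    return float((w * h)[off_diag].sum() / (n * (n - 1)))
```

Because the phenotype enters only through its ranks, any strictly increasing transformation of the phenotype leaves this statistic unchanged, which is one way a rank-based construction can stay stable under heavy-tailed distributions.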
Testing gene-environment interactions in gene-based association studies
Gene-based and single-nucleotide polymorphism (SNP) set association studies provide an important complement to SNP analysis. Kernel-based nonparametric regression has recently emerged as a powerful and flexible tool for this purpose. Our goal is to explore whether this approach can be extended to incorporate and test for interaction effects, especially for genes containing rare variant SNPs. Here, we construct nonparametric regression models that can be used to include a gene-environment interaction effect under the framework of the least-squares kernel machine and examine the performance of the proposed method on the Genetic Analysis Workshop 17 unrelated individuals data set. Two hundred simulated replicates were used to explore the power for detecting interaction. We demonstrate through a genome scan of the quantitative phenotype Q1 that the simulated gene-environment interaction effect in the data can be detected with reasonable power by using the least-squares kernel machine method.
SUP: an extension to SLINK to allow a larger number of marker loci to be simulated in pedigrees conditional on trait values
BACKGROUND: With the recent advances in high-throughput genotyping technologies that allow for large-scale association mapping of human complex traits, promising statistical designs and methods have been emerging. Efficient simulation software is a key element in evaluating the properties of new statistical tests. SLINK is a flexible simulation tool that has been widely used to generate the segregation and recombination processes of markers linked to, and possibly associated with, a trait locus, conditional on trait values in arbitrary pedigrees. In practice, its most serious limitation is the small number of loci that can be simulated, since the complexity of the algorithm scales exponentially with this number.
RESULTS: I describe the implementation of a two-step algorithm to be used in conjunction with SLINK to enable the simulation of a large number of marker loci linked to a trait locus and conditional on trait values in families, with the possibility for the loci to be in linkage disequilibrium. SLINK is used in the first step to simulate genotypes at the trait locus conditional on the observed trait values, and also to generate an indicator of the descent path of the simulated alleles. In the second step, marker alleles or haplotypes are generated in the founders, conditional on the trait locus genotypes simulated in the first step. Then the recombination process between the marker loci takes place conditionally on the descent path and on the trait locus genotypes. This two-step implementation is often computationally faster than other software designed to generate marker data linked to, and possibly associated with, a trait locus.
CONCLUSION: Because the proposed method uses SLINK to simulate the segregation process, it benefits from its flexibility: the trait may be qualitative with the possibility of defining different liability classes (which allows for the simulation of gene-environment interactions or even the simulation of multi-locus effects between unlinked susceptibility regions) or it may be quantitative and normally distributed. In particular, this implementation is the only one available that can generate a large number of marker loci conditional on the set of observed quantitative trait values in pedigrees.
Automated construction and testing of multi-locus gene–gene associations
Summary: It has been argued that the missing heritability in common diseases may be in part due to rare variants and gene–gene effects. Haplotype analyses provide more power for rare variants and joint analyses across genes can address multi-gene effects. Currently, methods are lacking to perform joint multi-locus association analyses across more than one gene/region. Here, we present a haplotype-mining gene–gene analysis method, which considers multi-locus data for two genes/regions simultaneously. This approach extends our single region haplotype-mining algorithm, hapConstructor, to two genes/regions. It allows construction of multi-locus SNP sets at both genes and tests joint gene–gene effects and interactions between single variants or haplotype combinations. A Monte Carlo framework is used to provide statistical significance assessment of the joint and interaction statistics, thus the method can also be used with related individuals. This tool provides a flexible data-mining approach to identifying gene–gene effects that otherwise is currently unavailable.
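The general shape of a Monte Carlo significance assessment can be sketched as follows: compute the statistic on the observed data, recompute it on many simulated null data sets, and report the fraction of null values at least as extreme. This is a minimal sketch and not hapConstructor's procedure. It substitutes simple phenotype permutation for the null simulation, which assumes exchangeable (unrelated) individuals, whereas the method described above also handles related individuals. The name `monte_carlo_p_value` and its parameters are hypothetical.

```python
import numpy as np

def monte_carlo_p_value(statistic, x, y, n_sim=999, seed=0):
    """Empirical p-value for statistic(x, y), larger values being more extreme.

    Null data sets are generated by permuting y (an assumption: this is valid
    only when individuals are exchangeable, i.e. unrelated).
    """
    rng = np.random.default_rng(seed)
    observed = statistic(x, y)

    # Recompute the statistic on n_sim permuted copies of the phenotype.
    null = np.array([statistic(x, rng.permutation(y)) for _ in range(n_sim)])

    # Add-one correction: the observed data set counts as one null draw,
    # so the estimated p-value is never exactly zero.
    return (1 + np.sum(null >= observed)) / (n_sim + 1)
```

A call might look like `monte_carlo_p_value(lambda a, b: abs(np.corrcoef(a, b)[0, 1]), x, y)`, where `x` is a genotype score and `y` a quantitative phenotype. With `n_sim` simulations the smallest attainable p-value is `1 / (n_sim + 1)`, so `n_sim` needs to be large when a stringent threshold is applied.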
Mutational landscape of candidate genes in familial prostate cancer
Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/108266/1/pros22849-sm-0001-SupTab-S1.pdf
http://deepblue.lib.umich.edu/bitstream/2027.42/108266/2/pros22849.pd