Quantitative trait association in parent offspring trios: Extension of case/pseudocontrol method and comparison of prospective and retrospective approaches
The case/pseudocontrol method provides a convenient framework for family-based association analysis of case-parent trios, incorporating several previously proposed methods such as the transmission/disequilibrium test and log-linear modelling of parent-of-origin effects. The method allows genotype and haplotype analysis at an arbitrary number of linked and unlinked multiallelic loci, as well as modelling of more complex effects such as epistasis, parent-of-origin effects, maternal genotype and mother-child interaction effects, and gene-environment interactions. Here we extend the method for analysis of quantitative as opposed to dichotomous (e.g. disease) traits. The resulting method can be thought of as a retrospective approach, modelling genotype given trait value, in contrast to prospective approaches that model trait given genotype. Through simulations and analytical derivations, we examine the power and properties of our proposed approach, and compare it to several previously proposed single-locus methods for quantitative trait association analysis. We investigate the performance of the different methods when extended to allow analysis of haplotype, maternal genotype and parent-of-origin effects. With randomly ascertained families, with or without population stratification, the prospective approach (modeling trait value given genotype) is found to be generally most effective, although the retrospective approach has some advantages with regard to estimation and interpretability of parameter estimates when applied to selected samples. Genet. Epidemiol. 31:833, 2007. © 2007 Wiley-Liss, Inc
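The transmission/disequilibrium test mentioned in this abstract can be illustrated with a small calculation. The sketch below is not the case/pseudocontrol method itself. It shows only the classic biallelic TDT, which compares how often heterozygous parents transmit versus do not transmit a given allele, using a McNemar-type chi-square statistic with one degree of freedom. The function names and the example counts are hypothetical.

```python
import math


def tdt_statistic(transmitted: int, untransmitted: int) -> float:
    """McNemar-type TDT chi-square (1 df).

    transmitted / untransmitted: number of times heterozygous parents
    did / did not pass the allele of interest to an affected child.
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative (heterozygous-parent) transmissions")
    return (transmitted - untransmitted) ** 2 / total


def chi2_1df_pvalue(statistic: float) -> float:
    """Upper-tail p-value of a chi-square variable with 1 degree of freedom."""
    # For 1 df, P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical counts: 60 transmissions versus 40 non-transmissions.
stat = tdt_statistic(60, 40)
p_value = chi2_1df_pvalue(stat)
```

Under the null hypothesis of no linkage or association, transmissions and non-transmissions are equally likely, so a large imbalance between the two counts gives a large statistic.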
Testing gene-environment interactions in gene-based association studies
Gene-based and single-nucleotide polymorphism (SNP) set association studies provide an important complement to SNP analysis. Kernel-based nonparametric regression has recently emerged as a powerful and flexible tool for this purpose. Our goal is to explore whether this approach can be extended to incorporate and test for interaction effects, especially for genes containing rare variant SNPs. Here, we construct nonparametric regression models that can be used to include a gene-environment interaction effect under the framework of the least-squares kernel machine and examine the performance of the proposed method on the Genetic Analysis Workshop 17 unrelated individuals data set. Two hundred simulated replicates were used to explore the power for detecting interaction. We demonstrate through a genome scan of the quantitative phenotype Q1 that the simulated gene-environment interaction effect in the data can be detected with reasonable power by using the least-squares kernel machine method
Detecting disease rare alleles using single SNPs in families and haplotyping in unrelated subjects from the Genetic Analysis Workshop 17 data
We present an evaluation of discovery power for two association tests that work well with common alleles but are applied to the Genetic Analysis Workshop 17 simulations with rare causative single-nucleotide polymorphisms (SNPs) (minor allele frequency [MAF] < 1%). The methods used were genome-wide single-SNP association tests based on a linear mixed-effects model for discovery and applied to the familial sample and sliding windows haplotype association tests for replication, implemented within causative genes in the unrelated individuals sample. Both methods are evaluated with respect to the simulated trait Q2. The linear mixed-effects model and haplotype association tests failed to detect the rare alleles of the simulated associations. In contrast, the linear mixed-effects model and haplotype association tests detected effects for the most important simulated SNPs with MAF > 1%. We conclude that these findings reflect inadequate statistical power (the result of small simulated samples) for the complex genetic model that underlies these data
Linkage disequilibrium in young genetically isolated Dutch population
The design and feasibility of genetic studies of complex diseases are critically dependent on the extent and distribution of linkage disequilibrium (LD) across the genome and between different populations. We have examined genomewide and region-specific LD in a young genetically isolated population identified in the Netherlands by genotyping approximately 800 Short Tandem Repeat markers distributed genomewide across 58 individuals. Several regions were an
SUP: an extension to SLINK to allow a larger number of marker loci to be simulated in pedigrees conditional on trait values
BACKGROUND: With the recent advances in high-throughput genotyping technologies that allow for large-scale association mapping of human complex traits, promising statistical designs and methods have been emerging. Efficient simulation software is a key element for the evaluation of the properties of new statistical tests. SLINK is a flexible simulation tool that has been widely used to generate the segregation and recombination processes of markers linked to, and possibly associated with, a trait locus, conditional on trait values in arbitrary pedigrees. In practice, its most serious limitation is the small number of loci that can be simulated, since the complexity of the algorithm scales exponentially with this number. RESULTS: I describe the implementation of a two-step algorithm to be used in conjunction with SLINK to enable the simulation of a large number of marker loci linked to a trait locus and conditional on trait values in families, with the possibility for the loci to be in linkage disequilibrium. SLINK is used in the first step to simulate genotypes at the trait locus conditional on the observed trait values, and also to generate an indicator of the descent path of the simulated alleles. In the second step, marker alleles or haplotypes are generated in the founders, conditional on the trait locus genotypes simulated in the first step. Then the recombination process between the marker loci takes place conditionally on the descent path and on the trait locus genotypes. This two-step implementation is often computationally faster than other software designed to generate marker data linked to, and possibly associated with, a trait locus.
CONCLUSION: Because the proposed method uses SLINK to simulate the segregation process, it benefits from its flexibility: the trait may be qualitative with the possibility of defining different liability classes (which allows for the simulation of gene-environment interactions or even the simulation of multi-locus effects between unlinked susceptibility regions) or it may be quantitative and normally distributed. In particular, this implementation is the only one available that can generate a large number of marker loci conditional on the set of observed quantitative trait values in pedigrees
Detecting rare functional variants using a wavelet-based test on quantitative and qualitative traits
We conducted a genome-wide association study on the Genetic Analysis Workshop 17 simulated unrelated individuals data using a multilocus score test based on wavelet transformation that we proposed recently. Wavelet transformation is an advanced smoothing technique, whereas the currently popular collapsing methods are the simplest way to smooth multilocus genotypes. The wavelet-based test suppresses noise from the data more effectively, which results in lower type I error rates. We chose a level-dependent threshold for the wavelet-based test to suppress the optimal amount of noise according to the data. We propose several remedies to reduce the inflated type I error rate: using a window of fixed size rather than a gene; using the Bonferroni correction rather than comparing to the maxima of test values for multiple testing corrections; and removing the influence of other factors by using residuals for the association test. A wavelet-based test can detect multiple rare functional variants. Type I error rates can be controlled using the wavelet-based test combined with the mentioned remedies
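Two ideas named in this abstract, collapsing of multilocus genotypes and the Bonferroni correction, can be sketched briefly. The code below does not implement the wavelet-based test. It shows only a simple carrier-indicator collapsing scheme and a Bonferroni-adjusted significance threshold. The 1% rare-variant cutoff, the function names, and the toy data are assumptions made for illustration.

```python
def collapse_rare_variants(genotypes, maf, maf_threshold=0.01):
    """Collapse rare variants into one carrier indicator per individual.

    genotypes: one list per individual of minor-allele counts (0, 1 or 2),
               one entry per variant in the window or gene.
    maf:       minor allele frequency of each variant.
    Returns 1 if the individual carries at least one rare minor allele.
    """
    rare = [j for j, freq in enumerate(maf) if freq < maf_threshold]
    return [int(any(row[j] > 0 for j in rare)) for row in genotypes]


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Hypothetical toy data: 3 individuals, 3 variants, only the second is rare.
toy_genotypes = [[0, 0, 1], [0, 1, 0], [0, 0, 0]]
toy_maf = [0.2, 0.005, 0.3]
carriers = collapse_rare_variants(toy_genotypes, toy_maf)
per_test_alpha = bonferroni_threshold(0.05, 1000)
```

The collapsed indicator can then be used as a single predictor in an association test, and each window or gene is declared significant only if its p-value falls below the adjusted threshold.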
Empirical vs Bayesian approach for estimating haplotypes from genotypes of unrelated individuals
BACKGROUND: The completion of the HapMap project has stimulated further development of haplotype-based methodologies for disease associations. A key aspect of such development is the statistical inference of individual diplotypes from unphased genotypes. Several methodologies for inferring haplotypes have been developed, but they have not been evaluated extensively to determine which method not only performs well, but also can be easily incorporated in downstream haplotype-based association analyses. In this paper, we attempt to do so. Our evaluation was carried out by comparing the two leading Bayesian methods, implemented in PHASE and HAPLOTYPER, and the two leading empirical methods, implemented in PL-EM and HPlus. We used these methods to analyze real data, namely the dense genotypes on X-chromosome of 30 European and 30 African trios provided by the International HapMap Project, and simulated genotype data. Our conclusions are based on these analyses. RESULTS: All programs performed very well on X-chromosome data, with an average similarity index of 0.99 and an average prediction rate of 0.99 for both European and African trios. On simulated data with approximation of coalescence, PHASE implementing the Bayesian method based on the coalescence approximation outperformed other programs on small sample sizes. When the sample size increased, other programs performed as well as PHASE. PL-EM and HPlus implementing empirical methods required much less running time than the programs implementing the Bayesian methods. They required only one hundredth or thousandth of the running time required by PHASE, particularly when analyzing large sample sizes and large number of SNPs.
CONCLUSION: For large sample sizes (hundreds or more), which most association studies require, the two empirical methods might be used since they infer the haplotypes as accurately as any Bayesian methods and can be incorporated easily into downstream haplotype-based analyses such as haplotype-association analyses
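To make the idea of inferring haplotypes from unphased genotypes concrete, here is a minimal sketch of the textbook expectation-maximization (EM) estimate of haplotype frequencies for two biallelic SNPs. It is not the algorithm implemented in PL-EM, HPlus, PHASE, or HAPLOTYPER. It illustrates only the basic principle that double heterozygotes are the sole phase-ambiguous genotypes and are split between the two possible phasings in proportion to the current frequency estimates. The genotype coding (0, 1, or 2 copies of allele "1" at each SNP) and all names are assumptions.

```python
from collections import Counter

HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def em_haplotype_frequencies(genotypes, tol=1e-10, max_iter=1000):
    """Estimate two-SNP haplotype frequencies from unphased genotypes.

    genotypes: list of (a, b) pairs, the counts (0, 1, 2) of allele "1"
               at SNP 1 and SNP 2 for each unrelated individual.
    """
    if not genotypes:
        raise ValueError("need at least one genotype")
    fixed = Counter()   # haplotype counts from unambiguous genotypes
    n_double_het = 0    # (1, 1) genotypes: phase is ambiguous
    for a, b in genotypes:
        if a == 1 and b == 1:
            n_double_het += 1
        else:
            fixed[(a // 2, b // 2)] += 1
            fixed[(a - a // 2, b - b // 2)] += 1
    n_chrom = 2 * len(genotypes)
    freqs = {h: 0.25 for h in HAPLOTYPES}
    for _ in range(max_iter):
        # E-step: probability that a double heterozygote is 00/11 not 01/10.
        cis = freqs[(0, 0)] * freqs[(1, 1)]
        trans = freqs[(0, 1)] * freqs[(1, 0)]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: re-estimate frequencies from expected haplotype counts.
        expected = {
            (0, 0): fixed[(0, 0)] + n_double_het * w,
            (1, 1): fixed[(1, 1)] + n_double_het * w,
            (0, 1): fixed[(0, 1)] + n_double_het * (1 - w),
            (1, 0): fixed[(1, 0)] + n_double_het * (1 - w),
        }
        new_freqs = {h: expected[h] / n_chrom for h in HAPLOTYPES}
        delta = max(abs(new_freqs[h] - freqs[h]) for h in HAPLOTYPES)
        freqs = new_freqs
        if delta < tol:
            break
    return freqs


# Hypothetical sample: two homozygotes and one phase-ambiguous individual.
estimated = em_haplotype_frequencies([(0, 0), (2, 2), (1, 1)])
```

In this toy sample the unambiguous individuals carry only the 00 and 11 haplotypes, so the EM iterations assign the double heterozygote almost entirely to the 00/11 phasing.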
The Use of Haplotypes in the Identification of Interaction between SNPs
Although haplotypes can provide great insight into the complex relationships between functional polymorphisms at a locus, their use in modern association studies has been limited. This is due to our inability to directly observe haplotypes in studies of unrelated individuals, but also to the extra complexity involved in their analysis and the difficulty in identifying which is the truly informative haplotype. Using a series of simulations, we tested a number of different models of a haplotype carrying two functional single nucleotide polymorphisms (SNPs) to assess the ability of haplotypic analysis to identify functional interactions between SNPs at the same locus. We found that, when phase is known, analysis of the haplotype is more powerful than analysis of the individual SNPs. The difference between the two approaches shrinks as more non-informative SNPs are included or when the haplotypic phase is unknown; in both cases, single-SNP analysis becomes progressively better at identifying the association. Our results suggest that when novel genotyping and bioinformatics methods are available to reconstruct haplotypic phase, this will permit the emergence of a new wave of haplotypic analysis able to consider interactions between SNPs with increased statistical power.