Region-Based Association Test for Familial Data under Functional Linear Models
<div><p>Region-based association analysis is a more powerful tool for gene mapping than testing of individual genetic variants, particularly for rare genetic variants. The most powerful methods for regional mapping are based on the functional data analysis approach, which assumes that the regional genome of an individual may be considered as a continuous stochastic function that contains information about both linkage and linkage disequilibrium. Here, we extend this approach, earlier applied only to independent samples, to samples of related individuals. To this end, we additionally include random polygenic effects in the functional linear model used for testing association between quantitative traits and multiple genetic variants in the region. We compare the statistical power of different methods using Genetic Analysis Workshop 17 mini-exome family data and a wide range of simulation scenarios. Our method increases the power of regional association analysis of quantitative traits compared with burden-based and kernel-based methods for the majority of the scenarios. In addition, we estimate the statistical power of our method using regions with a small number of genetic variants, and show that our method retains its advantage over burden-based and kernel-based methods in this case as well. The new method is implemented as the R function ‘famFLM’ using two types of basis functions: the B-spline and Fourier bases. We compare the properties of the new method using models that differ from each other in the type of basis functions they use. The models based on the Fourier basis functions have an advantage in terms of speed and power over the models that use the B-spline basis functions and those that combine B-spline and Fourier basis functions. The ‘famFLM’ function is distributed under the GPLv3 license and is freely available at <a href="http://mga.bionet.nsc.ru/soft/famFLM/" target="_blank">http://mga.bionet.nsc.ru/soft/famFLM/</a>.</p></div>
The statistical power of regional association analysis on the familial data when only rare variants were used in simulations for random selection of causal variants and in analysis.
<p>The notations of the methods are the same as in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0128999#pone.0128999.g001" target="_blank">Fig 1</a>.</p>
The statistical power of regional association analysis on the familial data when only rare variants were used in simulations for random selection of causal variants and 50% of non-causal variants were excluded from the analysis.
<p>The notations of the methods are the same as in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0128999#pone.0128999.g001" target="_blank">Fig 1</a>.</p>
The statistical power of regional association analysis on the familial data when only rare variants were used in simulations for random selection of causal variants and 80% of non-causal variants were excluded from the analysis.
<p>The notations of the methods are the same as in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0128999#pone.0128999.g001" target="_blank">Fig 1</a>.</p>
Comparison of the <i>P</i> values (shown as minus base 10 logarithm) computed with famSKAT, ASKAT, and FFBSKAT given a sample of 500 individuals, for two causal genes, <i>FLT1</i> and <i>VEGFA</i>.
<p>Two hundred realizations of the Q1 quantitative trait in the GAW17 data were analyzed. The line indicates one-to-one correspondence.</p>
Simulation results of type I error rates of six famFLM tests.
<p>Simulation results of type I error rates of six famFLM tests.</p>
The statistical power of regional association analysis on the familial data when all (rare and common) variants were used in simulations for random selection of causal variants and in analysis.
<p>The compared methods are the burden-based (famBT), kernel-based (famSKAT), optimized kernel-based (famSKAT-O), and the new FDA-based (famFLM) methods. For famFLM, six functional models were tested: B-spline basis for both the GVF and the BSF (B-B), only the BSF described via B-spline basis (0-B), Fourier basis for the GVF and B-spline basis for the BSF (F-B), B-spline basis for the GVF and Fourier basis for the BSF (B-F), only the BSF described via Fourier basis (0-F), and Fourier basis for both the GVF and the BSF (F-F).</p>
Dependence of the running times of the second step of mini-exome analysis of quantitative trait Q1 on sample size for different methods (using one processor at 3.07 GHz).
<p>Points show the estimated running times (<i>RT</i>); lines correspond to the linear regression equations: <i>RT</i><sub>ASKAT</sub> = 9×10<sup>−6</sup> <i>n</i><sup>3</sup> − 0.753; <i>RT</i><sub>famSKAT</sub> = 6.7×10<sup>−5</sup> <i>n</i><sup>2</sup> − 2.8; and <i>RT</i><sub>FFBSKAT</sub> = 1.7×10<sup>−5</sup> <i>n</i><sup>2</sup> − 3.7, where <i>n</i> is the sample size.</p>
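<p>The fitted curves in the caption above can be evaluated directly to compare the three methods at a given sample size. The following is a minimal Python sketch, not part of the original software: the names <code>FITS</code> and <code>running_time</code> are illustrative, the caption's constant "–.753" is read as −0.753, and the time unit is left unspecified because the caption does not state it.</p>

```python
# Coefficients copied from the figure caption, stored as (a, power, intercept)
# for the fitted form RT = a * n**power + intercept.
# Assumption: the caption's "-.753" is read as -0.753.
FITS = {
    "ASKAT": (9e-6, 3, -0.753),
    "famSKAT": (6.7e-5, 2, -2.8),
    "FFBSKAT": (1.7e-5, 2, -3.7),
}


def running_time(method: str, n: int) -> float:
    """Evaluate the fitted running-time curve for `method` at sample size n."""
    a, power, intercept = FITS[method]
    return a * n ** power + intercept


if __name__ == "__main__":
    # Compare the three fitted curves at one illustrative sample size.
    for method in FITS:
        print(method, round(running_time(method, 1000), 3))
```

<p>Because of the negative intercepts, each fit returns negative values for sufficiently small <i>n</i>, so the curves are only meaningful within the range of sample sizes plotted in the figure.</p>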
The statistical power of regional association analysis with weighted FLM on the familial data with effect modeled as |<i>β</i><sub><i>j</i></sub>| = log(<i>s</i>)|log<sub>10</sub>(MAF<sub><i>j</i></sub>)|/2 and all causal variants having MAFs ≤ 0.03.
<p>The proportion of causal variants is given relative to all rare variants (MAF ≤ 0.03) within the region (all rare variants = 100%). B: B-spline basis functions; F: Fourier basis functions; (1, 1): the unweighted model; (0.5, 0.5): the weighted model with <i>a</i><sub>1</sub> = <i>a</i><sub>2</sub> = 0.5; (1, 25): the weighted model with <i>a</i><sub>1</sub> = 1 and <i>a</i><sub>2</sub> = 25.</p>
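<p>The effect model in the title of this entry, |<i>β</i><sub><i>j</i></sub>| = log(<i>s</i>)|log<sub>10</sub>(MAF<sub><i>j</i></sub>)|/2, makes rarer variants carry larger effects. The sketch below, in Python, illustrates this under stated assumptions: the function name <code>effect_magnitude</code> is hypothetical, log(<i>s</i>) is taken to be the natural logarithm, and <i>s</i> is treated as a user-supplied constant of at least 1, since the caption does not define it.</p>

```python
import math


def effect_magnitude(maf: float, s: float) -> float:
    """Return |beta_j| = log(s) * |log10(MAF_j)| / 2 for one variant.

    Assumptions of this sketch (not stated in the caption):
    log(s) is the natural logarithm, and s >= 1 so the result is non-negative.
    """
    if not 0.0 < maf <= 0.5:
        raise ValueError("a minor allele frequency must lie in (0, 0.5]")
    if s < 1.0:
        raise ValueError("this sketch assumes s >= 1")
    return math.log(s) * abs(math.log10(maf)) / 2.0


if __name__ == "__main__":
    # Illustrative MAFs at or below the 0.03 threshold used in the caption;
    # the value s = 2.0 is an arbitrary example, not taken from the source.
    for maf in (0.03, 0.01, 0.001):
        print(maf, effect_magnitude(maf, s=2.0))
```

<p>For a fixed <i>s</i>, the magnitude grows linearly in |log<sub>10</sub>(MAF)|, so a tenfold decrease in MAF adds log(<i>s</i>)/2 to the effect size.</p>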
Power for three trait transformations of two GAW17 phenotypes.
<p>See legend in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0065395#pone-0065395-g001" target="_blank">Fig. 1</a> for coding of weight function modes. Error bars indicate the standard errors.</p>