16 research outputs found

    To Control False Positives in Gene-Gene Interaction Analysis: Two Novel Conditional Entropy-Based Approaches

    <div><p>Genome-wide analysis of gene-gene interactions has been recognized as a powerful avenue for identifying the missing genetic components that cannot be detected by current single-point association analysis. Recently, several model-free methods (e.g., the commonly used information-based metrics and several logistic regression-based metrics) were developed for detecting non-linear dependence between genetic loci, but they are at risk of an inflated false positive rate, in particular when the main effects at one or both loci are salient. In this study, we propose two conditional entropy-based metrics to address this limitation. Extensive simulations demonstrated that the two proposed metrics, provided the disease is rare, maintain a consistently correct false positive rate. In the scenarios for a common disease, our proposed metrics achieved better or comparable control of false positive error compared with four previously proposed model-free metrics. In terms of power, our methods outperformed several competing metrics in a range of common disease models. Furthermore, in real data analyses, both metrics succeeded in detecting interactions and were competitive with the originally reported results or the logistic regression approaches. In conclusion, the proposed conditional entropy-based metrics are promising alternatives to current model-based approaches for detecting genuine epistatic effects.</p></div>
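
    To make the idea of a conditional entropy-based measure concrete, the following is a minimal sketch of the textbook conditional mutual information I(G; H | D) between two genotype variables given disease status, computed from a table of counts. It is not the authors' GenoCMI or GameteCMI definition, which this listing does not give; the function name and the `counts[d][g][h]` layout are assumptions made for illustration.

```python
import math

def conditional_mutual_information(counts):
    """Textbook I(G; H | D) in nats from counts[d][g][h].

    counts is a list of strata (e.g. controls, cases); each stratum is a
    genotype-by-genotype table of non-negative counts. Illustrative only:
    this is NOT the metric defined in the paper.
    """
    total = sum(n for stratum in counts for row in stratum for n in row)
    cmi = 0.0
    for stratum in counts:
        n_d = sum(sum(row) for row in stratum)          # stratum size
        row_sums = [sum(row) for row in stratum]        # marginal of G within stratum
        col_sums = [sum(col) for col in zip(*stratum)]  # marginal of H within stratum
        for g, row in enumerate(stratum):
            for h, n in enumerate(row):
                if n > 0:  # empty cells contribute nothing
                    cmi += (n / total) * math.log(n * n_d / (row_sums[g] * col_sums[h]))
    return cmi

# Hypothetical counts: loci independent within each stratum (outer-product tables)
independent = [[[1, 2, 1], [2, 4, 2], [1, 2, 1]]] * 2
# Hypothetical counts: loci perfectly dependent within each stratum
dependent = [[[10, 0, 0], [0, 10, 0], [0, 0, 10]]] * 2

cmi_independent = conditional_mutual_information(independent)
cmi_dependent = conditional_mutual_information(dependent)
```

    In this generic setting, 2N times the estimate (N being the total count) is the likelihood-ratio G statistic for conditional independence, which for two three-level genotypes and two disease strata has (3 − 1)(3 − 1) × 2 = 8 degrees of freedom.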

    Chi-squared Q-Q plots for the additive-additive model with main effects at both loci (Schema 3).

    <p>Top panels: <b>A</b>. <i>GenoMI</i>; <b>B</b>. <i>GenoCMI</i>; <b>C</b>. <i>GameteCMI</i>. Middle panels: <b>D</b>. original Wu et al. statistic; <b>E</b>. adjusted Wu statistic; <b>F</b>. joint effect statistic. Bottom panels: <b>G</b>. logistic regression model with 1 df test; <b>H</b>. logistic regression model with 4 df test.</p>

    Comparison of <i>P</i>-values in testing gene-gene interaction between the hemoglobin (<i>Hb</i>) gene and the <i>α</i><sup>+</sup>-thalassemia gene.

    a<p>Frequencies are shown as no. of cases/no. of controls.</p>b<p><i>P</i>-values reported by Williams et al.</p>c<p>The lowest <i>P</i>-value among logistic regression models assuming additive × additive, dominant × dominant, and recessive × recessive interaction models, respectively.</p>d<p>Obtained from a logistic regression model with genotypes coded as factors.</p>

    Description of simulation schemas.

    a<p>In each schema, three two-locus interaction models (additive × additive, dominant × dominant and recessive × recessive) were evaluated.</p>b<p><i>OR<sub>G</sub></i>, <i>OR<sub>H</sub></i>, and <i>OR<sub>GH</sub></i> denote the main effect for locus <i>G</i>, main effect for locus <i>H</i>, and their interaction effect, respectively. “√” indicates that the effect is present. “–” indicates that the effect is absent.</p>c<p>Disease prevalence (baseline penetrance).</p>d<p>For Schemas 8 and 9, the interaction effect <i>OR<sub>GH</sub></i> was increased from 1.0 to a value at which the power of the optimal metric achieved 100% at significance level 0.01.</p>

    Null distribution of the <i>GenoCMI</i> and <i>GameteCMI</i> metrics.

    <p><b>A</b>. The empirical null distribution of <i>GenoCMI</i>, compared to its theoretical distribution <i>χ</i><sup>2</sup><sub>(8)</sub>. <b>B</b>. The empirical null distribution of <i>GameteCMI</i>, compared to its theoretical distribution <i>χ</i><sup>2</sup><sub>(2)</sub>.</p>
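
    Comparing an empirical null distribution with a theoretical chi-squared distribution, as in panel B, can be sketched as a Q-Q computation. The sketch below is illustrative only: it substitutes simulated chi-squared(2) draws for the actual null statistics, which are not reproduced in this listing, and the helper names `chi2_df2_quantile` and `qq_points` are hypothetical. The 2 df case is used because its quantile function has a simple closed form.

```python
import math
import random

def chi2_df2_quantile(p):
    """Quantile of the chi-squared distribution with 2 df (closed form)."""
    return -2.0 * math.log(1.0 - p)

def qq_points(stats):
    """Pair each sorted observed statistic with its expected chi-squared(2) quantile."""
    n = len(stats)
    observed = sorted(stats)
    expected = [chi2_df2_quantile((i + 0.5) / n) for i in range(n)]
    return list(zip(expected, observed))

# Stand-in null statistics (assumption): the sum of two squared standard
# normals follows a chi-squared distribution with 2 df.
rng = random.Random(0)
null_stats = [rng.gauss(0, 1) ** 2 + rng.gauss(0, 1) ** 2 for _ in range(5000)]
points = qq_points(null_stats)

# Least-squares slope through the origin of observed on expected quantiles;
# a value close to 1 suggests the statistic is well calibrated.
slope = sum(e * o for e, o in points) / sum(e * e for e, _ in points)
```

    Plotting `points` gives the Q-Q plot itself; points that rise above the diagonal in the upper tail would indicate inflation of the statistic.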

    Application of entropy-based statistics for testing gene-gene interaction between SNP309 in the <i>MDM2</i> gene and the codon 72 polymorphism in the <i>p53</i> gene.

    a<p>Frequencies are shown as no. of individuals genotyped as <i>TT</i>/<i>TG</i>/<i>GG</i> of <i>MDM2 309T</i>><i>G</i> in cases.</p>b<p>Frequencies are shown as no. of individuals genotyped as <i>TT</i>/<i>TG</i>/<i>GG</i> of <i>MDM2 309T</i>><i>G</i> in controls.</p>1<p>Obtained from a logistic regression model assuming an additive × additive model.</p>2<p>Obtained from a logistic regression model assuming a dominant × dominant model.</p>3<p>Obtained from a logistic regression model assuming a recessive × recessive model.</p>4<p>Obtained from a logistic regression model with genotypes coded as factors.</p><p>GCC: gastric cardia cancer; LC: lung cancer; HCC: hepatocellular cancer; BC: breast cancer.</p>

    Chi-squared Q-Q plots for the recessive-recessive model with main effect at both loci, when case/control ratios varied (Schema 7).

    <p>Assuming main effects at both loci (<i>OR<sub>G</sub></i> = <i>OR<sub>H</sub></i> = 2.0) and disease prevalence 0.02. Top panels: <b>A</b>. <i>GenoMI</i>; <b>B</b>. <i>GenoCMI</i>; <b>C</b>. <i>GameteCMI</i>. Middle panels: <b>D</b>. original Wu et al. statistic; <b>E</b>. adjusted Wu statistic; <b>F</b>. joint effect statistic. Bottom panels: <b>G</b>. logistic regression model with 1 df test; <b>H</b>. logistic regression model with 4 df test.</p>

    Chi-squared Q-Q plots for the global null hypothesis (Schema 1).

    <p>Top panels: <b>A</b>. <i>GenoMI</i>; <b>B</b>. <i>GenoCMI</i>; <b>C</b>. <i>GameteCMI</i>. Middle panels: <b>D</b>. original Wu et al. statistic; <b>E</b>. adjusted Wu statistic; <b>F</b>. joint effect statistic. Bottom panels: <b>G</b>. logistic regression model with 1 df test; <b>H</b>. logistic regression model with 4 df test.</p>

    Chi-squared Q-Q plots for the dominant-dominant model with main effects at both loci (Schema 3).

    <p>Top panels: <b>A</b>. <i>GenoMI</i>; <b>B</b>. <i>GenoCMI</i>; <b>C</b>. <i>GameteCMI</i>. Middle panels: <b>D</b>. original Wu et al. statistic; <b>E</b>. adjusted Wu statistic; <b>F</b>. joint effect statistic. Bottom panels: <b>G</b>. logistic regression model with 1 df test; <b>H</b>. logistic regression model with 4 df test.</p>

    False positive rates (type I error rates) for testing interaction in a common disease with a main effect at one locus (Schema 2).

    a<p>Logistic regression model with 1 df test for the correct genetic model.</p>b<p>Logistic regression model with 4 df test, with genotypes coded as factors.</p><p>The disease prevalence is assumed to be 0.02. The significance level is set at 0.01.</p>
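
    An empirical false positive rate at a given significance level is the fraction of null replicates whose <i>P</i>-value falls below that level. The following is a minimal sketch using the 0.01 level from the table note; the simulated statistics are stand-ins rather than the study's data, and `chi2_df2_pvalue` and `false_positive_rate` are hypothetical helper names.

```python
import math
import random

def chi2_df2_pvalue(x):
    """Upper-tail P-value of a chi-squared statistic with 2 df: exp(-x / 2)."""
    return math.exp(-x / 2.0)

def false_positive_rate(p_values, alpha=0.01):
    """Fraction of null P-values that fall below the significance level."""
    return sum(p < alpha for p in p_values) / len(p_values)

# Stand-in null replicates (assumption): chi-squared(2) statistics simulated
# as the sum of two squared standard normals.
rng = random.Random(1)
null_stats = [rng.gauss(0, 1) ** 2 + rng.gauss(0, 1) ** 2 for _ in range(20000)]
p_values = [chi2_df2_pvalue(x) for x in null_stats]

# For a well-calibrated test this should be close to the nominal 0.01
rate = false_positive_rate(p_values)
```

    A rate well above the nominal level would indicate an inflated type I error, and a rate well below it a conservative test.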