20 research outputs found

    Performance evaluation based on simulated gene expression profiles with <i>m</i> = 2 conditions/groups.

    <p>Average performance results of eight methods (ANOVA, SAM, LIMMA, eLNN, EBarrays, BetaEB, KW and Proposed) based on 100 datasets generated using a one-way ANOVA model with <i>m</i> = 2 groups/conditions and <i>σ</i><sup>2</sup> = 0.05, for sample sizes n<sub>1</sub> = n<sub>2</sub> = 3 and n<sub>1</sub> = n<sub>2</sub> = 15. Each dataset contained 300 true differentially expressed (DE) genes and 19,700 true equally expressed (EE) genes. The performance measures TPR, FPR, TNR, FNR, FDR, MER and AUC were calculated for each method from its top 300 estimated DE genes, with all remaining genes in the dataset treated as estimated EE genes. The performance measure ‘pAUC’ was calculated at FPR = 0.2 for each method and each dataset.</p>
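    <p>The top-<i>k</i> evaluation described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name <code>top_k_metrics</code>, the use of a generic per-gene score, and the definition of MER as (FP + FN) divided by the total number of genes are assumptions made for the example.</p>

```python
import numpy as np

def top_k_metrics(scores, is_de, k):
    """Confusion-matrix rates when the k highest-scoring genes are called DE.

    Hypothetical helper: `scores` is any per-gene statistic where larger
    means stronger evidence of differential expression; `is_de` marks the
    truly DE genes. Assumes both classes are present and 0 < k < len(scores).
    """
    scores = np.asarray(scores, dtype=float)
    is_de = np.asarray(is_de, dtype=bool)

    # Call the k top-ranked genes DE and every other gene EE.
    order = np.argsort(-scores, kind="stable")
    called = np.zeros(scores.size, dtype=bool)
    called[order[:k]] = True

    tp = int(np.sum(called & is_de))
    fp = int(np.sum(called & ~is_de))
    fn = int(np.sum(~called & is_de))
    tn = int(np.sum(~called & ~is_de))

    return {
        "TPR": tp / (tp + fn),
        "FPR": fp / (fp + tn),
        "TNR": tn / (fp + tn),
        "FNR": fn / (tp + fn),
        "FDR": fp / (tp + fp),
        # Assumed definition: misclassification error rate over all genes.
        "MER": (fp + fn) / scores.size,
    }

# Toy data: six genes, three of them truly DE, top three called DE.
toy = top_k_metrics([0.9, 0.8, 0.1, 0.7, 0.2, 0.05],
                    [True, True, False, False, True, False], k=3)
print(toy)
```

    <p>In the simulated setting this would be called with 20,000 scores per dataset and <i>k</i> = 300, then averaged over the 100 datasets.</p>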

    A Hybrid One-Way ANOVA Approach for the Robust and Efficient Estimation of Differential Gene Expression with Multiple Patterns

    <div><p>Background</p><p>Identifying genes that are differentially expressed (DE) between two or more conditions with multiple patterns of expression is one of the primary objectives of gene expression data analysis. Several statistical approaches, including one-way analysis of variance (ANOVA), are used to identify DE genes. However, most of these methods provide misleading results for two or more conditions with multiple patterns of expression in the presence of outlying genes. In this paper, an attempt is made to develop a hybrid one-way ANOVA approach that unifies the robustness and efficiency of estimation using the minimum <i>β</i>-divergence method to overcome some problems that arise in the existing robust methods for both small- and large-sample cases with multiple patterns of expression.</p><p>Results</p><p>The proposed method relies on a <i>β</i>-weight function, which produces values between 0 and 1. The <i>β</i>-weight function with <i>β</i> = 0.2 is used as a measure of outlier detection. It assigns smaller weights (≥ 0) to outlying expressions and larger weights (≤ 1) to typical expressions. The distribution of the <i>β</i>-weights is used to calculate the cut-off point, which is compared to the observed <i>β</i>-weight of an expression to determine whether that gene expression is an outlier. This weight function plays a key role in unifying the robustness and efficiency of estimation in one-way ANOVA.</p><p>Conclusion</p><p>Analyses of simulated gene expression profiles revealed that all eight methods (ANOVA, SAM, LIMMA, EBarrays, eLNN, KW, robust BetaEB and proposed) perform almost identically for <i>m</i> = 2 conditions in the absence of outliers. However, the robust BetaEB method and the proposed method exhibited considerably better performance than the other six methods in the presence of outliers. 
In this case, the BetaEB method exhibited slightly better performance than the proposed method for the small-sample cases, but the proposed method exhibited much better performance than the BetaEB method for both the small- and large-sample cases in the presence of more than 50% outlying genes. The proposed method also exhibited better performance than the other methods for <i>m</i> > 2 conditions with multiple patterns of expression, a setting to which the BetaEB method has not been extended. Therefore, the proposed approach would be more suitable and reliable on average for the identification of DE genes between two or more conditions with multiple patterns of expression.</p></div>
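    <p>To make the role of the <i>β</i>-weight concrete, here is a rough sketch that assumes a Gaussian-type weight of the form exp(−<i>β</i>(<i>x</i> − <i>μ</i>)<sup>2</sup> / (2<i>σ</i><sup>2</sup>)), which lies between 0 and 1. The abstract does not give the formula, so this form is an assumption. The median/MAD estimates and the fixed cut-off of 0.2 are also stand-ins chosen for illustration; the paper derives its cut-off from the distribution of the <i>β</i>-weights and estimates the parameters by the minimum <i>β</i>-divergence method.</p>

```python
import numpy as np

def beta_weights(x, mu, sigma2, beta=0.2):
    """Assumed Gaussian-type beta-weight: close to 1 near mu, near 0 far away."""
    x = np.asarray(x, dtype=float)
    return np.exp(-0.5 * beta * (x - mu) ** 2 / sigma2)

def robust_location_scale(x):
    """Median and MAD-based variance, used here only as simple stand-ins
    for the minimum beta-divergence estimates (an assumption)."""
    x = np.asarray(x, dtype=float)
    mu = np.median(x)
    sigma = 1.4826 * np.median(np.abs(x - mu))
    return mu, sigma ** 2

# Expressions of one gene in one group; the last value is a planted outlier.
expr = [1.0, 1.1, 0.9, 1.05, 0.95, 5.0]
mu, sigma2 = robust_location_scale(expr)
w = beta_weights(expr, mu, sigma2, beta=0.2)

cutoff = 0.2  # hypothetical fixed cut-off, not the paper's derived value
for value, weight in zip(expr, w):
    label = "outlier" if weight < cutoff else "typical"
    print(f"x={value:5.2f}  weight={weight:.3f}  {label}")
```

    <p>Typical expressions keep weights near 1 and so contribute almost fully to the estimates, while the planted outlier receives a weight near 0 and is effectively ignored.</p>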

    Pairwise comparison analysis by all 4 methods with their corresponding selected significance DE genes.

    <p>The values reported in the form {x, x, x, x} in this table represent the numbers of downregulated (DR) or upregulated (UR) differentially expressed (DE) genes estimated by the ANOVA, LIMMA, KW and proposed (bold) methods, respectively. <sup><i>a</i></sup>Note that <math><mrow><mi>log</mi><mo>(</mo><msub><mover><mi>μ</mi><mo>^</mo></mover><mi>i</mi></msub><mo>/</mo><msub><mover><mi>μ</mi><mo>^</mo></mover><mi>j</mi></msub><mo>)</mo><mo>&lt;</mo><mo>−</mo><mn>1</mn></mrow></math> indicates significant 2-fold downregulation and <math><mrow><mi>log</mi><mo>(</mo><msub><mover><mi>μ</mi><mo>^</mo></mover><mi>i</mi></msub><mo>/</mo><msub><mover><mi>μ</mi><mo>^</mo></mover><mi>j</mi></msub><mo>)</mo><mo>&gt;</mo><mo>+</mo><mn>1</mn></mrow></math> indicates significant 2-fold upregulation.</p>

    Performance evaluation in pairwise comparison tests using four methods (ANOVA, LIMMA, KW and Proposed) for the small-sample case.

    <p>We generated 300 DE genes out of 20,000 total genes for <i>m</i> = 4 conditions with different patterns for a small-sample case (n<sub>1</sub> = n<sub>2</sub> = n<sub>3</sub> = n<sub>4</sub> = 6) and <i>σ</i><sup>2</sup> = 0.05, with a 2-fold change in expression between the groups, to investigate the pattern-detection performance of the proposed method in comparison with the others. The values reported in the form {x, x, x, x} in this table represent the numbers of downregulated (DR) or upregulated (UR) differentially expressed (DE) genes estimated by the ANOVA, LIMMA, KW and proposed (bold) methods, respectively. <sup><i>a</i></sup>Note that <math><mrow><mi>log</mi><mo>(</mo><msub><mover><mi>μ</mi><mo>^</mo></mover><mi>i</mi></msub><mo>/</mo><msub><mover><mi>μ</mi><mo>^</mo></mover><mi>j</mi></msub><mo>)</mo><mo>&lt;</mo><mo>−</mo><mn>1</mn></mrow></math> indicates significant 2-fold downregulation and <math><mrow><mi>log</mi><mo>(</mo><msub><mover><mi>μ</mi><mo>^</mo></mover><mi>i</mi></msub><mo>/</mo><msub><mover><mi>μ</mi><mo>^</mo></mover><mi>j</mi></msub><mo>)</mo><mo>&gt;</mo><mo>+</mo><mn>1</mn></mrow></math> indicates significant 2-fold upregulation.</p>
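    <p>The pairwise 2-fold rule in the table note can be illustrated with a short sketch. It assumes the logarithm is base 2 (so that a threshold of ±1 corresponds to a 2-fold change) and that the estimated group means are on the raw, positive expression scale; the function name <code>classify_regulation</code> is hypothetical.</p>

```python
import numpy as np

def classify_regulation(mean_i, mean_j, threshold=1.0):
    """Label each gene as DR, UR or none for the comparison of group i vs j.

    Assumes base-2 logs of the ratio of estimated group means, so that
    threshold=1.0 corresponds to a 2-fold change (an assumption).
    """
    mean_i = np.asarray(mean_i, dtype=float)
    mean_j = np.asarray(mean_j, dtype=float)
    log_ratio = np.log2(mean_i / mean_j)
    return np.where(log_ratio < -threshold, "DR",
                    np.where(log_ratio > threshold, "UR", "none"))

# Three toy genes: 4-fold down, 4-fold up, unchanged.
labels = classify_regulation([1.0, 8.0, 3.0], [4.0, 2.0, 3.0])
print(labels.tolist())
```

    <p>Counting the DR and UR labels for every pair of groups among the genes a method declares DE gives one entry of the {x, x, x, x} form for that method.</p>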

    Venn diagram and outlier gene expression profile for colon cancer data.

    <p>Comparison of the results on the colon cancer gene expression dataset. (a) Venn diagram of the top 100 genes estimated by KW, BetaEB and the proposed method. (b) Outlying DE genes detected by the proposed method only. The results for the control group are plotted below the lines, and the results for the cancer group are plotted above the lines.</p>

    Venn diagram and outlier gene expression profile for pancreatic cancer data.

    <p>(a) Venn diagram of the DE genes estimated by all four methods (ANOVA, LIMMA, KW and Proposed) based on pairwise comparisons of CTC vs T, CTC vs P, CTC vs G, T vs P, T vs G and G vs P. (b) Frequency distributions of <i>β</i>-weights for each expression of the 8152 genes in 24 samples. (c) Scatter plot of the smallest <i>β</i>-weight for each of the 8152 genes vs. the gene index, where the smallest value represents the minimum value of 24 <i>β</i>-weights from 24 samples for each gene. The red circles between the two gray lines represent moderate/noisy outliers, whereas the other red circles, corresponding to <i>β</i>-weights of less than 0.2, represent extreme outliers. (d) Plot of ordered smallest <i>β</i>-weights in (c) for 8152 genes. (e) The 80 DE genes detected by the proposed method only, as shown in (a). Seventeen out of 80 DE genes were detected as extreme outlying genes using the <i>β</i>-weight function. The results for the T, P, G and CTC groups are plotted above the lines with four different colors. The outlying samples are indicated by circles above them.</p>

    Performance evaluation based on Spike gene expression profiles with 2 conditions for the sample case (n<sub>1</sub> = n<sub>2</sub> = 9).

    <p>For each method, we took the top 1944 estimated genes and intersected them with the designated ‘DE gene-set’ to calculate the summary statistics (TPR, TNR, FPR, FNR, FDR, MER, AUC and pAUC) for performance evaluation in the Spike gene expression profiles.</p>

    Predicted distribution of <i>β</i> weights.

    <p>Predicted (solid curve) and simulated (histogram) observed distributions of the <i>β</i> weights of <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0138810#pone.0138810.e015" target="_blank">Eq (5)</a>: (a) without outlying gene expressions and (b) with 5% outlying gene expressions.</p>

    Integrated Analysis of Copy Number Variation and Genome-Wide Expression Profiling in Colorectal Cancer Tissues

    <div><p>Integrative analyses of multiple genomic datasets for selected samples can provide better insight into the overall data and can enhance our knowledge of cancer. The objective of this study was to elucidate the association between copy number variation (CNV) and gene expression in colorectal cancer (CRC) samples and their corresponding non-cancerous tissues. Sixty-four paired CRC samples from the same patients were subjected to CNV profiling using the Illumina HumanOmni1-Quad assay, and validation was performed using the multiplex ligation-dependent probe amplification (MLPA) method. Genome-wide expression profiling was performed on 15 paired samples from the same group of patients using the Affymetrix Human Gene 1.0 ST array. Significant genes obtained from both array results were then overlapped. To identify molecular pathways, the data were mapped to the KEGG database. Whole-genome CNV analysis that compared primary tumor and non-cancerous epithelium revealed gains in 1638 genes and losses in 36 genes. Significant gains were mostly found in chromosome 20 at position 20q12, with a frequency of 45.31% in tumor samples. Examples of genes located at this cytoband were <i>PTPRT</i>, <i>EMILIN3</i> and <i>CHD6</i>. The highest number of losses was detected at chromosome 8, position 8p23.2, with 17.19% occurrence in all tumor samples. Among the genes found at this cytoband were <i>CSMD1</i> and <i>DLC1</i>. Genome-wide expression profiling showed 709 genes to be up-regulated and 699 genes to be down-regulated in CRC compared to non-cancerous samples. Integration of these two datasets identified 56 overlapping genes, which were located in chromosomes 8, 20 and 22. MLPA confirmed that the CRC samples had the highest gains in chromosome 20 compared to the reference samples. Interpretation of the CNV data in the context of the transcriptome via integrative analyses may provide more in-depth knowledge of the genomic landscape of CRC.</p></div>