34 research outputs found

    DC map of modules enriched with KEGG pathways discovered in the Alzheimer's disease data.

    <p>(A) DC map of modules enriched with KEGG pathways. Nodes represent gene modules and edges correspond to differential correlation (DC; blue for increased correlation in AD, red for decreased correlation). Node size is proportional to the size of the module. The enriched pathways are noted on the module. NDD pathways refer to Parkinson's disease (PD), Huntington's disease, Alzheimer's disease and oxidative phosphorylation. CAMs refer to the cell adhesion molecules pathway. (B) Analysis of DC between the PD and the NDD modules (the circled sub-graph in A). Left: the known interactions involving the genes of the two modules according to GENEMANIA. Most known interactions are between the modules. Right: co-expression networks of the same genes for AD patients and controls. Rectangular nodes are genes related to oxidoreductase activity; hexagons indicate genes related to phosphate metabolic process. An edge between two genes indicates correlation >0.3 in the tested class. The average correlation between the modules was 0.3 in the controls and 0 in the AD class. Node colors indicate the differential expression (DE) between case and control, measured by the base-10 logarithm of the p-value (t-test) of the tested gene. The genes circled in the NDD pathway module are also part of the PD pathway. These genes are also down-correlated in AD, whereas all other genes show only mild DE.</p>

    KEGG pathway enrichment analysis.

    <p>The modules found by DiffCoEx and DICER were tested for KEGG pathway enrichment using the hypergeometric test with FDR correction at 0.05. Neither method reported significant enrichment on the IBD data set. (A) The number of enriched pathways. (B) Average enrichment factors of the enriched sets. The enrichment factor is the ratio between the fraction of the pathway genes in the tested set and the fraction of the pathway genes in the data set.</p>
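
The enrichment factor and the hypergeometric test described in this caption can be illustrated with a minimal Python sketch. It is not the authors' code: the function names and arguments are hypothetical, the "fraction of the pathway genes in the tested set" is assumed to mean the share of tested-set genes that belong to the pathway, and the FDR correction step is omitted.

```python
from math import comb


def enrichment_factor(n_set, n_overlap, n_data, n_pathway):
    """Ratio of the pathway fraction in the tested set to the pathway
    fraction in the whole data set (assumed reading of the caption)."""
    return (n_overlap / n_set) / (n_pathway / n_data)


def hypergeom_pvalue(n_set, n_overlap, n_data, n_pathway):
    """Upper-tail hypergeometric p-value: probability of seeing at least
    n_overlap pathway genes when n_set genes are drawn without replacement
    from n_data genes, of which n_pathway belong to the pathway."""
    total = comb(n_data, n_set)
    upper = min(n_set, n_pathway)
    tail = sum(
        comb(n_pathway, k) * comb(n_data - n_pathway, n_set - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total
```

For a hypothetical module of 50 genes, 10 of which fall in a 40-gene pathway within a 1,000-gene data set, `enrichment_factor(50, 10, 1000, 40)` is about 5, since the pathway makes up 20% of the module but only 4% of the data set.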

    Simulation results: Independent and dependent studies.

    <p>A) 20 independent studies. B) Four clusters of 10 studies each with a low correlation within each cluster. C) Four clusters of 10 studies each with a high correlation within each cluster. The non-null distribution in each case was Beta(1,100). Two-groups estimation was done using normix. The left column shows the empirical FDP (values above 0.4 are not shown). The right column shows the Jaccard scores only for methods that had consistently low FDP values (< 0.2) for all <i>k</i> values. BH-count, SCREEN and SCREEN-ind (SCRN-ind for short) are the only methods that had <i>FDP</i> ≤ 0.2 in all cases. BH-count had very low FDP but also very low Jaccard scores. Except for <i>k</i> = 2, SCREEN had similar or better performance compared to SCREEN-ind.</p>

    Comparison of absolute difference in correlations in gene sets found by different algorithms.

    <p>(A) The extent of DC compared to random gene sets. For each discovered module and module pair we created 200 random gene sets of the same size and calculated their absolute DC. We then calculated the ratio between the scores of the discovered modules and the mean of the random gene sets. The green bars show the mean of the top two DiffCoEx modules in each data set. For testing DiffCoEx and CLICK module pairs (purple and blue bars respectively), we took into account only module pairs with fold change greater than 1.1. CoXpress found no significant clusters of 15 genes. For DICER (red bars), the top ten up-correlated and the top ten down-correlated module pairs were taken into account. (B) The distribution of within- and between-module absolute change in correlation for DICER and DiffCoEx in the AD and lung cancer data sets.</p>
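
The comparison against random gene sets in panel A can be sketched roughly as follows. This is an illustration under stated assumptions, not the published procedure: the caption does not define the absolute DC score, so it is taken here to be the mean absolute difference in Pearson correlation between the two classes over all gene pairs in a set, and all function and variable names are hypothetical.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def abs_dc(genes, case, control):
    """Assumed score: mean |r_case - r_control| over all gene pairs.
    `case` and `control` map gene names to expression vectors."""
    diffs = [
        abs(pearson(case[g], case[h]) - pearson(control[g], control[h]))
        for g, h in combinations(genes, 2)
    ]
    return sum(diffs) / len(diffs)


def dc_ratio(module, case, control, n_random=200, seed=0):
    """Module score divided by the mean score of n_random random gene
    sets of the same size, drawn from all genes in the data."""
    rng = random.Random(seed)
    all_genes = sorted(case)
    random_scores = [
        abs_dc(rng.sample(all_genes, len(module)), case, control)
        for _ in range(n_random)
    ]
    return abs_dc(module, case, control) / (sum(random_scores) / n_random)
```

A ratio well above 1 would indicate that the module changes its correlation structure between classes more than a random gene set of the same size does.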

    The largest connected component produced by the 405 genes reported by SCREEN in the HLA dataset.

    <p>Nodes are genes, and edges are either protein-protein interactions or known pathway interactions. Left: all genes in the component, including up-regulated, down-regulated, and mixed genes. Right: the up-regulated genes only. This subnetwork suggests high immune response activity (which was identified in the original study). Two central genes in the immune response are IFNG and JAK2. Our analysis detected both, whereas the original study detected only IFNG. Moreover, the connectivity of the network is established by our newly detected genes.</p>

    Examples of differential correlation patterns.

    <p>(A) An up-correlated 242-gene cluster discovered in the AD data set. The correlation matrices of the cluster genes in the AD and control classes are shown. The average correlation is 0.72 and 0.44 in the AD and the control classes, respectively. (B) A down-correlated meta-module discovered in the lung cancer data. It contains two gene modules of sizes 39 and 77. The correlation matrices of the meta-module genes are shown for the lung cancer and the control classes. The correlation between the two modules is −0.43 in the control class, whereas the correlation in the lung cancer class drops to −0.86. Each module is a group of genes that are highly correlated in both classes: the average correlation within each module is >0.75. (C) The correlation between genes RAD23B and ALPK1 in the lung cancer data. The two genes are marked by arrows in B. Each dot corresponds to an individual and the axes mark the base-2 logarithm of expression values of the two genes in that individual. The genes are negatively correlated in the lung cancer class (r = −0.76) but are uncorrelated in the controls (r = −0.12). See <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1002955#pcbi.1002955.s013" target="_blank">Text S2</a> for additional examples using simulated data.</p>

    Ribosomal sub-complexes discovered in the Alzheimer's disease (AD) data.

    <p>(A) A DC map of modules enriched with protein complexes. Node size is proportional to the size of the module. The enriched pathway names are noted on the module. 40S: 40S cytoplasmic ribosome complex, 60S: 60S cytoplasmic ribosome complex, Nop56: Nop56p-associated pre-rRNA complex. Blue and red edges mark increased and decreased correlation in AD, respectively. (B) Analysis of DC in the Ribosome and 60S-Nop56 meta-module circled in A. Left: the known interactions involving the genes of the two modules according to GENEMANIA. Right: co-expression networks of the same genes for AD patients and controls. An edge between two genes indicates correlation >0.5 in the tested class. The average correlation between modules was 0.4 and 0.75 in the controls and AD class, respectively. Node colors show DE between AD and control, measured as the base-10 logarithm of the p-value (t-test) of the tested gene. Circled subgroups: proteins belonging to the 40S cytoplasmic ribosome and Nop56 complexes. 40S complex genes are up-regulated in AD, whereas 60S genes show only mild DE.</p>

    T-score distributions in real and permuted data sets.

    <p>(A) The distributions of the T-scores in the real (blue) and permuted (red) data sets. The variance of the distributions is larger for the T-scores on the real data, even though the means are similar. Since in the IBD and SLE data sets most T-scores are close to zero, we also show the upper tails of their distributions. (B) The standard deviation of the T-scores in the real and permuted data sets. The standard deviation is larger in all real data sets, indicating that high T-scores (in absolute value) are more probable in the real data sets. Permuted data sets were generated by shuffling sample labels. Results are the average of 50 permutations.</p>

    DEG dataset analysis.

    <p>A) The number of reported genes at 0.2 <i>fdr</i> by SCREEN and SCREEN-ind as a function of <i>k</i>. B) The Spearman correlation between gene ranking of SCREEN and SCREEN-ind and of Fisher’s meta-analysis as a function of <i>k</i>. C) The top ranked genes and their p-values. Top: the p-values of TOP2A, ATP6V1D, and GNPDA1 in each study. Bottom: the rank of these genes according to each of the methods (with <i>k</i> = 20 for SCREEN and SCREEN-ind). ATP6V1D has a very low rank according to Fisher’s meta-analysis even though it has consistently low p-values. D) Network analysis of the 147 genes reported by SCREEN with <i>k</i> = 20. Nodes are genes, and edges are either protein-protein interactions or known pathway interactions. The largest connected component is shown in detail, and the rest of the genes and their interactions are shown at the bottom left. For each gene we calculated the number of up- and down-regulated t-statistics with a p-value ≤ 0.01. Genes for which the ratio between the up- and down-regulation events was ≥ 3 (≤ 1/3) were considered consistently up-regulated (down-regulated) in cancer (red and green nodes, 99 and 18 genes, respectively). All other genes were considered mixed (blue nodes, 30 genes). Oval nodes represent genes ranked among the top 200 genes according to Fisher’s meta-analysis (note that even at 10<sup>−5</sup> Bonferroni correction, more than 10,000 genes were selected in the meta-analysis. We therefore compared to the topmost genes, choosing the number 200 arbitrarily). Rectangular nodes are genes detected only by SCREEN.</p>
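
The up/down/mixed labelling rule in panel D can be written as a short Python sketch. The function names are hypothetical, and two details are assumptions not stated in the caption: a gene with no significant events in either direction is labelled mixed, and a gene with events in only one direction is labelled by that direction.

```python
def count_events(t_stats, p_values, alpha=0.01):
    """Count significant up (t > 0) and down (t < 0) events across studies."""
    n_up = sum(1 for t, p in zip(t_stats, p_values) if p <= alpha and t > 0)
    n_down = sum(1 for t, p in zip(t_stats, p_values) if p <= alpha and t < 0)
    return n_up, n_down


def classify_direction(n_up, n_down, threshold=3.0):
    """Label a gene 'up' if up/down >= threshold, 'down' if
    up/down <= 1/threshold, and 'mixed' otherwise."""
    if n_up == 0 and n_down == 0:
        return "mixed"  # assumption: no significant events in any study
    # Compare by multiplication to avoid dividing by a zero count.
    if n_up >= threshold * n_down:
        return "up"
    if n_down >= threshold * n_up:
        return "down"
    return "mixed"
```

For example, a gene with 9 significant up events and 3 significant down events has a ratio of exactly 3 and is labelled up, whereas one with 5 up and 4 down events is labelled mixed.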

    Overview of the class specific differential correlation (DC) analysis.

    <p>The input (left) is a set of expression profiles from different classes of samples. In one analysis (top center), T-scores are computed for the class of interest and are normalized using the T-scores calculated on random data sets, created by shuffling the sample labels. The normalized scores are then used to find gene clusters that manifest DC in the tested class compared to all other classes (top right, up/down-correlated modules; blue edges indicate class-specific DC). A second similarity analysis (bottom center) is performed to detect gene pairs that are co-expressed in all classes. In each class, an EM algorithm is used to divide the correlations into high (denoted “mates,” red distribution) and low (denoted “non-mates,” green distribution), and consistent similarities are defined as cases in which gene pairs are mates in all classes. The two scores are used to find pairs of gene modules in which each module is a group of consistently correlated genes (red edges), whereas the correlation between the modules is differential (blue edges). These module pairs are denoted as meta-modules (center right). As a by-product, individual modules are recorded (bottom right).</p>