34 research outputs found

    Performance comparison for C3D, WGCNA and DiffCoEx methods

    <p><i>Top</i>, three cluster types (“common”, “nested” and “overlapping”) were simulated in conditions where the cluster size () is reported for both the intersection and union part of the clusters. <i>Bottom</i>, for each method the average TPR and FPR () across 20 replicated datasets were calculated and reported for the simulated cluster densities. For C3D analysis (blue lines) we required each cluster to be detected with a misclassification error rate (MER) of 5% or 20% and . For WGCNA (red line) and DiffCoEx (green line) we considered two “default values” for the cut-off threshold, which were chosen according to the WGCNA guidelines (see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004006#pgen.1004006.s010" target="_blank">Text S1</a> for details).</p>
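    The averaging of TPR and FPR across replicated datasets described in this caption can be illustrated with a minimal sketch. It assumes that true and false positives are counted at the level of individual nodes (genes), which the caption does not state; the function names `tpr_fpr` and `average_rates` are hypothetical and are not taken from C3D, WGCNA or DiffCoEx.

```python
def tpr_fpr(true_nodes, detected_nodes, all_nodes):
    """Node-level true and false positive rates for one simulated dataset.

    Assumption: a node is a "positive" if it belongs to the simulated cluster.
    """
    true_nodes = set(true_nodes)
    detected_nodes = set(detected_nodes)
    negatives = set(all_nodes) - true_nodes
    tp = len(true_nodes & detected_nodes)   # cluster nodes that were recovered
    fp = len(negatives & detected_nodes)    # non-cluster nodes wrongly reported
    tpr = tp / len(true_nodes) if true_nodes else 0.0
    fpr = fp / len(negatives) if negatives else 0.0
    return tpr, fpr


def average_rates(replicates, all_nodes):
    """Average TPR and FPR over (true_nodes, detected_nodes) pairs."""
    rates = [tpr_fpr(t, d, all_nodes) for t, d in replicates]
    n = len(rates)
    return sum(r[0] for r in rates) / n, sum(r[1] for r in rates) / n


if __name__ == "__main__":
    nodes = range(10)
    reps = [({0, 1, 2, 3}, {0, 1, 2, 9}), ({0, 1, 2, 3}, {0, 1, 2, 3})]
    print(average_rates(reps, nodes))
```

    In the study the average would be taken over the 20 replicated datasets; the two toy replicates above only show the mechanics.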

    Co-expression clusters identified in all rat tissues.

    <p>For each rat cluster detected in all seven tissues we report the number of probe sets, the top five functional categories and their statistical significance (full list in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004006#pgen.1004006.s006" target="_blank">Table S2</a>), the summary of cell-type enrichment statistics expressed as (Benjamini and Hochberg (BH)-adjusted <i>p</i>-value, Cten analysis) and the graph with the significant protein-protein interactions (PPI), including the overall significance of the directed PPI network (DAPPLE analysis). The colour scale on the right indicates the significance of the detected PPI.</p>

    Multi-tissue Analysis of Co-expression Networks by Higher-Order Generalized Singular Value Decomposition Identifies Functionally Coherent Transcriptional Modules

    <div><p>Recent high-throughput efforts such as ENCODE have generated a large body of genome-scale transcriptional data in multiple conditions (e.g., cell-types and disease states). Leveraging these data is especially important for network-based approaches to human disease, for instance to identify coherent transcriptional modules (subnetworks) that can inform functional disease mechanisms and pathological pathways. Yet, genome-scale network analysis across conditions is significantly hampered by the paucity of robust and computationally-efficient methods. Building on the Higher-Order Generalized Singular Value Decomposition, we introduce a new algorithmic approach for efficient, parameter-free and reproducible identification of network-modules simultaneously across multiple conditions. Our method can accommodate weighted (and unweighted) networks of any size and can similarly use co-expression or raw gene expression input data, without hinging upon the definition and stability of the correlation used to assess gene co-expression. In simulation studies, we demonstrated distinctive advantages of our method over existing methods, which was able to recover accurately both common and condition-specific network-modules without entailing <i>ad-hoc</i> input parameters as required by other approaches. We applied our method to genome-scale and multi-tissue transcriptomic datasets from rats (microarray-based) and humans (mRNA-sequencing-based) and identified several common and tissue-specific subnetworks with functional significance, which were not detected by other methods. In humans we recapitulated the crosstalk between cell-cycle progression and cell-extracellular matrix interactions processes in ventricular zones during neocortex expansion and further, we uncovered pathways related to development of later cognitive functions in the cortical plate of the developing brain which were previously unappreciated. 
Analyses of seven rat tissues identified a multi-tissue subnetwork of co-expressed heat shock protein (Hsp) and cardiomyopathy genes (<i>Bag3</i>, <i>Cryab</i>, <i>Kras</i>, <i>Emd</i>, <i>Plec</i>), which was significantly replicated using separate failing heart and liver gene expression datasets in humans, thus revealing a conserved functional role for Hsp genes in cardiovascular disease.</p></div>

    Description of the cluster structures used in the simulation studies.

    <p>We simulated three cluster types: “common” (<i>Cluster pattern 1</i>), “nested” (<i>Cluster pattern 2</i>) and “overlapping” (<i>Cluster pattern 3</i>) that are shared across three or more conditions. For <i>Cluster pattern 2</i> and <i>Cluster pattern 3</i>, the “intersection cluster” is defined by the nodes in common to all conditions (red square) whereas the “union cluster” is defined by the nodes in common to all conditions plus the nodes present in individual conditions (black square).</p>

    <i>Rat cluster 1</i> shows co-expression between Hsp and cardiomyopathy genes which is conserved with human heart and liver tissues

    <p>(A) Network of 135 annotated rat genes identified by C3D as co-expressed in heart, aorta, liver and skeletal muscle tissues (). In each tissue we selected the top 5% of edges based on the (absolute) covariance between gene expression profiles and then calculated the average covariance across the four tissues. Edges are represented by lines connecting nodes (genes) and the thickness of the line is proportional to the average covariance value. Within the network, heat shock protein (Hsp) and cardiomyopathy genes are highlighted in blue and red, respectively. The Kendall correlations between the expression profiles of Hsp and cardiomyopathy genes are graphically represented as sub-networks separately for each tissue. Line thickness is proportional to the value of the Kendall correlation. (B) Enrichment for functional categories (, full list in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004006#pgen.1004006.s006" target="_blank">Table S2</a>) and for disease association (adjusted , details in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004006#pgen.1004006.s007" target="_blank">Table S3</a>). (C) Significant protein-protein interaction (PPI) network () where the Hsp and cardiomyopathy genes showing conserved PPI are highlighted (blue and red circles). (D) Conserved co-expression network detected in heart tissue samples from patients with advanced idiopathic or ischemic cardiomyopathy. The network includes all human orthologous genes of the genes in <i>rat cluster 1</i> that have significant edges by covariance selection (). (E) Conserved co-expression network detected in liver tissue samples from healthy volunteers. The network includes all human orthologous genes of the genes in <i>rat cluster 1</i> that have significant edges by covariance selection ().</p>
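    The edge selection in panel (A), top 5% of edges by absolute covariance in each tissue followed by averaging across tissues, might be sketched as follows. The caption does not say how the per-tissue selections are combined, so this sketch assumes that only edges selected in every tissue are kept; that rule, the function names and the toy covariance values are all illustrative assumptions, not the authors' implementation.

```python
import math


def top_edges(cov, fraction=0.05):
    """Return the edges whose absolute covariance ranks in the top `fraction`.

    `cov` maps an edge (a pair of gene names) to its covariance in one tissue.
    """
    k = max(1, math.ceil(fraction * len(cov)))
    ranked = sorted(cov, key=lambda edge: abs(cov[edge]), reverse=True)
    return set(ranked[:k])


def average_covariance(covs_by_tissue, fraction=0.05):
    """Average covariance across tissues for edges selected in every tissue.

    Assumption: intersection of the per-tissue selections (not in the caption).
    """
    selected = [top_edges(cov, fraction) for cov in covs_by_tissue.values()]
    shared = set.intersection(*selected)
    n = len(covs_by_tissue)
    return {edge: sum(cov[edge] for cov in covs_by_tissue.values()) / n
            for edge in shared}


if __name__ == "__main__":
    toy = {
        "heart": {("a", "b"): 0.9, ("a", "c"): -0.8, ("b", "c"): 0.1, ("c", "d"): 0.2},
        "liver": {("a", "b"): 0.7, ("a", "c"): 0.1, ("b", "c"): -0.6, ("c", "d"): 0.05},
    }
    # fraction=0.5 only because the toy network has four edges
    print(average_covariance(toy, fraction=0.5))
```

    The resulting average could then drive line thickness when drawing the network, as the caption describes.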

    Human co-expression cluster 1.

    <p><i>Top left</i>, each node in the network represents a gene and, in keeping with <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004006#pgen.1004006-Liu1" target="_blank">[61]</a>, for each gene we highlight significant up-regulation in VZ (red) or CP (green) as compared with the other neocortex regions. Genes that were not differentially expressed between neocortex regions are coloured in grey. Genes present in relevant KEGG pathways (p53 signaling, ECM-receptor interaction, Cell cycle and DNA replication) are extracted from the main network and highlighted. <i>Top right</i>, functional annotation for the network: top five significant GO biological processes and KEGG pathways (full list in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004006#pgen.1004006.s007" target="_blank">Table S3</a>). <i>Bottom left</i>, summary of cell-type enrichment analysis expressed as (Benjamini and Hochberg (BH)-adjusted <i>p</i>-value, Cten analysis). <i>Bottom right</i>, graph with the significant protein-protein interactions (PPI), including the overall significance of the directed PPI network (DAPPLE analysis, ). The colour scale on the right indicates the significance of the detected PPI.</p>

    Graphical Modeling of Gene Expression in Monocytes Suggests Molecular Mechanisms Explaining Increased Atherosclerosis in Smokers

    <div><p>Smoking is a risk factor for atherosclerosis with reported widespread effects on gene expression in circulating blood cells. We hypothesized that a molecular signature mediating the relation between smoking and atherosclerosis may be found in the transcriptome of circulating monocytes. Genome-wide expression profiles and counts of atherosclerotic plaques in carotid arteries were collected in 248 smokers and 688 non-smokers from the general population. Patterns of co-expressed genes were identified by Independent Component Analysis (ICA) and network structure of the pattern-specific gene modules was inferred by the PC-algorithm. A likelihood-based causality test was implemented to select patterns that fit models containing a path “smoking→gene expression→plaques”. Robustness of the causal inference was assessed by bootstrapping. At a FDR ≤0.10, 3,368 genes were associated to smoking or plaques, of which 93% were associated to smoking only. <em>SASH1</em> showed the strongest association to smoking and <em>PPARG</em> the strongest association to plaques. Twenty-nine gene patterns were identified by ICA. Modules containing <em>SASH1</em> and <em>PPARG</em> did not show evidence for the “smoking→gene expression→plaques” causality model. Conversely, three modules had good support for causal effects and exhibited a network topology consistent with gene expression mediating the relation between smoking and plaques. The network with the strongest support for causal effects was connected to plaques through <em>SLC39A8</em>, a gene with known association to HDL-cholesterol and cellular uptake of cadmium from tobacco, while smoking was directly connected to <em>GAS6</em>, a gene reported to have anti-inflammatory effects in atherosclerosis and to be up-regulated in the placenta of women smoking during pregnancy. 
Our analysis of the transcriptome of monocytes recovered genes relevant for association to smoking and atherosclerosis, and connected genes that were previously studied only in separate contexts. Inspection of correlation structure revealed candidates that would be missed by expression-phenotype association analysis alone.</p> </div>

    Subnetwork of PC skeleton for Module 21.

    <p>This graph represents a consensus network from 1000 bootstraps. Edges among variables are drawn if detected in at least 60% of bootstrapped samples. The recovery percentages are indicated to the right of the medial section of each edge. Line thickness is proportional to the edge's partial correlation. Black edges denote positive and pink edges negative partial correlations. Plaques and risk factors are in blue. Genes directly connected to smoking are in green and those directly connected to plaques are in orange. Other genes are in gray. Only genes that are involved in the shortest paths connecting smoking to plaques are shown. The full networks for this and other patterns are found in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0050888#pone.0050888.s017" target="_blank">Text S2</a>.</p>
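    The consensus rule in this caption, keeping an edge only if it is detected in at least 60% of bootstrapped samples, can be sketched as below. The sketch assumes each bootstrap run has already produced a list of undirected edges (for example from a skeleton-estimation step that is not shown here); the function name `consensus_edges` and the toy edges are hypothetical.

```python
from collections import Counter


def consensus_edges(bootstrap_edge_sets, min_fraction=0.60):
    """Return {edge: recovery fraction} for edges found often enough.

    Each element of `bootstrap_edge_sets` is an iterable of node pairs
    detected in one bootstrap sample. Edges are treated as undirected.
    """
    n = len(bootstrap_edge_sets)
    counts = Counter()
    for edges in bootstrap_edge_sets:
        # A set of frozensets counts each undirected edge once per bootstrap
        counts.update({frozenset(edge) for edge in edges})
    return {edge: c / n for edge, c in counts.items() if c / n >= min_fraction}


if __name__ == "__main__":
    runs = [
        [("S", "G"), ("G", "P"), ("S", "P")],
        [("S", "G"), ("G", "P"), ("S", "P")],
        [("G", "S"), ("G", "P")],
        [("S", "G")],
        [],
    ]
    for edge, frac in consensus_edges(runs).items():
        print(sorted(edge), f"{100 * frac:.0f}%")
```

    The recovery fractions returned here correspond to the percentages printed beside each edge in the figure; with the 1000 bootstraps mentioned in the caption, `bootstrap_edge_sets` would simply hold 1000 entries.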

    Characteristics of the Gutenberg Health Study population.

    †<p>p-values were calculated from a χ<sup>2</sup> test for smoking and diabetes (number of subjects), and from an F test for all others. Standard errors or percentages of individuals are in parentheses.</p>

    Graphical models for equivalence classes tested among smoking, gene expression and atherosclerotic plaques.

    <p>Variables are represented by squares and causal associations are indicated by directed edges among nodes. Undirected edges indicate bidirected edges. The two classes colored in brown represent the causal models of interest where gene expression (<i>G</i>) mediates the association between smoking (<i>S</i>) and plaques (<i>P</i>). In class (f), the covariation between <i>S</i> and <i>P</i> is entirely explained by <i>G</i>, whereas in class (k), there is residual covariation between <i>S</i> and <i>P</i> after conditioning on <i>G</i>.</p>