34 research outputs found

    MOESM1 of Differential correlation for sequencing data

    No full text
    Additional file 1. Additional figures and tables

    The networks between the hub-genes whose degree changed from IL and PL to LHB.

    No full text
    The top row corresponds to the genes whose degree decreased from IL and PL to LHB, while the bottom row corresponds to the genes whose degree increased.

    The run-times of different methods (in seconds) with the genes left after pruning based on different CV cut-offs.

    No full text
    λ1 and λ2 were kept at 0.01 and 0.02, respectively. The mark “X” means that we could not run those methods because of the inordinate amount of time required.

    Limitations of clustering with PCA and correlated noise

    No full text
    It is now common to have a modest to large number of features on individuals with complex diseases. Unsupervised analyses, such as clustering with and without preprocessing by Principal Component Analysis (PCA), are widely used in practice to uncover subgroups in a sample. However, in many modern studies features are often highly correlated and noisy (e.g. SNPs, -omics, quantitative imaging markers, and electronic health record data). The practical performance of clustering approaches in these settings remains unclear. Through extensive simulations and empirical examples applying Gaussian Mixture Models and related clustering methods, we show that these approaches (including variants of k-means, VarSelLCM, HDClassifier, and Fisher-EM) can perform very poorly in many settings. We also show that the poor performance is often driven by an explicit or implicit assumption in the clustering algorithm that high-variance features are relevant while lower-variance features are irrelevant, which we call the variance-as-relevance assumption. We develop practical pre-processing approaches that improve analysis performance in some cases. This work offers practical guidance on the strengths and limitations of unsupervised clustering approaches in modern data analysis applications.

    WGCNA Network Summary

    No full text
    ¹Significant association with alcohol consumption is defined as FDR < 0.05 or Fisher's unadjusted p-value < 0.01 for the association between module eigengene and alcohol consumption (Rodriguez et al., 1994; Phillips et al., 1994).

    Reproducibility of Candidate Modules and Conservation of Candidate Modules across Brain Regions and Whole Brain.

    No full text
    Conservation of candidate coexpression modules across individual brain regions and whole brain is represented by a Z summary score (color scale: 0 (black) to 10 (bright red)) (Langfelder et al., 2011; see text). In this graphic, Z summary scores above 10 are truncated to 10. The coexpression modules on the vertical axis are followed by an abbreviation indicating the network from which the module is derived: wb, whole brain; cer, cerebellum; hip, hippocampus; na, nucleus accumbens; pfc, prefrontal cortex; str, striatum; vta, ventral tegmental area. For each module, the Z summary score for conservation within each of the other datasets is shown. In addition, the average bootstrapped Z summary score is illustrated for the dataset from which the module was originally derived (this represents the reproducibility of the candidate module in its original dataset). *Average Z summary score for reproducibility is within one SD of 2.

    Characteristics of candidate modules associated with alcohol consumption.

    No full text
    Candidate modules from the whole brain network and each brain regional network are shown. The first column gives the network from which the candidate module was derived, and the second column gives the module name. The direction of the correlation is not reported because these are unsigned networks. N/A indicates there were no common eQTLs among the probesets. The mQTL location reports the chromosome and Mb location for the highest peak.

    Eigengene Network.

    No full text
    The eigengene network dendrogram was constructed based on a distance of (1-TOM) (see text). The red line ([1-TOM] = 0.5) represents the criterion used for defining the meta-modules. Eigengenes colored grey were not assigned to a meta-module. The names of the candidate modules are followed by an abbreviation indicating the network from which these modules were derived: WB, whole brain; NA, nucleus accumbens; VTA, ventral tegmental area; PFC, prefrontal cortex; CER, cerebellum; HIP, hippocampus; STR, striatum.

    Flow Chart of Analysis Procedure for Whole Brain (A) and Brain Regional (B) Microarray Data.

    No full text
    A. Whole brain microarray data were filtered for SNPs between C57BL/6 and DBA/2 mice, and for expression above background levels. The remaining probesets were subjected to WGCNA, and the resulting coexpression modules were filtered by correlation of eigengene with alcohol consumption data, followed by determination of overlap of mQTLs and alcohol bQTLs, to identify “candidate modules”. B. Microarray data for the indicated brain regions were obtained from GeneNetwork (http://www.genenetwork.org) and subjected to WGCNA (using the same probesets as were used for the whole brain data). Candidate modules were identified and characterized within each network, and were used to create an eigengene network that demonstrates gene coexpression within and between brain regions.