
    Drug Repositioning through Systematic Mining of Gene Coexpression Networks in Cancer

    <div><p>Gene coexpression network analysis is a powerful “data-driven” approach essential for understanding cancer biology and mechanisms of tumor development. Yet, despite the completion of thousands of studies on cancer gene expression, there have been few attempts to normalize and integrate coexpression data from scattered sources in a concise “meta-analysis” framework. We generated such a resource by exploring gene coexpression networks in 82 microarray datasets from 9 major human cancer types. The analysis was conducted using an elaborate weighted gene coexpression network analysis (WGCNA) methodology and identified over 3,000 robust gene coexpression modules. The modules covered a range of known tumor features, such as proliferation, extracellular matrix remodeling, hypoxia, inflammation, angiogenesis, tumor differentiation programs, specific signaling pathways, genomic alterations, and biomarkers of individual tumor subtypes. To prioritize genes with respect to those tumor features, we ranked genes within each module by connectivity, which identified module-specific, functionally prominent hub genes. To showcase the utility of this network information, we positioned known cancer drug targets within the coexpression networks and predicted that Anakinra, an anti-rheumatoid therapeutic agent, may be promising for development in colorectal cancer. We offer a comprehensive, normalized, and well-documented collection of more than 3,000 gene coexpression modules in a variety of cancers as a rich data resource to facilitate further progress in cancer research.</p></div>

    Modules in the GSE20685 breast cancer dataset.

    <p>GSE20685 was the largest breast cancer dataset analyzed here and includes 327 patients. The coexpression network identified 50 modules in this dataset. This heatmap displays the expression patterns of genes in each module, with genes in rows and patients in columns. To keep the visualization compact, modules larger than 250 genes (M1–M4) are represented by only their 250 most highly connected genes. For selected modules, key biological functions are specified, with corresponding enrichment P-values.</p>

    Gene connectivity in the proliferation module: highly connected genes are associated with relevant biology and poor survival prognosis.

    <p>Panels A and B correspond to the GSE20685 dataset (the largest breast cancer dataset in our study); panels C and D correspond to GSE21653 (the second largest dataset). A and C: proportion of genes related to the cell cycle GO process in a 50-gene window sliding from lowly to highly connected genes. B and D: scatter plots of gene connectivity against the power of a gene to predict survival. The predictive power of a gene was defined as −log(P) from Cox univariate survival regression. Spearman correlations and statistical significance values are shown.</p>
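The two quantities plotted in these panels can be illustrated with a minimal sketch. It is not the authors' code: the function names are hypothetical, the logarithm base is assumed to be 10 (the caption does not state it), and the Cox P-values are taken as precomputed inputs.

```python
import math

def sliding_window_fraction(flags, window=50):
    """Fraction of flagged genes (e.g. cell cycle genes) in each window.

    `flags` is a list of 0/1 values, already ordered from the least to the
    most connected gene. The window advances one gene at a time.
    """
    return [sum(flags[i:i + window]) / window
            for i in range(len(flags) - window + 1)]

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for t in range(pos, end + 1):
            ranks[order[t]] = avg
        pos = end + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)

def predictive_power(p_value):
    """-log10(P); base 10 is an assumption, the caption only says -log(P)."""
    return -math.log10(p_value)

# Toy data (made up): connectivity per gene and a Cox P-value per gene.
connectivity = [0.2, 1.5, 3.1, 4.8]
cox_p = [0.5, 0.2, 0.01, 0.001]
rho = spearman(connectivity, [predictive_power(p) for p in cox_p])
```

In this toy input the P-values fall as connectivity rises, so the rank correlation between connectivity and predictive power is positive.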

    Module enrichments with chromosomal cytobands.

    <p>Numbers on the outer side of the circle are chromosomes. Coordinates within each chromosome are genomic coordinates. Bar height on the inner side of the circle is proportional to the number of modules from a given cancer type that are enriched for the respective cytoband at P < 10<sup>−3</sup>. Dark red: breast cancer; red: colon cancer; magenta: glioma; pink: lung cancer; orange: ovarian cancer; yellow: prostate cancer; brown: kidney cancer; dark green: gastric cancer; light green: melanoma. The visualization was produced using Circos software [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0165059#pone.0165059.ref050" target="_blank">50</a>].</p>

    Workflow overview.

    <p>In each dataset, the following workflow was applied. 1. The dataset was used as a starting point to construct a gene coexpression network based on Topological Overlap (TO) between genes. TO measures the similarity between gene expression profiles while taking the systems-level context into account. The network was then hierarchically clustered, resulting in a cluster dendrogram. 2. Using the DynamicTreeCut algorithm, branches were identified in the dendrogram, yielding gene coexpression modules. 3. Genes in each module were further prioritized by intramodular connectivity, which distinguishes lowly connected from highly connected genes. The entire workflow was repeated independently for each of the 82 datasets, resulting in a separate set of gene coexpression modules for each dataset.</p>
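Steps 1 and 3 of this workflow can be sketched in a few lines of plain Python. This is a minimal illustration and not the authors' pipeline: it assumes an unsigned network with adjacency |r|^beta, an illustrative soft-threshold beta = 6, and the standard weighted topological overlap formula. The function names are hypothetical, and the clustering and DynamicTreeCut steps are omitted.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def adjacency(expr, beta=6):
    """Unsigned weighted adjacency a_ij = |cor(i, j)|^beta, zero diagonal.

    `expr` is a list of gene profiles (one list of sample values per gene).
    beta=6 is an assumed value, not taken from the source.
    """
    g = len(expr)
    return [[0.0 if i == j else abs(pearson(expr[i], expr[j])) ** beta
             for j in range(g)] for i in range(g)]

def topological_overlap(adj):
    """TO_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    g = len(adj)
    k = [sum(row) for row in adj]  # whole-network connectivity per gene
    tom = [[1.0] * g for _ in range(g)]
    for i in range(g):
        for j in range(g):
            if i != j:
                shared = sum(adj[i][u] * adj[u][j] for u in range(g))
                tom[i][j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
    return tom

def intramodular_connectivity(adj, module):
    """Sum of adjacencies between each gene and the other genes of its module."""
    return {i: sum(adj[i][j] for j in module if j != i) for i in module}

# Toy example (made-up values): genes 0 and 1 are perfectly correlated,
# gene 2 is uncorrelated with both.
expr = [[1, 2, 3, 4], [2, 4, 6, 8], [1, -1, -1, 1]]
adj = adjacency(expr)
tom = topological_overlap(adj)
conn = intramodular_connectivity(adj, [0, 1, 2])
hubs = sorted(conn, key=conn.get, reverse=True)  # most connected genes first
```

In a full analysis, 1 − TO would serve as the dissimilarity passed to hierarchical clustering, and modules would come from cutting the resulting dendrogram. Sorting genes by intramodular connectivity, as in the last line, is one way to obtain the lowly to highly connected ordering described in step 3.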

    Cross-dataset high level functional landscape.

    <p>This heatmap displays associations between gene coexpression modules and biological processes across all the datasets. Color denotes the enrichment of a given module for a biological process, expressed as the log of the hypergeometric p-value after Benjamini-Hochberg adjustment. Cluster height reflects how many interrelated processes are associated with the given module set: the taller a cluster, the broader the module-associated functional theme. Cluster width reflects how many modules share this function: the wider a cluster, the more frequently this function is found in the GEO datasets. For major clusters, key biological themes are labeled. The heatmap includes 1,240 biological processes and 668 modules, which were selected as follows. A GO process was included if it is associated with 3 or more coexpression modules (P < 0.001). A module was included if it is enriched for 3 or more biological process terms (P < 0.001).</p>
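The enrichment statistic named in this caption, a hypergeometric p-value followed by Benjamini-Hochberg adjustment, can be sketched as follows. This is a minimal illustration with hypothetical function names and made-up gene counts, and it assumes a one-sided (over-representation) test. The source does not say which software computed these values.

```python
from math import comb

def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k): chance of seeing at least k annotated genes in a module.

    N = genes in the background, K = background genes annotated to the
    process, n = genes in the module, k = annotated genes in the module.
    """
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1))
    return hits / total

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Made-up counts: one 40-gene module tested against three processes in a
# 20,000-gene background. Each tuple is (k, K).
tests = [(12, 300), (3, 500), (1, 50)]
raw = [hypergeom_upper_tail(k, 20000, K, 40) for k, K in tests]
adj_p = benjamini_hochberg(raw)
```

A heatmap cell would then show the logarithm of the adjusted value for each module and process pair, and the P < 0.001 inclusion rule in the caption would be applied as a filter on these p-values.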