8 research outputs found

    NetMiner-an ensemble pipeline for building genome-wide and high-quality gene co-expression network using massive-scale RNA-seq samples

    <div><p>Accurately reconstructing gene co-expression networks is of great importance for uncovering the genetic architecture underlying complex and diverse phenotypes. The recent availability of high-throughput RNA-seq has made genome-wide detection and quantification of novel, rare and low-abundance transcripts practical. However, its potential for reconstructing gene co-expression networks has not yet been well explored. Using massive-scale RNA-seq samples, we designed an ensemble pipeline, called NetMiner, for building a genome-scale, high-quality Gene Co-expression Network (GCN) by integrating three frequently used inference algorithms. We constructed an RNA-seq-based GCN in rice, a monocot species. The quality of the network obtained by our method was verified and evaluated against curated gene functional association data sets, and it clearly outperformed the network from each single method. In addition, the capability of the network for associating genes with functions and agronomic traits was shown by enrichment analysis and case studies. In particular, we demonstrated the potential value of our proposed method for predicting the biological roles of unknown protein-coding genes, long non-coding RNA (lncRNA) genes and circular RNA (circRNA) genes. Our results provide a valuable and highly reliable data source for selecting key candidate genes for subsequent experimental validation. To facilitate identification of novel genes regulating important biological processes and phenotypes in other plants or animals, we have published the source code of NetMiner, making it freely available at <a href="https://github.com/czllab/NetMiner" target="_blank">https://github.com/czllab/NetMiner</a>.</p></div>

    Co-expression subnetwork derived from guide-gene approach for <i>XLOC_057324</i> associated with panicle development and fertility.

    <p>Within the subnetwork, red nodes represented experimentally verified genes related to the corresponding biological functions. Chrysoidine nodes represented transcription factors. Pink nodes indicated genes whose <i>Arabidopsis thaliana</i> homologs were experimentally verified to be related to the corresponding biological functions. Yellow nodes represented potentially function-related genes. Green nodes denoted lncRNA genes, and gray nodes indicated genes of unknown function or annotated with unrelated functions.</p>

    Subnetworks derived from the known <i>cis</i>-regulatory motif-guide approach.

    <p>A) WTTSSCSS combined with the E2F transcription factors involved in the cell cycle. B) TTGACY combined with the WRKY transcription factors involved in stress response. Within each subnetwork, red nodes represented experimentally verified genes related to the corresponding biological functions. Pink nodes indicated genes whose <i>Arabidopsis thaliana</i> homologs were experimentally verified to be associated with the corresponding biological functions. Yellow nodes denoted potentially function-related genes. Gray nodes indicated genes with unknown functions or annotated with irrelevant functions. The size of each node was proportional to the number of connected genes.</p>

    Subnetworks derived from the gene-guide approach.

    <p>The subnetworks included all other nodes within two-layer connections from the guide genes. A) <i>OsMADS16</i>, involved in flower development; B) <i>OsCESA4</i>, involved in cell wall biosynthesis. Within each subnetwork, red nodes represented experimentally verified genes related to the corresponding biological functions. Pink nodes indicated genes whose <i>Arabidopsis</i> homologs were experimentally verified to be related to the corresponding biological processes. Yellow nodes represented potentially function-related genes, and gray nodes denoted genes with unknown functions or annotated with irrelevant functions. The size of each node was proportional to the number of connected genes.</p>
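
The "two-layer connections from guide genes" rule in the caption above can be read as a breadth-first search limited to two hops. The following is a minimal sketch under that assumption, not NetMiner's own code: the function name `guide_subnetwork`, the edge-list input, and the toy gene names are all hypothetical. It also counts the connections of each node inside the subnetwork, which is the quantity the caption ties to node size.

```python
from collections import deque


def guide_subnetwork(edges, guide, max_layers=2):
    """Collect nodes within `max_layers` hops of `guide` in an undirected network.

    Returns (distance, sub_edges, degree):
      distance  - hop count from the guide gene for every retained node
      sub_edges - edges whose two endpoints are both retained
      degree    - number of connected genes per node inside the subnetwork
    """
    # Build an undirected adjacency map from the edge list (self-loops ignored).
    adjacency = {}
    for a, b in edges:
        if a == b:
            continue
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    # Breadth-first search that stops expanding at the layer limit.
    distance = {guide: 0}
    queue = deque([guide])
    while queue:
        node = queue.popleft()
        if distance[node] == max_layers:
            continue
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)

    # Keep only edges inside the retained node set, stored in sorted order
    # so that (a, b) and (b, a) are counted once.
    sub_edges = {
        tuple(sorted((a, b)))
        for a, b in edges
        if a != b and a in distance and b in distance
    }

    degree = {node: 0 for node in distance}
    for a, b in sub_edges:
        degree[a] += 1
        degree[b] += 1
    return distance, sub_edges, degree


# Toy network with placeholder gene names (not data from the paper).
toy_edges = [("guide", "g1"), ("g1", "g2"), ("g2", "g3"), ("guide", "g4")]
distance, sub_edges, degree = guide_subnetwork(toy_edges, "guide")
```

In the toy network, `g3` sits three hops from the guide gene and is therefore left out, while `g2` at two hops is kept.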

    Enrichment folds of different algorithms for co-expression network inference.

    <p>A) Comparison with GGM for standard positive links. B) Comparison with WGCNA for standard positive links. C) Comparison with BC3NET for standard positive links. D) Comparison with GGM for standard negative links. E) Comparison with WGCNA for standard negative links. F) Comparison with BC3NET for standard negative links. In the legends, RAW, FPKM, UQ, TMM, RLE and VST denoted the networks obtained from a single RNA-seq data set; INT indicated the intra-method consensus networks established by integrating the predictions from different RNA-seq data sets; EBM denoted the high-quality gene co-expression network obtained by integrating all intra-method consensus networks.</p>

    Performance evaluation of our network for predicting gene function.

    <p>A) Receiver Operating Characteristic (ROC) curve. B) Precision-Recall (PR) curve. In the legends, Not-weighted indicated that the evaluation parameters were calculated by the standard method of the CAFA project; Weighted indicated that the evaluation parameters were calculated by the weighted method of the CAFA project.</p>