42 research outputs found

    The classification of breast cancer patients by the proposed tree model.

    In each plot, the considered predictors include all three negative MMSs and the most significant (or top) k (5 or 10) positive MMSs as summarized in Table 1 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0112561#pone-0112561-t001). The purple, red and blue curves represent the predicted poor, good, and intermediate-prognosis groups, respectively.

    Somatic Mutations Favorable to Patient Survival Are Predominant in Ovarian Carcinomas

    Somatic mutation accumulation is a major cause of abnormal cell growth. However, some mutations in cancer cells may be deleterious to the survival and proliferation of those cells, thus offering a protective effect to the patients. We investigated this hypothesis via a unique analysis of the clinical and somatic mutation datasets of ovarian carcinomas published by The Cancer Genome Atlas. We defined and screened 562 macro mutation signatures (MMSs) for their associations with the overall survival of 320 ovarian cancer patients. Each MMS measures the number of mutations present on the member genes (except for TP53) covered by a specific Gene Ontology (GO) term in each tumor. We found that somatic mutations favorable to patient survival are predominant in ovarian carcinomas compared with those indicating poor clinical outcomes. Specifically, we identified 19 (3) predictive MMSs that are associated, usually through a nonlinear dose-dependent effect, with good (poor) patient survival. The false discovery rate for the 19 “positive” predictors is at the level of 0.15. The GO terms corresponding to these MMSs include “lysosomal membrane” and “response to hypoxia”, each of which is relevant to the progression and therapy of cancer. Using these MMSs as features, we established a classification tree model that effectively partitions the training samples into three prognosis groups with respect to survival time. We validated this model on an independent dataset of the same disease (log-rank p-value < 2.3×10⁻⁴) and on a breast cancer dataset (log-rank p-value < 9.3×10⁻³). We compared the GO terms corresponding to these MMSs with those enriched with expression-based predictive genes. The analysis showed that the GO term pairs with large similarity are mainly pertinent to proteins located on the cell organelles responsible for material transport and waste disposal, suggesting a crucial role of these proteins in cancer mortality.
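The MMS definition in the abstract above (a per-tumor count of mutations on the member genes of a GO term, with TP53 excluded) can be illustrated with a minimal Python sketch. The function name, the input layout (one gene symbol per mutation), and the toy gene sets are assumptions made for illustration only; the abstract does not describe the authors' data format or implementation.

```python
def macro_mutation_signature(mutated_genes, go_members, excluded=("TP53",)):
    """Count the mutations in one tumor that fall on member genes of a GO term.

    mutated_genes: list of gene symbols, one entry per somatic mutation
                   (assumed layout: a gene with two mutations appears twice).
    go_members:    gene symbols annotated to the GO term.
    excluded:      genes left out of every signature (TP53, per the abstract).
    """
    members = set(go_members) - set(excluded)
    return sum(1 for gene in mutated_genes if gene in members)


# Hypothetical toy data: neither the tumors nor the gene set come from the paper.
tumor_mutations = {
    "tumor_A": ["LAMP1", "TP53", "LAMP2", "LAMP1"],
    "tumor_B": ["TP53"],
}
go_term_members = {"LAMP1", "LAMP2", "TP53"}

mms = {
    tumor: macro_mutation_signature(genes, go_term_members)
    for tumor, genes in tumor_mutations.items()
}
```

Here `tumor_A` scores 3 (its TP53 mutation is ignored) and `tumor_B` scores 0. Counting every mutation rather than every mutated gene is one reading of "the number of mutations present on the member genes".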

    The illustration of the dose-dependent effect of somatic mutations on survival outcomes.

    Each plot shows the relationship between overall survival in months and a specific macro mutation signature (MMS) that corresponds to a GO term. The purple curve represents the patients who each have at least two somatic mutations on the member genes of the indicated MMS (i.e., GO term). The red curve represents the patients who each have one somatic mutation on the member genes of the indicated MMS. The blue curve represents the patients without any somatic mutation on the member genes of the indicated MMS.
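The three curves described in this caption correspond to a simple three-way split of patients by their MMS value. The sketch below shows that split; the function name and group labels are hypothetical, and only the thresholds (0, 1, at least 2) and the curve colors come from the caption.

```python
def dose_group(mms_value):
    """Map an MMS value (mutation count) to the caption's three patient groups."""
    if mms_value < 0:
        raise ValueError("a mutation count cannot be negative")
    if mms_value == 0:
        return "no mutation (blue curve)"
    if mms_value == 1:
        return "one mutation (red curve)"
    return "two or more mutations (purple curve)"


# Hypothetical MMS values for four patients.
groups = [dose_group(v) for v in (0, 1, 2, 5)]
```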

    The profile for the associations between the somatic mutations and survival time of patients with ovarian cancer.

    A (B): The Q-Q plot of the p-values from the log-rank test (Cox-PH regression) for the 562 considered MMSs. C: The volcano plot of the Cox-PH p-values and regression coefficients for the 562 considered MMSs. The horizontal dotted line marks p = 0.05. D: The Venn diagram for the entire set of genes covered by the 22 selected MMSs. Specifically, the good (bad) genes are the genes involved in the GO terms corresponding to the 19 (3) positive (negative) MMSs, which predict good (poor) clinical outcomes. A gene can belong to both positive and negative MMSs and may therefore be double counted. E: The Venn diagram for the subset of the genes covered by the 22 selected MMSs that are mutated in at least one training sample.

    Top SNP-induced gene network modules.

    a: NCBI RefSeq ID of the genes that are cis-located with eQTL (sQTL) SNPs. b: x indicates that an association between the gene and a disease has been reported in the literature, as collected by DAVID [28] (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0078868#pone.0078868-Huang2).

    The summary of significant MMSs for the overall survival of patients with Ov-HGSCs.

    β: the regression coefficient estimated by the Cox-PH model. CP: the composite p-value, which is the square root of the product of the log-rank test p-value and the corresponding Cox-PH p-value. N1: the number of member genes covered by the corresponding MMS or GO term. N2: the number of mutated member genes present in at least one training sample. Note that a single gene can be covered by more than one GO term.
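The composite p-value (CP) defined in this table note is the geometric mean of the two test p-values. A minimal Python sketch follows; the function name and the example inputs are illustrative assumptions, not values from the paper.

```python
import math


def composite_p_value(logrank_p, cox_p):
    """Square root of the product of the log-rank and Cox-PH p-values."""
    for p in (logrank_p, cox_p):
        if not 0.0 <= p <= 1.0:
            raise ValueError("p-values must lie in [0, 1]")
    return math.sqrt(logrank_p * cox_p)


# Hypothetical inputs: a log-rank p-value of 0.01 and a Cox-PH p-value of 0.04.
cp = composite_p_value(0.01, 0.04)
```

Because CP is a geometric mean, it always lies between the two input p-values, so a signature scores well only when both tests agree.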

    The classification of the training set of ovarian cancer patients by the proposed tree model.

    In each plot, the considered predictors include all three negative MMSs and the most significant (or top) k (5 or 10) positive MMSs as summarized in Table 1 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0112561#pone-0112561-t001). The purple, red and blue curves represent the predicted poor, good, and intermediate-prognosis groups, respectively.

    Inferring Polymorphism-Induced Regulatory Gene Networks Active in Human Lymphocyte Cell Lines by Weighted Linear Mixed Model Analysis of Multiple RNA-Seq Datasets

    Single-nucleotide polymorphisms (SNPs) contribute to the between-individual expression variation of many genes. A regulatory (trait-associated) SNP is usually located near or within a (host) gene, possibly influencing the gene’s transcription and/or post-transcriptional modification, but its targets may also include genes that are physically farther away from it. A heuristic explanation of such multiple-target interference is that the host gene transfers the SNP genotypic effects to the distant gene(s) through a transcriptional or signaling cascade. These connections between the host genes (regulators) and the distant genes (targets) make the genetic analysis of gene expression traits a promising approach for identifying unknown regulatory relationships. In this study, through a mixed model analysis of multi-source digital expression profiling for 140 human lymphocyte cell lines (LCLs) and the genotypes distributed by the International HapMap Project, we identified 45 thousand potential SNP-induced regulatory relationships among genes (the significance level for the underlying associations between expression traits and SNP genotypes was set at FDR < 0.01). We grouped the identified relationships into four classes (paradigms) according to the two mechanisms by which the regulatory SNPs affect their cis- and trans-regulated genes: modifying mRNA level or altering transcript splicing patterns. We further organized the relationships in each class into a set of network modules with the cis-regulated genes as hubs. We found that the target genes in a network module were often characterized by significant functional similarity, and that the distributions of the target genes in three of the four networks roughly resemble a power law, a typical pattern of gene networks obtained from mutation experiments. Through two case studies, we also demonstrated that significant biological insights can be inferred from the identified network modules.

    The distribution profiles of eQTL SNPs and sQTL SNPs across different genomic regions.

    In plot A, the result is summarized according to the involved genes (RefSeq mRNAs). In plot B, the result is summarized according to the involved SNPs. In the bar charts, the quantities for the entire set of eQTL (sQTL) SNPs are represented by black bars, and the quantities for the tag-SNPs (gene-wide most significant SNPs) are represented by grey bars. U0-1K/D0-1K represents the region 0-1 kilobases upstream/downstream of a RefSeq gene, and U1-20K/D1-20K represents the region 1-20 kilobases upstream/downstream of a RefSeq gene. Plots C-D are drawn for eQTLs and plots E-F are drawn for sQTLs. In plots C and E, “proportion” is the ratio of the number of eQTL (sQTL) SNPs in the corresponding region to the total number of eQTL (sQTL) SNPs. In plots D and F, “density index” is calculated by dividing the proportion of eQTL (sQTL) SNPs by the average length (in kilobases) of the corresponding genomic region.
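The "proportion" and "density index" quantities defined in this caption reduce to two divisions. The Python sketch below follows the caption's wording; the function name, argument names, and example numbers are hypothetical and are not taken from the paper.

```python
def density_index(n_region_snps, n_total_snps, avg_region_length_kb):
    """Proportion of QTL SNPs in a region divided by the region's average length (kb)."""
    if n_total_snps <= 0 or avg_region_length_kb <= 0:
        raise ValueError("total SNP count and region length must be positive")
    proportion = n_region_snps / n_total_snps  # the "proportion" of plots C and E
    return proportion / avg_region_length_kb   # the "density index" of plots D and F


# Hypothetical example: 50 of 1000 eQTL SNPs fall in a region averaging 2 kb.
example = density_index(50, 1000, 2.0)
```

Normalizing by region length lets a short region such as U0-1K be compared with a much longer one such as U1-20K, which would otherwise collect more SNPs simply because it is larger.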