41 research outputs found

    New Method for Joint Network Analysis Reveals Common and Different Coexpression Patterns among Genes and Proteins in Breast Cancer

    No full text
    We focus on characterizing common and different coexpression patterns among RNAs and proteins in breast cancer tumors. To address this problem, we introduce Joint Random Forest (JRF), a novel nonparametric algorithm to simultaneously estimate multiple coexpression networks by effectively borrowing information across protein and gene expression data. The performance of JRF was evaluated through extensive simulation studies using different network topologies and data distribution functions. Advantages of JRF over other algorithms that estimate class-specific networks separately were observed across all simulation settings. JRF also outperformed a competing method based on Gaussian graphical models. We then applied JRF to simultaneously construct gene and protein coexpression networks based on protein and RNAseq data from the CPTAC-TCGA breast cancer study. We identified interesting common and differential coexpression patterns among genes and proteins. This information can help to cast light on the potential disease mechanisms of breast cancer.

    Additional file 2: of Inter-tissue coexpression network analysis reveals DPP4 as an important gene in heart to blood communication

    No full text
    Supporting notes. Figure S1. The optimal numbers of principal components (PCs) to correct in each tissue. Figure S2. Histograms of correlation coefficients between sample ischemic time and RINs with gene expression profiles in nine tissues. Red lines are for correlation with RINs, and blue lines are for correlation with sample ischemic time. Solid lines are for empirical gene expression profiles in the study, dashed lines are for permuted data. (DOCX 500 kb)

    Sample alignment with MODMatcher.

    No full text
    Initial labels of samples are used to determine cis pairs, which are then used to calculate similarity scores. Based on the similarity scores determined with three data types, the molecular data are matched with each other (1) by gender, (2) by cis-eSNPs, (3) by cis-mSNPs, (4) by cis mRNA-methylation pairs, and (5) by all trio mapping. Then, updated sample pairs are used to calculate new cis pairs for another round of alignment. Rounds of alignment are repeated until there are no further changes.
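The iterate-until-stable loop described in this caption can be sketched roughly as follows. This is a minimal illustration for two data types only, not the MODMatcher implementation: the function name `align_samples`, the `top_k` cutoff, the use of Pearson correlation, and the greedy best-match rule are all assumptions made for the example.

```python
import numpy as np

def zscore(m):
    """Standardize each column (feature) across samples."""
    return (m - m.mean(axis=0)) / m.std(axis=0)

def align_samples(a, b, top_k=200, max_rounds=10):
    """Hypothetical two-data-type alignment loop.

    a, b: (n_samples, n_pairs) arrays; column j of a and column j of b
    form one candidate cis pair (an assumption of this sketch).
    Returns mapping, where mapping[i] is the row of b matched to row i of a.
    """
    za, zb = zscore(a), zscore(b)
    mapping = np.arange(a.shape[0])  # start from the initial sample labels
    for _ in range(max_rounds):
        # 1. Score candidate pairs under the current alignment (Pearson r).
        r = (za * zb[mapping]).mean(axis=0)
        cis = np.argsort(-np.abs(r))[:top_k]
        # 2. Similarity of every a-profile to every b-profile over the
        #    selected pairs; the sign of r orients negative associations.
        sim = (za[:, cis] * np.sign(r[cis])) @ zb[:, cis].T / len(cis)
        # 3. Re-match each a-profile to its best-scoring b-profile.
        new_mapping = sim.argmax(axis=1)
        if np.array_equal(new_mapping, mapping):
            break  # no further changes: alignment has converged
        mapping = new_mapping
    return mapping
```

The greedy `argmax` step can assign one profile to several partners; a real pipeline would likely require reciprocal best matches and a score threshold before relabeling anything.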

    MODMatcher: Multi-Omics Data Matcher for Integrative Genomic Analysis

    No full text
    Errors in sample annotation or labeling often occur in large-scale genetic or genomic studies and are difficult to avoid completely during data generation and management. For integrative genomic studies, it is critical to identify and correct these errors. Different types of genetic and genomic data are inter-connected by cis-regulations. On that basis, we developed a computational approach, Multi-Omics Data Matcher (MODMatcher), to identify and correct sample labeling errors in multiple types of molecular data, which can be used in further integrative analysis. Our results indicate that inspection of sample annotation and labeling error is an indispensable data quality assurance step. Applied to a large lung genomic study, MODMatcher increased statistically significant genetic associations and genomic correlations by more than two-fold. In a simulation study, MODMatcher provided more robust results by using three types of omics data than two types of omics data. We further demonstrate that MODMatcher can be broadly applied to large genomic data sets containing multiple types of omics data, such as The Cancer Genome Atlas (TCGA) data sets.

    Examples of sample alignment in the TCGA BRCA data set.

    No full text
    (A) A similarity score distribution of a correctly labeled profile. The red star indicates the similarity score between self-matched profile pairs (gene expression and methylation data profiles are labeled as pertaining to the same sample). (B) Similarity scores of self-matched pairs (red stars) between gene expression and methylation profiles for two samples are lower than the similarity scores of cross-matched pairs (blue stars).
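The comparison in panel (B), self-matched score versus cross-matched scores, can be expressed as a simple check on a similarity matrix. The sketch below assumes a square matrix whose diagonal holds the self-matched scores; the function name `flag_mislabeled` and the example values are hypothetical.

```python
import numpy as np

def flag_mislabeled(sim):
    """Return indices of samples whose self-matched score is beaten by
    at least one cross-matched score.

    sim[i, k]: similarity between expression profile i and methylation
    profile k; the diagonal holds self-matched pairs (assumed layout).
    """
    sim = np.asarray(sim, dtype=float)
    self_scores = np.diag(sim)
    cross = sim.copy()
    np.fill_diagonal(cross, -np.inf)  # exclude self-matches from the max
    best_cross = cross.max(axis=1)
    return np.flatnonzero(best_cross > self_scores)

# Hypothetical scores: samples 1 and 2 look swapped.
scores = [[0.9, 0.1, 0.2],
          [0.1, 0.2, 0.8],
          [0.0, 0.7, 0.3]]
print(flag_mislabeled(scores))  # [1 2]
```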

    Gender prediction based on expression of the Y-chromosome specific gene <i>RPS4Y1</i>.

    No full text
    The log2 transformed values of <i>RPS4Y1</i> expression level are clearly separated between male and female samples both in CTRL and patients with COPD (>10 in male samples and <10 in female samples). There were no gender mismatched samples in the CTRL and 5 mismatched samples (2 in females and 3 in males) in the COPD set (error rate of 1.5%).
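The threshold rule in this caption translates directly into a small check. The sketch below takes the cutoff of 10 on log2 expression from the caption; the function names, sample identifiers, and expression values are hypothetical and serve only to show the mechanics.

```python
# Cutoff on log2 RPS4Y1 expression, as stated in the caption
# (>10 in male samples, <10 in female samples).
RPS4Y1_LOG2_THRESHOLD = 10.0

def predict_sex(log2_expr, threshold=RPS4Y1_LOG2_THRESHOLD):
    """Predict sex from log2 RPS4Y1 expression using a fixed cutoff."""
    return "male" if log2_expr > threshold else "female"

def sex_mismatches(samples, threshold=RPS4Y1_LOG2_THRESHOLD):
    """samples: dict mapping sample id -> (annotated sex, log2 expression).
    Returns ids whose annotation disagrees with the prediction."""
    return [sid for sid, (annotated, expr) in samples.items()
            if predict_sex(expr, threshold) != annotated]

# Hypothetical annotations and expression values, for illustration only.
samples = {
    "s1": ("male", 12.3),
    "s2": ("female", 6.1),
    "s3": ("female", 11.8),  # annotated female, expression suggests male
    "s4": ("male", 5.9),     # annotated male, expression suggests female
}
flagged = sex_mismatches(samples)
print(flagged)                      # ['s3', 's4']
print(len(flagged) / len(samples))  # 0.5
```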

    Relationship between metabolites and genes linked to eQTL hot spot 2 on Chromosome V.

    No full text
    (A) De novo biosynthesis of pyrimidine pathway; (B) orotic acid and dihydroorotic acid concentrations are linked to the <i>URA3</i> locus; (C) <i>URA3</i> is predicted as the causal regulator for genes and metabolites linked to the eQTL hot spot. Red nodes are genes or metabolites whose variations are linked to the Chromosome V locus. The shapes of the nodes follow the convention described in <a href="http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1001301#pbio-1001301-g003" target="_blank">Figure 3</a>.

    Overview of the experimental design.

    No full text
    A cross between laboratory (BY) and wild (RM) strains of <i>S. cerevisiae</i> <a href="http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1001301#pbio.1001301-Brem1" target="_blank">[11]</a> was profiled for gene expression. Metabolites were profiled under the same conditions. These data were then integrated with genotype data along with information from public databases to derive a BN. The derived network was used to analyze how cells are regulated.

    Genes and metabolites linked to eQTL hot spot 3 on Chromosome XIII.

    No full text
    (A) Variations of the metabolites isoleucine and threonine are linked to this locus. (B) These two subnetworks comprise genes and metabolites enriched for linking to the Chromosome XIII locus. The larger network consists of both gene expression and metabolite nodes enriched for the GO biological process nitrogen compound metabolism. The smaller network is enriched for the GO biological process de novo IMP biosynthetic process. Red nodes are genes with eQTLs linked to the Chromosome XIII locus. (C) Expression levels of eight genes (in red) are different between <i>VPS9</i> knockout and the wild-type strains. The shapes of the nodes follow the convention described in <a href="http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1001301#pbio-1001301-g003" target="_blank">Figure 3</a>.