41 research outputs found

New Method for Joint Network Analysis Reveals Common and Different Coexpression Patterns among Genes and Proteins in Breast Cancer

Author: Francesca Petralia (1444093)
Pei Wang (102036)
Won-Min Song (178326)
Zhidong Tu (15201)
Publication venue
Publication date
Field of study

We focus on characterizing common and different coexpression patterns among RNAs and proteins in breast cancer tumors. To address this problem, we introduce Joint Random Forest (JRF), a novel nonparametric algorithm that simultaneously estimates multiple coexpression networks by borrowing information across protein and gene expression data. The performance of JRF was evaluated through extensive simulation studies using different network topologies and data distribution functions. Across all simulation settings, JRF outperformed algorithms that estimate class-specific networks separately. JRF also outperformed a competing method based on Gaussian graphical models. We then applied JRF to simultaneously construct gene and protein coexpression networks based on protein and RNAseq data from the CPTAC-TCGA breast cancer study. We identified interesting common and differential coexpression patterns among genes and proteins. This information can help shed light on the potential disease mechanisms of breast cancer.

FigShare

Additional file 4: Table S3. of Inter-tissue coexpression network analysis reveals DPP4 as an important gene in heart to blood communication

Author: Carmen Argmann (256253)
Jun Zhu (84054)
Quan Long (342654)
Sander Houten (3567542)
Siwu Peng (3567545)
Tao Huang (110613)
Yong Zhao (84950)
Zhidong Tu (15201)

Number of significant gene-modules identified for each tissue pair. (PDF 42 kb)

FigShare

Additional file 2: of Inter-tissue coexpression network analysis reveals DPP4 as an important gene in heart to blood communication

Author: Carmen Argmann (256253)
Jun Zhu (84054)
Quan Long (342654)
Sander Houten (3567542)
Siwu Peng (3567545)
Tao Huang (110613)
Yong Zhao (84950)
Zhidong Tu (15201)

Supporting notes. Figure S1. The optimal numbers of principal components (PCs) to correct in each tissue. Figure S2. Histograms of correlation coefficients between sample ischemic time and RINs with gene expression profiles in nine tissues. Red lines are for correlation with RINs, and blue lines are for correlation with sample ischemic time. Solid lines are for empirical gene expression profiles in the study, dashed lines are for permuted data. (DOCX 500 kb)

FigShare

Sample alignment with MODMatcher.

Author: Avrum Spira (5655)
Charles A. Powell (481514)
Eric E. Schadt (50182)
Eunjee Lee (91487)
Joshua D. Campbell (615868)
Jun Zhu (84054)
Mark W. Geraci (615869)
Seungyeul Yoo (615867)
Tao Huang (110613)
Zhidong Tu (15201)

Initial labels of samples are used to determine cis pairs, which are then used to calculate similarity scores. Based on the similarity scores determined with three data types, the molecular data are matched with each other (1) by gender, (2) by cis-eSNPs, (3) by cis-mSNPs, (4) by cis mRNA-methylation pairs, and (5) by all trio mapping. Then, updated sample pairs are used to calculate new cis pairs for another round of alignment. Rounds of alignment are repeated until there are no further changes.
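The loop described in this caption (labels determine cis pairs, cis pairs give similarity scores, scores give updated pairs, repeat until stable) can be illustrated with a minimal Python sketch. This is not the MODMatcher implementation: it assumes only two numeric data types whose features are index-aligned candidate cis pairs, and the function names, the z-score distance used as a similarity score, and the `min_r` cutoff are all hypothetical choices made for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    n = len(x)
    if n < 2:
        return 0.0
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0

def zscore_profiles(data):
    """Standardize each feature across samples; returns {sample_id: [z values]}."""
    ids = list(data)
    out = {s: [] for s in ids}
    for j in range(len(data[ids[0]])):
        col = [data[s][j] for s in ids]
        mean = sum(col) / len(col)
        sd = sqrt(sum((v - mean) ** 2 for v in col) / len(col)) or 1.0
        for s in ids:
            out[s].append((data[s][j] - mean) / sd)
    return out

def select_cis_features(a, b, pairing, min_r):
    """Keep feature indices that correlate across the currently paired samples."""
    pairs = list(pairing.items())
    n_feat = len(next(iter(a.values())))
    keep = []
    for j in range(n_feat):
        r = pearson([a[sa][j] for sa, _ in pairs], [b[sb][j] for _, sb in pairs])
        if r >= min_r:
            keep.append(j)
    return keep

def rematch(za, zb, features):
    """Pair each profile in A with its mutual best match in B."""
    def score(sa, sb):
        # Hypothetical similarity: negative squared distance of z-scored profiles.
        return -sum((za[sa][j] - zb[sb][j]) ** 2 for j in features)
    best_b = {sa: max(zb, key=lambda sb: score(sa, sb)) for sa in za}
    best_a = {sb: max(za, key=lambda sa: score(sa, sb)) for sb in zb}
    return {sa: sb for sa, sb in best_b.items() if best_a[sb] == sa}

def align(a, b, min_r=0.5, max_rounds=20):
    """Alternate cis-feature selection and re-matching until the pairing is stable."""
    pairing = {s: s for s in a if s in b}  # start from the annotated labels
    za, zb = zscore_profiles(a), zscore_profiles(b)
    for _ in range(max_rounds):
        features = select_cis_features(a, b, pairing, min_r)
        if not features:
            break
        updated = rematch(za, zb, features)
        if updated == pairing:
            break
        pairing = updated
    return pairing

# Made-up toy data, not from the study: `meth` copies `expr`,
# except that the profiles labeled s2 and s3 are swapped.
expr = {"s1": [1, 10, 3], "s2": [2, 8, 9], "s3": [3, 7, 1],
        "s4": [4, 5, 7], "s5": [5, 3, 2], "s6": [6, 1, 8]}
meth = dict(expr)
meth["s2"], meth["s3"] = expr["s3"], expr["s2"]
pairing = align(expr, meth)
```

In this toy case the mislabeled pair weakens the cross-data correlation of the third feature, so the first round relies on the other two features; once the swap is corrected, all three features qualify and the next round leaves the pairing unchanged.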

FigShare

MODMatcher: Multi-Omics Data Matcher for Integrative Genomic Analysis

Author: Avrum Spira (5655)
Charles A. Powell (481514)
Eric E. Schadt (50182)
Eunjee Lee (91487)
Joshua D. Campbell (615868)
Jun Zhu (84054)
Mark W. Geraci (615869)
Seungyeul Yoo (615867)
Tao Huang (110613)
Zhidong Tu (15201)
Publication date: 01/08/2014

Errors in sample annotation or labeling often occur in large-scale genetic or genomic studies and are difficult to avoid completely during data generation and management. For integrative genomic studies, it is critical to identify and correct these errors. Different types of genetic and genomic data are inter-connected by cis-regulations. On that basis, we developed a computational approach, Multi-Omics Data Matcher (MODMatcher), to identify and correct sample labeling errors in multiple types of molecular data, which can be used in further integrative analysis. Our results indicate that inspection of sample annotation and labeling error is an indispensable data quality assurance step. Applied to a large lung genomic study, MODMatcher increased statistically significant genetic associations and genomic correlations by more than two-fold. In a simulation study, MODMatcher provided more robust results by using three types of omics data than two types of omics data. We further demonstrate that MODMatcher can be broadly applied to large genomic data sets containing multiple types of omics data, such as The Cancer Genome Atlas (TCGA) data sets.

Directory of Open Access Journals

PubMed Central

FigShare

Examples of sample alignment in the TCGA BRCA data set.

Author: Avrum Spira (5655)
Charles A. Powell (481514)
Eric E. Schadt (50182)
Eunjee Lee (91487)
Joshua D. Campbell (615868)
Jun Zhu (84054)
Mark W. Geraci (615869)
Seungyeul Yoo (615867)
Tao Huang (110613)
Zhidong Tu (15201)

(A) A similarity score distribution of a correctly labeled profile. The red star indicates the similarity score between self-matched profile pairs (gene expression and methylation data profiles are labeled as pertaining to the same sample). (B) Similarity scores of self-matched pairs (red stars) between gene expression and methylation profiles for two samples are lower than the similarity scores of cross-matched pairs (blue stars).

FigShare

Gender prediction based on expression of the Y-chromosome specific gene RPS4Y1.

Author: Avrum Spira (5655)
Charles A. Powell (481514)
Eric E. Schadt (50182)
Eunjee Lee (91487)
Joshua D. Campbell (615868)
Jun Zhu (84054)
Mark W. Geraci (615869)
Seungyeul Yoo (615867)
Tao Huang (110613)
Zhidong Tu (15201)

The log2-transformed RPS4Y1 expression levels are clearly separated between male and female samples in both the CTRL and COPD sets (>10 in male samples and <10 in female samples). There were no gender-mismatched samples in the CTRL set and 5 mismatched samples (2 in females and 3 in males) in the COPD set (error rate of 1.5%).
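The check described in this caption can be sketched in a few lines of Python. This is an illustration only: the cutoff of 10 comes from the caption, while the function names, the tuple layout, and the sample values are hypothetical and are not data from the study.

```python
# Cutoff taken from the caption: log2(RPS4Y1) > 10 in male samples, < 10 in female samples.
LOG2_CUTOFF = 10.0

def predict_gender(log2_rps4y1):
    """Predict gender from a log2-transformed RPS4Y1 expression value."""
    return "male" if log2_rps4y1 > LOG2_CUTOFF else "female"

def find_mismatches(samples):
    """Return IDs of samples whose annotated gender disagrees with the prediction.

    `samples` is an iterable of (sample_id, annotated_gender, log2_rps4y1) tuples.
    """
    return [sid for sid, annotated, value in samples
            if predict_gender(value) != annotated]

# Made-up illustrative values, not data from the study.
example = [
    ("S1", "male", 12.3),
    ("S2", "female", 6.1),
    ("S3", "female", 11.8),  # annotated female, but RPS4Y1 is expressed at a male-like level
]
mismatched = find_mismatches(example)
error_rate = len(mismatched) / len(example)
```

A flagged sample indicates only that the annotation and the expression profile disagree; deciding which of the two is wrong requires the additional matching steps described in the other records.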

FigShare

Relationship between metabolites and genes linked to eQTL hot spot 2 on Chromosome V.

Author: Eric E. Schadt (50182)
Ethan Y. Xu (173710)
Heather Vu (173715)
Jun Zhu (84054)
Kenneth M. Dombek (173704)
Pavel Sova (13422)
Qiuwei Xu (173697)
Rachel B. Brem (114263)
Roger E. Bumgarner (173719)
Zhidong Tu (15201)

(A) De novo pyrimidine biosynthesis pathway; (B) orotic acid and dihydroorotic acid concentrations are linked to the URA3 locus; (C) URA3 is predicted as the causal regulator for genes and metabolites linked to the eQTL hot spot. Red nodes are genes or metabolites whose variations are linked to the Chromosome V locus. The shapes of the nodes follow the convention described in Figure 3 (http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1001301#pbio-1001301-g003).

FigShare

Overview of the experimental design.

Author: Eric E. Schadt (50182)
Ethan Y. Xu (173710)
Heather Vu (173715)
Jun Zhu (84054)
Kenneth M. Dombek (173704)
Pavel Sova (13422)
Qiuwei Xu (173697)
Rachel B. Brem (114263)
Roger E. Bumgarner (173719)
Zhidong Tu (15201)

A cross between laboratory (BY) and wild (RM) strains of S. cerevisiae [11] was profiled for gene expression. Metabolites were profiled under the same conditions. These data were then integrated with genotype data and information from public databases to derive a Bayesian network (BN). The derived network was used to analyze how cells are regulated.

FigShare

Genes and metabolites linked to eQTL hot spot 3 on Chromosome XIII.

Author: Eric E. Schadt (50182)
Ethan Y. Xu (173710)
Heather Vu (173715)
Jun Zhu (84054)
Kenneth M. Dombek (173704)
Pavel Sova (13422)
Qiuwei Xu (173697)
Rachel B. Brem (114263)
Roger E. Bumgarner (173719)
Zhidong Tu (15201)

(A) Variations of the metabolites isoleucine and threonine are linked to this locus. (B) These two subnetworks comprise genes and metabolites enriched for linking to the Chromosome XIII locus. The larger network consists of both gene expression and metabolite nodes enriched for the GO biological process nitrogen compound metabolism. The smaller network is enriched for the GO biological process de novo IMP biosynthetic process. Red nodes are genes with eQTLs linked to the Chromosome XIII locus. (C) Expression levels of eight genes (in red) are different between VPS9 knockout and the wild-type strains. The shapes of the nodes follow the convention described in Figure 3 (http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1001301#pbio-1001301-g003).

FigShare