Additional file 2 of Alternative empirical Bayes models for adjusting for batch effects in genomic studies

Abstract

Means and variances of gene expression distributions estimated from the EGFR signature dataset and the TCGA breast cancer patient dataset. For TCGA, we used the patients' proteomics data and binned EGFR protein expression into 6 gradually increasing levels, partitioning all patients into 6 equal-sized groups. Means and variances are estimated within each group. Both the up- and down-regulated genes are EGFR signature genes derived by ASSIGN. The design and parameters of our simulation studies resemble the real estimates in these tables. Batch 1 represents the EGFR signature dataset, with small gene variances and a clear separation between the two condition groups in the expression of up-regulated genes. Batch 2 resembles the TCGA patient data, with much larger variances than Batch 1. (XLSX 10 kb)
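The binning step described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the synthetic data, the use of the sample variance (n - 1 denominator), and the handling of patient counts not divisible by 6 are all assumptions made for illustration.

```python
import random
import statistics


def bin_into_equal_groups(protein_levels, n_groups=6):
    """Split patient indices into n_groups bins of (near-)equal size,
    ordered by increasing protein level. Hypothetical helper."""
    order = sorted(range(len(protein_levels)), key=lambda i: protein_levels[i])
    size, remainder = divmod(len(order), n_groups)
    groups, start = [], 0
    for g in range(n_groups):
        # Assumption: any leftover patients go to the lowest bins, one each.
        end = start + size + (1 if g < remainder else 0)
        groups.append(order[start:end])
        start = end
    return groups


def group_mean_variance(expression, groups):
    """Return (mean, sample variance) of one gene's expression per group."""
    stats = []
    for idx in groups:
        values = [expression[i] for i in idx]
        stats.append((statistics.mean(values), statistics.variance(values)))
    return stats


# Synthetic stand-in data (not from TCGA): 60 patients, one gene.
rng = random.Random(0)
protein = [rng.gauss(0.0, 1.0) for _ in range(60)]
gene = [2.0 * p + rng.gauss(0.0, 0.5) for p in protein]

groups = bin_into_equal_groups(protein, n_groups=6)
for level, (mean, var) in enumerate(group_mean_variance(gene, groups), start=1):
    print(f"level {level}: mean={mean:.3f}, variance={var:.3f}")
```

Sorting once and slicing keeps the groups equal-sized regardless of how the protein values are distributed, which is what distinguishes this from binning on fixed value thresholds.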
