Mean and variance of gene expression distributions estimated from the EGFR signature and the TCGA breast cancer patient datasets. In TCGA, we used proteomics data of the patients, and binned the EGFR protein expression into 6 gradually increasing levels, partitioning all patients into 6 equal-sized groups. Mean and variances are estimated within each group. Up- and down-regulated genes are both EGFR signature genes derived by ASSIGN. The design and parameters for our simulation studies resemble the real estimates in these tables. Batch 1 represents the EGFR signature dataset with small gene variances, and a clear separation between the two condition groups in the expression of up-regulated genes. Batch 2 resembles the TCGA patient data with much larger variances than Batch 1. (XLSX 10 kb