Power analysis for RNA sequencing and mass spectrometry-based proteomics data
RNA sequencing (RNA-seq) and mass spectrometry (MS) technologies have facilitated the discovery of differential expression in transcriptome and proteome studies. However, determining the sample size needed to achieve adequate statistical power has been a major challenge in experimental design. The objective of this study is to develop a power analysis tool applicable to both RNA-seq and MS-based proteomics data. The proposed methods support both prospective and retrospective power analyses. In terms of performance, the benchmarking results indicated that the proposed methods can give distinct power estimates for both differentially and equivalently expressed genes or proteins without prior differential expression analysis and without other assumptions, such as the expected fraction of differentially expressed features, minimal fold changes, or expected mean expressions. Using the proposed methods, researchers can not only evaluate the reliability of their acquired significant results but also estimate the sample size sufficient for a desired power. The proposed methods were implemented as an R package, which can be freely accessed from the Bioconductor project at http://bioconductor.org/packages/PowerExplorer/
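To illustrate the two questions a power analysis answers (how much power a given design has, and how many samples a desired power requires), the following is a minimal textbook sketch in Python. It is not the PowerExplorer method: it assumes a two-sided two-sample z-test on log2 expression with a known standard deviation and a user-supplied effect size, which is exactly the kind of assumption the abstract says the proposed methods avoid. The function names `two_sample_power` and `min_sample_size` and all parameter names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def two_sample_power(log2_fold_change, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test.

    Assumptions (illustrative only): equal group sizes, a common known
    standard deviation `sd` on the log2 scale, and normally distributed
    group means.
    """
    standard_error = sd * sqrt(2.0 / n_per_group)
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(log2_fold_change) / standard_error
    # Probability that the test statistic falls in either rejection region.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def min_sample_size(log2_fold_change, sd, target_power=0.8, alpha=0.05,
                    max_n=10_000):
    """Smallest per-group sample size reaching `target_power`, or None."""
    for n in range(2, max_n + 1):
        if two_sample_power(log2_fold_change, sd, n, alpha) >= target_power:
            return n
    return None


if __name__ == "__main__":
    # Power of an existing design with 5 samples per group.
    print(two_sample_power(log2_fold_change=1.0, sd=1.0, n_per_group=5))
    # Per-group sample size needed for 80% power at the same effect size.
    print(min_sample_size(log2_fold_change=1.0, sd=1.0))
```

In this sketch the first call corresponds to evaluating a design that has already been run, and the second to planning a new one. Both depend entirely on the assumed effect size and variance.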
Conserved temporal ordering of promoter activation implicates common mechanisms governing the immediate early response across cell types and stimuli
The promoters of immediate early genes (IEGs) are rapidly activated in
response to an external stimulus. These genes, also known as primary
response genes, have been identified in a range of cell types, under diverse
extracellular signals and using varying experimental protocols. Genomic
dissection on a case-by-case basis has not resulted in a comprehensive
catalogue of IEGs. I completed a rigorous meta-analysis of eight genome-wide
FANTOM5 CAGE (cap analysis of gene expression) time-course datasets, and
it revealed successive waves of promoter activation in IEGs, recapitulating
known relationships between cell types and stimuli. I found a set of 57 (42
protein-coding) candidate IEGs possessing promoters that consistently drive
a rapid but transient increase in expression following external stimulation.
These genes show significant enrichment for known IEGs reported previously,
pathways associated with the immediate early response, and include a number
of non-coding RNAs with roles in proliferation and differentiation. There was
strong conservation of the ordering of activation for these genes, such that 77
pairwise promoter activation orderings were conserved. Leveraging
comprehensive CAGE time series data across cell types, I also observed
extensive alternative promoter usage by such genes, which is likely to have
hindered their discovery in previous, smaller-scale studies. The common activation
ordering of the core set of early-responding genes I identified may indicate
conserved underlying regulatory mechanisms. By contrast, the considerably
larger number of transiently activated genes that are specific to each cell type
and stimulus illustrates the breadth of the primary response.
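The idea of a conserved pairwise activation ordering can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure used in the study: the abstract does not give the criterion, so the sketch assumes each promoter has a single peak activation time per dataset and calls an ordering conserved when one promoter peaks strictly before the other in every dataset where both were measured. The function name `conserved_orderings`, the dataset labels, the promoter labels, and the times are all hypothetical placeholders.

```python
from itertools import combinations


def conserved_orderings(peak_times):
    """Return (earlier, later) promoter pairs whose order never changes.

    `peak_times` maps a dataset label to a dict of promoter -> peak time.
    A pair is reported only if both promoters are observed together in at
    least one dataset and the same one peaks strictly first every time.
    """
    promoters = sorted({p for times in peak_times.values() for p in times})
    conserved = []
    for a, b in combinations(promoters, 2):
        signs = set()
        for times in peak_times.values():
            if a in times and b in times:
                diff = times[a] - times[b]
                # -1 if a peaks first, +1 if b peaks first, 0 for a tie.
                signs.add((diff > 0) - (diff < 0))
        if signs == {-1}:
            conserved.append((a, b))
        elif signs == {1}:
            conserved.append((b, a))
    return conserved


if __name__ == "__main__":
    # Hypothetical peak times in minutes for three placeholder promoters.
    toy = {
        "dataset_1": {"p1": 15, "p2": 30, "p3": 60},
        "dataset_2": {"p1": 20, "p2": 45, "p3": 30},
    }
    print(conserved_orderings(toy))
```

In the toy input, `p1` peaks before both other promoters in both datasets, while `p2` and `p3` swap order, so only the two pairs involving `p1` count as conserved under this criterion.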