    Power analysis for RNA sequencing and mass spectrometry-based proteomics data

    RNA sequencing (RNA-seq) and mass spectrometry (MS) technologies have facilitated the discovery of differential expression in transcriptome and proteome studies. However, determining the sample size needed to achieve adequate statistical power has been a major challenge in experimental design. The objective of this study is to develop a power analysis tool applicable to both RNA-seq and MS-based proteomics data. The proposed methods support both prospective and retrospective power analyses. In benchmarking, the proposed methods gave distinct power estimates for differentially and equivalently expressed genes or proteins without requiring prior differential expression analysis or other assumptions, such as the expected fraction of differentially expressed features, minimal fold changes, or expected mean expression levels. Using the proposed methods, researchers can both evaluate the reliability of their significant results and estimate the sample size sufficient for a desired power. The methods were implemented as an R package, which is freely available from the Bioconductor project at http://bioconductor.org/packages/PowerExplorer/

    Conserved temporal ordering of promoter activation implicates common mechanisms governing the immediate early response across cell types and stimuli

    The promoters of immediate early genes (IEGs) are rapidly activated in response to an external stimulus. These genes, also known as primary response genes, have been identified in a range of cell types, under diverse extracellular signals and using varying experimental protocols. Genomic dissection on a case-by-case basis has not resulted in a comprehensive catalogue of IEGs. I completed a rigorous meta-analysis of eight genome-wide FANTOM5 CAGE (cap analysis of gene expression) time-course datasets, which revealed successive waves of promoter activation in IEGs, recapitulating known relationships between cell types and stimuli. I found a set of 57 (42 protein-coding) candidate IEGs possessing promoters that consistently drive a rapid but transient increase in expression following external stimulation. These genes are significantly enriched for previously reported IEGs and for pathways associated with the immediate early response, and they include a number of non-coding RNAs with roles in proliferation and differentiation. The ordering of activation for these genes was strongly conserved, with 77 pairwise promoter activation orderings conserved. Leveraging comprehensive CAGE time series data across cell types, I also observed extensive alternative promoter usage by such genes, which is likely to have hindered their discovery in previous, smaller-scale studies. The common activation ordering of the core set of early-responding genes I identified may indicate conserved underlying regulatory mechanisms. By contrast, the considerably larger number of transiently activated genes that are specific to each cell type and stimulus illustrates the breadth of the primary response.