Sample size calculation while controlling false discovery rate for differential expression analysis with RNA-sequencing experiments
This Excel file contains a comparison of the resulting sample size and power between the method of Li et al. [18] and our proposed method for simulation 1, with parameter settings from Table 1 in [18]. The results are obtained under m=200; for each parameter setting, Li et al.'s result is in the first row and ours is in the second row. (XLS 49.2 kb)
Dedicated transcriptomics combined with power analysis lead to functional understanding of genes with weak phenotypic changes in knockout lines
Author summary: Knockout mice benefit the understanding of gene functions in mammals. However, for many genes it has proven difficult to identify clear phenotypes, owing to a lack of sufficient assays. As Lewis Wolpert put it in a famous quote, “But did you take them to the opera?”, metaphorically alluding to the need to extend phenotyping efforts. This insight led to the establishment of phenotyping pipelines that are nowadays routinely used to characterize knockout lines. However, transcriptomic approaches based on RNA-Seq have been much less explored for such deep-level studies. Here we conducted both a theoretical power analysis and practical RNA-Seq experiments on two knockout lines with small phenotypic effects to investigate parameters including sample size, sequencing depth, fold change, and dispersion. Our dedicated RNA-Seq studies discovered thousands of genes with small transcriptional changes that are enriched in specific functions in both knockout lines. We find that it is more important to increase the number of samples than to increase the sequencing depth. Our work shows that a deep RNA-Seq study on knockouts is powerful for understanding gene functions in cases of weak phenotypic effects, and provides a guideline for the experimental design of such studies.
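The trade-off between sample number and sequencing depth can be illustrated with a rough normal-approximation power calculation. The sketch below is not the authors' analysis. It assumes a negative binomial model in which the standard error of a per-gene log fold change is approximately sqrt((2/n)(1/mu + phi)), where n is the number of samples per group, mu is the mean read count (used here as a stand-in for sequencing depth) and phi is the dispersion. The function name `approx_power` and all parameter values are hypothetical, and alpha is a per-gene level with no multiple-testing adjustment.

```python
import math
from statistics import NormalDist


def approx_power(n, mean_count, dispersion, fold_change, alpha=0.05):
    """Approximate power of a two-sided Wald test on the log fold change.

    Assumes a balanced two-group design with n samples per group and the
    same mean count in both groups when computing the standard error.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    # Approximate standard error of the log fold change estimate.
    se = math.sqrt(2.0 / n * (1.0 / mean_count + dispersion))
    effect = abs(math.log(fold_change)) / se
    # The tiny chance of rejecting in the wrong direction is ignored.
    return NormalDist().cdf(effect - z_alpha)


# Hypothetical small-effect gene: 25% change, moderate dispersion.
base = approx_power(n=5, mean_count=50, dispersion=0.1, fold_change=1.25)
more_depth = approx_power(n=5, mean_count=100, dispersion=0.1, fold_change=1.25)
more_samples = approx_power(n=10, mean_count=50, dispersion=0.1, fold_change=1.25)

print(f"baseline:        {base:.3f}")
print(f"double depth:    {more_depth:.3f}")
print(f"double samples:  {more_samples:.3f}")
```

Under these assumptions, doubling the depth only shrinks the 1/mu term, while doubling the samples shrinks the whole variance, including the dispersion term, so the gain from extra samples is larger whenever the dispersion is non-zero.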
Sample size calculations and normalization methods for RNA-seq data.
High-throughput RNA sequencing (RNA-seq) has become the preferred choice for transcriptomics and gene expression studies. With the rapid growth of RNA-seq applications, sample size calculation methods for RNA-seq experiment design and data normalization methods for differentially expressed gene (DEG) analysis are important issues to be explored and discussed. The underlying theme of this dissertation is the development of novel sample size calculation methods for RNA-seq experiment design using test statistics. I also propose two novel normalization methods for the analysis of RNA-seq data. In chapter 1, I present test statistics, including Wald’s test, the log-transformed Wald’s test and the likelihood ratio test, for RNA-seq data with a negative binomial distribution. Following the test statistics, I present five sample size calculation methods based on a one-sided test. I compare my five methods with an existing method by calculating sample sizes and simulated power in different scenarios. Due to the limitations of these methods, in chapter 2 I derive two further explicit sample size calculation methods based on a generalized linear model with a negative binomial distribution for RNA-seq data. These two methods, based on a two-sided Wald’s test, are presented under a wide range of settings, including imbalanced designs and unequal read depth, which makes them applicable in many situations. In chapter 3, I review the existing normalization methods and describe the challenge of choosing an optimal one, given the multiple factors contributing to read count variability that affect overall sensitivity and specificity. I then present two proposed normalization methods and evaluate the performance of the commonly used methods (DESeq, TMM-edgeR, FPKM-CuffDiff, TC, Med, UQ and FQ) alongside the two new methods I propose: Med-pgQ2 and UQ-pgQ2.
The results from MAQC2 data show that my proposed Med-pgQ2 and UQ-pgQ2 methods may be better choices for differential gene analysis of RNA-seq data, improving specificity while maintaining good detection power at a given nominal FDR level. Finally, in chapter 4, I focus on the analysis of RNA-seq data using three normalization methods and two test statistics with the aid of the DESeq2 and edgeR packages. Through within-group analysis of these real RNA-seq data, I found that my normalization method, UQ-pgQ2, performs best, with a lower false positive rate while maintaining good detection power. Thus, in this work I have derived explicit sample size calculation methods, which are useful tools for researchers to quickly estimate sample sizes in experiment design. Furthermore, my two normalization methods can improve the performance of differential gene analysis of RNA-seq data by controlling false positives for high read count genes.
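To give a sense of what an explicit sample size calculation based on a two-sided Wald test can look like, the sketch below uses a commonly quoted normal approximation for a negative binomial model. It is not the dissertation's derived method: it assumes a balanced design, equal read depth in both groups, a single gene, and a shared dispersion. The function name `samples_per_group` and the example values are hypothetical.

```python
import math
from statistics import NormalDist


def samples_per_group(mean_control, fold_change, dispersion,
                      alpha=0.05, power=0.8):
    """Approximate samples per group for a two-sided Wald test on the
    log fold change of one gene (balanced design, equal depth assumed)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    mean_treated = mean_control * fold_change
    # Per-sample variance contribution of the log fold change estimate:
    # a Poisson-like part (1/mean) plus the dispersion, for each group.
    unit_variance = 1.0 / mean_control + 1.0 / mean_treated + 2.0 * dispersion
    n = (z_alpha + z_beta) ** 2 * unit_variance / math.log(fold_change) ** 2
    return math.ceil(n)


# Hypothetical gene: 100 reads on average in controls, dispersion 0.1.
for fc in (1.5, 2.0, 3.0):
    print(fc, samples_per_group(mean_control=100, fold_change=fc, dispersion=0.1))
```

In an experiment with many genes, alpha would normally be replaced by a stricter per-gene level chosen to respect the target FDR; that adjustment is omitted here.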
Genetic analyses, protein expression, type and function with clinical and immunological correlates in human papillomavirus associated head and neck neoplasia
Main supervisor:
Dr. Jane Sterling PhD FRCP
Internal examiner:
Dr. Lucy Truman PhD FRCS
External examiner:
Prof. Chris Nutting BSc (Hons) MD PhD FRCP FRCR MedFIPEm
Assessor to Regius Professor of Physic:
Dr. Chris Allen MA MD FRCP
In head and neck squamous cell carcinoma (HNSCC), the evidence that human papillomavirus (HPV) is associated with a subgroup of tumours has increased over the last thirty years. Prospective randomised controlled clinical trials have now established that detection of HPV in oropharyngeal tumours (~60-70% of cases in North America and Europe) may confer a survival advantage to the patient.
Within the oropharynx, HPV16 constitutes ~90-95% of HPV subtypes associated with malignancy. This contrasts with uterine cervix mucosa, where approximately 15 high-risk subtypes cause >99% of disease. Other factors such as differences in genetic background, host immune response, hormonal and environmental influences (e.g. tobacco smoke, alcohol) all play a part in the pathway and susceptibility to oncogenesis.
Analogous to uterine cervix disease, HPV-associated cancers involve wild-type TP53 whilst HPV negative tumours often have mutations in this gene.
As most patients with HPV-associated OPSCC present at an advanced stage, the detection and genetic analysis of a pre-malignant state would be important, as it implies the potential for a screening test (similar to the uterine cervix model). To investigate this, whole transcriptome analysis, with verification of results by reverse transcription–quantitative polymerase chain reaction (PCR), was performed on OPSCC fresh tissue biopsy samples. Predictable fold changes of RNA expression in HPV-associated disease included multiple transcripts within the p53 oncogenic pathway (e.g. CDKN2A / CCND1). In addition, a testis-specific gene not normally expressed in somatic cells, SYCP2, showed a consistently elevated fold change from baseline in pre-malignant and malignant tissue.
A subtle immune defect has long been thought to trigger the susceptibility of some individuals to persistent HPV infection with either low or high risk HPV types. Following on from this, we investigated if clinical outcomes in HPV-related head and neck cancer may be affected by host immune response. Peripheral blood from patients with OPSCC treated by chemoradiotherapy underwent IFN-γ enzyme-linked immunosorbent spot assay (ELISPOT) to examine cell-mediated immune responses to HPV16 E2, E6 and E7. T cell responses against E6 or E7 peptides correlated with HPV DNA/RNA status. Within the HPV16+ OPSCC cohort, enhanced immunoreactivity to antigen E7 was linked to improved survival. In addition, an observed increase in regulatory T cell frequencies after treatment would suggest that immunosuppression may contribute to a reduced HPV specific cell mediated response.
A further aspect of this study was to determine the role of HPV and Epstein-Barr virus (EBV) in the pathogenesis of squamous cell carcinoma within the temporal bone region. This is an uncommon tumour which is normally preceded by a history of inflammation within the external auditory canal (EAC) or middle ear / mastoid cavity. Although HPV has been implicated in many head and neck malignancies, its role in SCC of the temporal bone has not been established. Treatment strategies could change if a viral aetiology can be found. HPV16 DNA was detected in ~20% of the cases studied; however, no significant difference in disease specific survival was noted for the papillomavirus positive group. EBV was not detected.
The data presented highlight the functional and biological influence of high risk HPV infection on HNSCC. Further studies are likely to focus on developing non-invasive screening tools, therapeutic strategies based on vaccination or immune modulation or indeed de-escalation of current treatment protocols.
HPV associated OPSCC immunology analysis
Cancer Research UK (London, United Kingdom) 2012-07 to 2013-12|Award
GRANT_NUMBER: C45051/A14962
Total funding amount
GBP 15,500
H&N cancer HPV analysis
Addenbrooke's Charitable Trust, Cambridge University Hospitals (Cambridge, United Kingdom) 2011-07 to 2013-12|Award
GRANT_NUMBER: KDD/9478
Total funding amount
GBP 15,00