17 research outputs found
Copy-number aware methylation deconvolution analysis of cancers
DNA methylation has long been known to play a role in tumourigenesis. To this day, interpretation of bulk tumour bisulphite sequencing data has been hampered by normal cell contamination and tumour copy number. To address this issue, we develop two computational tools: (1) ASCAT.m, which allows Allele-Specific Copy number Analysis of Tumour methylation data directly from bulk tumour reduced representation bisulphite sequencing (RRBS) data and (2) CAMDAC, a method for Copy Number-Aware Methylation Deconvolution Analysis of Cancer, from bulk tumour and adjacent normal RRBS data. We describe a set of rules to compute allelic imbalance independently of bisulphite conversion and correct normalised read coverage estimates for sequencing biases. We apply ASCAT.m to non-small cell lung cancers from the epiTRACERx study with multi-region bulk tumour RRBS and adjacent normal. ASCAT.m genotypes, allele-specific copy numbers and tumour purity and ploidy estimates are in excellent agreement with those obtained from matched whole-exome and a subset of whole-genome sequencing of the same samples. We observe a correlation between whole-genome doubling and relapse-free survival in lung squamous cell carcinoma but not in adenocarcinoma. We see widespread genomic instability across both histological subtypes. We model bulk tumour methylation rates as a mixture of tumour and normal signals weighted by tumour purity and copy number and formalise this relationship into CAMDAC equations. The errors between predicted and observed methylation rates were low. Normal infiltrates purified from the bulk tumour by fluorescence-activated cell sorting (FACS) were similar in composition to the adjacent matched normal lung, suggesting the latter is a suitable proxy for deconvolution. Single nucleotide variant (SNV)- and FACS-purified tumour methylation rates are in good agreement with CAMDAC deconvoluted estimates.
Purification successfully removes shared normal signal, decreasing correlations between patients and with normal tissue. Samples with shared ancestry remain highly correlated. Purified methylation rates yield accurate tumour-normal and tumour-tumour differential methylation calls independent of tumour purity and copy number. We find hundreds of ubiquitously early clonal gene promoter epimutations across the epiTRACERx cohort, showcasing the potential of DNA methylation markers for early cancer detection. CAMDAC purified profiles reveal both phylogenetic and inter-tumour relationships and provide insight into tumour evolutionary history. Quantifying allele-specific methylation on chromosome X in females, we uncover extraction biases against the Barr body. X inactivation is random at the scale of our normal lung cancer samples. Phasing of methylation rates with polymorphisms confirms the presence of allele-specific methylation at the H19/IGF2 locus. Loss of imprinting is observed in 5 tumours, all involving demethylation of the maternal allele. We attempt to quantify the ratio of clonal allele-specific to bi-allelic epimutations in tumours in regions of 1+1, which we define as regulatory and stochastic methylation changes, respectively. Utilising this ratio, we try to extract the number of stochastic epimutations in regions of 2+0 with copy numbers 1 and 2 and time those copy number gains. We find that SNVs at gene promoters often lead to hypermethylation of neighbouring CpGs on the same copy or allele, suggesting the ablation of a transcription factor binding site. Non-expressed neo-antigens are enriched for promoter hypermethylation, indicating methylation plays a role in immune escape. To conclude, CAMDAC purified methylation rates are key to unlocking insights into comparative cancer epigenomics and intra-tumour epigenetic heterogeneity
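The mixture idea described in the abstract above can be illustrated with a minimal sketch. It assumes a single CpG, a bulk methylation rate that is a DNA-content-weighted average of tumour and normal rates, and a diploid normal by default. The function names, parameter names and the clamping step are hypothetical choices made for illustration; this is not the CAMDAC implementation, and the published equations should be consulted for the exact formulation.

```python
def bulk_methylation_rate(m_t, m_n, purity, cn_tumour, cn_normal=2):
    """Expected bulk methylation rate at one CpG.

    Assumption: each compartment contributes in proportion to its cell
    fraction times its total copy number at the locus.
    """
    tumour_weight = purity * cn_tumour
    normal_weight = (1.0 - purity) * cn_normal
    total = tumour_weight + normal_weight
    return (tumour_weight * m_t + normal_weight * m_n) / total


def purified_tumour_rate(m_b, m_n, purity, cn_tumour, cn_normal=2):
    """Invert the mixture above to estimate the tumour methylation rate."""
    tumour_weight = purity * cn_tumour
    normal_weight = (1.0 - purity) * cn_normal
    if tumour_weight <= 0:
        raise ValueError("no tumour DNA at this locus; rate is not identifiable")
    m_t = (m_b * (tumour_weight + normal_weight) - normal_weight * m_n) / tumour_weight
    # Noise in m_b or m_n can push the estimate outside [0, 1];
    # clamping is an illustrative choice, not part of the source.
    return min(1.0, max(0.0, m_t))


if __name__ == "__main__":
    bulk = bulk_methylation_rate(m_t=0.8, m_n=0.2, purity=0.5, cn_tumour=3)
    print(bulk, purified_tumour_rate(bulk, m_n=0.2, purity=0.5, cn_tumour=3))
```

With a tumour rate of 0.8, a normal rate of 0.2, purity 0.5 and tumour copy number 3, the sketch gives a bulk rate of about 0.56, and inverting it recovers a tumour rate of about 0.8. The example shows why the same bulk rate can correspond to different tumour rates when purity or copy number differ.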
The Personal Genome Project-UK, an open access resource of human multi-omics data
Integrative analysis of multi-omics data is a powerful approach for gaining functional insights into biological and medical processes. Conducting these multifaceted analyses on human samples is often complicated by the fact that the raw sequencing output is rarely available under open access. The Personal Genome Project UK (PGP-UK) is one of the few resources that recruit participants under open consent and make the resulting multi-omics data freely and openly available. As part of this resource, we describe the PGP-UK multi-omics reference panel consisting of ten genomic, methylomic and transcriptomic data. Specifically, we outline the data processing, quality control and validation procedures which were implemented to ensure data integrity and exclude sample mix-ups. In addition, we provide a REST API to facilitate the download of the entire PGP-UK dataset. The data are also available from two cloud-based environments, providing platforms for free integrated analysis. In conclusion, the genotype-validated PGP-UK multi-omics human reference panel described here provides a valuable new open access resource for integrated analyses in support of personal and medical genomics
Genomic-Transcriptomic Evolution in Lung Cancer and Metastasis
Intratumour heterogeneity (ITH) fuels lung cancer evolution, which leads to immune evasion and resistance to therapy. Here, using paired whole-exome and RNA sequencing data, we investigate intratumour transcriptomic diversity in 354 non-small cell lung cancer tumours from 347 out of the first 421 patients prospectively recruited into the TRACERx study. Analyses of 947 tumour regions, representing both primary and metastatic disease, alongside 96 tumour-adjacent normal tissue samples implicate the transcriptome as a major source of phenotypic variation. Gene expression levels and ITH relate to patterns of positive and negative selection during tumour evolution. We observe frequent copy number-independent allele-specific expression that is linked to epigenomic dysfunction. Allele-specific expression can also result in genomic-transcriptomic parallel evolution, which converges on cancer gene disruption. We extract signatures of RNA single-base substitutions and link their aetiology to the activity of the RNA-editing enzymes ADAR and APOBEC3A, thereby revealing otherwise undetected ongoing APOBEC activity in tumours. Characterizing the transcriptomes of primary-metastatic tumour pairs, we combine multiple machine-learning approaches that leverage genomic and transcriptomic variables to link metastasis-seeding potential to the evolutionary context of mutations and increased proliferation within primary tumour regions. These results highlight the interplay between the genome and transcriptome in influencing ITH, lung cancer evolution and metastasis
Evolutionary Characterization of Lung Adenocarcinoma Morphology in TRACERx
Lung adenocarcinomas (LUADs) display a broad histological spectrum from low-grade lepidic tumors through to mid-grade acinar and papillary and high-grade solid, cribriform and micropapillary tumors. How morphology reflects tumor evolution and disease progression is poorly understood. Whole-exome sequencing data generated from 805 primary tumor regions and 121 paired metastatic samples across 248 LUADs from the TRACERx 421 cohort, together with RNA-sequencing data from 463 primary tumor regions, were integrated with detailed whole-tumor and regional histopathological analysis. Tumors with predominantly high-grade patterns showed increased chromosomal complexity, with higher burden of loss of heterozygosity and subclonal somatic copy number alterations. Individual regions in predominantly high-grade pattern tumors exhibited higher proliferation and lower clonal diversity, potentially reflecting large recent subclonal expansions. Co-occurrence of truncal loss of chromosomes 3p and 3q was enriched in predominantly low-/mid-grade tumors, while purely undifferentiated solid-pattern tumors had a higher frequency of truncal arm or focal 3q gains and SMARCA4 gene alterations compared with mixed-pattern tumors with a solid component, suggesting distinct evolutionary trajectories. Clonal evolution analysis revealed that tumors tend to evolve toward higher-grade patterns. The presence of micropapillary pattern and 'tumor spread through air spaces' were associated with intrathoracic recurrence, in contrast to the presence of solid/cribriform patterns, necrosis and preoperative circulating tumor DNA detection, which were associated with extra-thoracic recurrence. These data provide insights into the relationship between LUAD morphology, the underlying evolutionary genomic landscape, and clinical and anatomical relapse risk
The evolution of lung cancer and impact of subclonal selection in TRACERx
Lung cancer is the leading cause of cancer-associated mortality worldwide. Here we analysed 1,644 tumour regions sampled at surgery or during follow-up from the first 421 patients with non-small cell lung cancer prospectively enrolled into the TRACERx study. This project aims to decipher lung cancer evolution and address the primary study endpoint: determining the relationship between intratumour heterogeneity and clinical outcome. In lung adenocarcinoma, mutations in 22 out of 40 common cancer genes were under significant subclonal selection, including classical tumour initiators such as TP53 and KRAS. We defined evolutionary dependencies between drivers, mutational processes and whole genome doubling (WGD) events. Despite patients having a history of smoking, 8% of lung adenocarcinomas lacked evidence of tobacco-induced mutagenesis. These tumours also had similar detection rates for EGFR mutations and for RET, ROS1, ALK and MET oncogenic isoforms compared with tumours in never-smokers, which suggests that they have a similar aetiology and pathogenesis. Large subclonal expansions were associated with positive subclonal selection. Patients with tumours harbouring recent subclonal expansions, on the terminus of a phylogenetic branch, had significantly shorter disease-free survival. Subclonal WGD was detected in 19% of tumours, and 10% of tumours harboured multiple subclonal WGDs in parallel. Subclonal, but not truncal, WGD was associated with shorter disease-free survival. Copy number heterogeneity was associated with extrathoracic relapse within 1 year after surgery. These data demonstrate the importance of clonal expansion, WGD and copy number instability in determining the timing and patterns of relapse in non-small cell lung cancer and provide a comprehensive clinical cancer evolutionary data resource
The evolution of non-small cell lung cancer metastases in TRACERx
Metastatic disease is responsible for the majority of cancer-related deaths. We report the longitudinal evolutionary analysis of 126 non-small cell lung cancer (NSCLC) tumours from 421 prospectively recruited patients in TRACERx who developed metastatic disease, compared with a control cohort of 144 non-metastatic tumours. In 25% of cases, metastases diverged early, before the last clonal sweep in the primary tumour, and early divergence was enriched for patients who were smokers at the time of initial diagnosis. Simulations suggested that early metastatic divergence more frequently occurred at smaller tumour diameters (less than 8 mm). Single-region primary tumour sampling resulted in 83% of late divergence cases being misclassified as early, highlighting the importance of extensive primary tumour sampling. Polyclonal dissemination, which was associated with extrathoracic disease recurrence, was found in 32% of cases. Primary lymph node disease contributed to metastatic relapse in less than 20% of cases, representing a hallmark of metastatic potential rather than a route to subsequent recurrences/disease progression. Metastasis-seeding subclones exhibited subclonal expansions within primary tumours, probably reflecting positive selection. Our findings highlight the importance of selection in metastatic clone evolution within untreated primary tumours, the distinction between monoclonal versus polyclonal seeding in dictating site of recurrence, the limitations of current radiological screening approaches for early diverging tumours and the need to develop strategies to target metastasis-seeding subclones before relapse
Genomic–transcriptomic evolution in lung cancer and metastasis
Intratumour heterogeneity (ITH) fuels lung cancer evolution, which leads to immune evasion and resistance to therapy. Here, using paired whole-exome and RNA sequencing data, we investigate intratumour transcriptomic diversity in 354 non-small cell lung cancer tumours from 347 out of the first 421 patients prospectively recruited into the TRACERx study. Analyses of 947 tumour regions, representing both primary and metastatic disease, alongside 96 tumour-adjacent normal tissue samples implicate the transcriptome as a major source of phenotypic variation. Gene expression levels and ITH relate to patterns of positive and negative selection during tumour evolution. We observe frequent copy number-independent allele-specific expression that is linked to epigenomic dysfunction. Allele-specific expression can also result in genomic–transcriptomic parallel evolution, which converges on cancer gene disruption. We extract signatures of RNA single-base substitutions and link their aetiology to the activity of the RNA-editing enzymes ADAR and APOBEC3A, thereby revealing otherwise undetected ongoing APOBEC activity in tumours. Characterizing the transcriptomes of primary–metastatic tumour pairs, we combine multiple machine-learning approaches that leverage genomic and transcriptomic variables to link metastasis-seeding potential to the evolutionary context of mutations and increased proliferation within primary tumour regions. These results highlight the interplay between the genome and transcriptome in influencing ITH, lung cancer evolution and metastasis
Antibodies against endogenous retroviruses promote lung cancer immunotherapy
B cells are frequently found in the margins of solid tumours as organized follicles in ectopic lymphoid organs called tertiary lymphoid structures (TLS). Although TLS have been found to correlate with improved patient survival and response to immune checkpoint blockade (ICB), the underlying mechanisms of this association remain elusive. Here we investigate lung-resident B cell responses in patients from the TRACERx 421 (Tracking Non-Small-Cell Lung Cancer Evolution Through Therapy) and other lung cancer cohorts, and in a recently established immunogenic mouse model for lung adenocarcinoma. We find that both human and mouse lung adenocarcinomas elicit local germinal centre responses and tumour-binding antibodies, and further identify endogenous retrovirus (ERV) envelope glycoproteins as a dominant anti-tumour antibody target. ERV-targeting B cell responses are amplified by ICB in both humans and mice, and by targeted inhibition of KRAS(G12C) in the mouse model. ERV-reactive antibodies exert anti-tumour activity that extends survival in the mouse model, and ERV expression predicts the outcome of ICB in human lung adenocarcinoma. Finally, we find that effective immunotherapy in the mouse model requires CXCL13-dependent TLS formation. Conversely, therapeutic CXCL13 treatment potentiates anti-tumour immunity and synergizes with ICB. Our findings provide a possible mechanistic basis for the association of TLS with immunotherapy response
Neoantigen-directed immune escape in lung cancer evolution
The interplay between an evolving cancer and a dynamic immune microenvironment remains unclear. Here we analyse 258 regions from 88 early-stage, untreated non-small-cell lung cancers using RNA sequencing and histopathology-assessed tumour-infiltrating lymphocyte estimates. Immune infiltration varied both between and within tumours, with different mechanisms of neoantigen presentation dysfunction enriched in distinct immune microenvironments. Sparsely infiltrated tumours exhibited a waning of neoantigen editing during tumour evolution, indicative of historical immune editing, or copy-number loss of previously clonal neoantigens. Immune-infiltrated tumour regions exhibited ongoing immunoediting, with either loss of heterozygosity in human leukocyte antigens or depletion of expressed neoantigens. We identified promoter hypermethylation of genes that contain neoantigenic mutations as an epigenetic mechanism of immunoediting. Our results suggest that the immune microenvironment exerts a strong selection pressure in early-stage, untreated non-small-cell lung cancers that produces multiple routes to immune evasion, which are clinically relevant and forecast poor disease-free survival