53 research outputs found
Integrative modeling identifies genetic ancestry-associated molecular correlates in human cancer
Cellular and molecular aberrations contribute to the disparity of human cancer incidence and etiology between ancestry groups. Multiomics profiling in The Cancer Genome Atlas (TCGA) allows for querying of the molecular underpinnings of ancestry-specific discrepancies in human cancer. Here, we provide a protocol for integrative associative analysis of ancestry with molecular correlates, including somatic mutations, DNA methylation, mRNA transcription, miRNA transcription, and pathway activity, using TCGA data. This protocol can be generalized to analyze other cancer cohorts and human diseases. For complete details on the use and execution of this protocol, please refer to Carrot-Zhang et al. (2020)
Before and After: Comparison of Legacy and Harmonized TCGA Genomic Data Commons’ Data
We present a systematic analysis of the effects of synchronizing a large-scale, deeply characterized, multi-omic dataset to the current human reference genome, using updated software, pipelines, and annotations. For each of 5 molecular data platforms in The Cancer Genome Atlas (TCGA)—mRNA and miRNA expression, single nucleotide variants, DNA methylation and copy number alterations—comprehensive sample, gene, and probe-level studies were performed, towards quantifying the degree of similarity between the ‘legacy’ GRCh37 (hg19) TCGA data and its GRCh38 (hg38) version as ‘harmonized’ by the Genomic Data Commons. We offer gene lists to elucidate differences that remained after controlling for confounders, and strategies to mitigate their impact on biological interpretation. Our results demonstrate that the hg19 and hg38 TCGA datasets are very highly concordant, promote informed use of either legacy or harmonized omics data, and provide a rubric that encourages similar comparisons as new data emerge and reference data evolve. Gao et al. performed a systematic analysis of the effects of synchronizing the large-scale, widely used, multi-omic dataset of The Cancer Genome Atlas to the current human reference genome. For each of the five molecular data platforms assessed, they demonstrated a very high concordance between the ‘legacy’ GRCh37 (hg19) TCGA data and its GRCh38 (hg38) version as ‘harmonized’ by the Genomic Data Commons
An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics
For a decade, The Cancer Genome Atlas (TCGA) program collected clinicopathologic annotation data along with multi-platform molecular profiles of more than 11,000 human tumors across 33 different cancer types. TCGA clinical data contain key features representing the democratized nature of the data collection process. To ensure proper use of this large clinical dataset associated with genomic features, we developed a standardized dataset named the TCGA Pan-Cancer Clinical Data Resource (TCGA-CDR), which includes four major clinical outcome endpoints. In addition to detailing major challenges and statistical limitations encountered during the effort of integrating the acquired clinical data, we present a summary that includes endpoint usage recommendations for each cancer type. These TCGA-CDR findings appear to be consistent with cancer genomics studies independent of the TCGA effort and provide opportunities for investigating cancer biology using clinical correlates at an unprecedented scale. Analysis of clinicopathologic annotations for over 11,000 cancer patients in the TCGA program leads to the generation of TCGA Clinical Data Resource, which provides recommendations of clinical outcome endpoint usage for 33 cancer types
The chromatin accessibility landscape of primary human cancers
We present the genome-wide chromatin accessibility profiles of 410 tumor samples spanning 23 cancer types from The Cancer Genome Atlas (TCGA). We identify 562,709 transposase-accessible DNA elements that substantially extend the compendium of known cis-regulatory elements. Integration of ATAC-seq (the assay for transposase-accessible chromatin using sequencing) with TCGA multi-omic data identifies a large number of putative distal enhancers that distinguish molecular subtypes of cancers, uncovers specific driving transcription factors via protein-DNA footprints, and nominates long-range gene-regulatory interactions in cancer. These data reveal genetic risk loci of cancer predisposition as active DNA regulatory elements in cancer, identify gene-regulatory interactions underlying cancer immune evasion, and pinpoint noncoding mutations that drive enhancer activation and may affect patient survival. These results suggest a systematic approach to understanding the noncoding genome in cancer to advance diagnosis and therapy
Molecular Features of Cancers Exhibiting Exceptional Responses to Treatment
A small fraction of cancer patients with advanced disease survive significantly longer than patients with clinically comparable tumors. Molecular mechanisms for exceptional responses to therapy have been identified by genomic analysis of tumor biopsies from individual patients. Here, we analyzed tumor biopsies from an unbiased cohort of 111 exceptional responder patients using multiple platforms to profile genetic and epigenetic aberrations as well as the tumor microenvironment. Integrative analysis uncovered plausible mechanisms for the therapeutic response in nearly a quarter of the patients. The mechanisms were assigned to four broad categories—DNA damage response, intracellular signaling, immune engagement, and genetic alterations characteristic of favorable prognosis—with many tumors falling into multiple categories. These analyses revealed synthetic lethal relationships that may be exploited therapeutically and rare genetic lesions that favor therapeutic success, while also providing a wealth of testable hypotheses regarding oncogenic mechanisms that may influence the response to cancer therapy. Profiling multi-platform genomics of 110 cancer patients with an exceptional therapeutic response, Wheeler et al. identify putative molecular mechanisms explaining this survival phenotype in ∼23% of cases. Therapeutic success is related to rare molecular features of responding tumors, exploiting synthetic lethality and oncogene addiction
Whole-genome characterization of lung adenocarcinomas lacking the RTK/RAS/RAF pathway
RTK/RAS/RAF pathway alterations (RPAs) are a hallmark of lung adenocarcinoma (LUAD). In this study, we use whole-genome sequencing (WGS) of 85 cases found to be RPA(−) by previous studies from The Cancer Genome Atlas (TCGA) to characterize the minority of LUADs lacking apparent alterations in this pathway. We show that WGS analysis uncovers RPA(+) in 28 (33%) of the 85 samples. Among the remaining 57 cases, we observe focal deletions targeting the promoter or transcription start site of STK11 (n = 7) or KEAP1 (n = 3), and promoter mutations associated with the increased expression of ILF2 (n = 6). We also identify complex structural variations associated with high-level copy number amplifications. Moreover, an enrichment of focal deletions is found in TP53 mutant cases. Our results indicate that RPA(−) cases demonstrate tumor suppressor deletions and genome instability, but lack unique or recurrent genetic lesions compensating for the lack of RPAs. Larger WGS studies of RPA(−) cases are required to understand this important LUAD subset. © 2021 The Authors. Carrot-Zhang et al. perform whole-genome characterization of lung adenocarcinomas (LUADs) lacking RTK/RAS/RAF pathway alterations (RPAs) and identify mutations or structural variants in both coding and non-coding spaces that define a unique entity of RPA(−) LUADs and potentially explain the underlying biology of this disease
Comprehensive Analysis of Genetic Ancestry and Its Molecular Correlates in Cancer
We evaluated ancestry effects on mutation rates, DNA methylation, and mRNA and miRNA expression among 10,678 patients across 33 cancer types from The Cancer Genome Atlas. We demonstrated that cancer subtypes and ancestry-related technical artifacts are important confounders that have been insufficiently accounted for. Once accounted for, ancestry-associated differences spanned all molecular features and hundreds of genes. Biologically significant differences were usually tissue specific but not specific to cancer. However, admixture and pathway analyses suggested some of these differences are causally related to cancer. Specific findings included increased FBXW7 mutations in patients of African origin, decreased VHL and PBRM1 mutations in renal cancer patients of African origin, and decreased immune activity in bladder cancer patients of East Asian origin
The Integrated Genomic Landscape of Thymic Epithelial Tumors
Thymic epithelial tumors (TETs) are one of the rarest adult malignancies. Among TETs, thymoma is the most predominant, characterized by a unique association with autoimmune diseases, followed by thymic carcinoma, which is less common but more clinically aggressive. Using multi-platform omics analyses on 117 TETs, we define four subtypes of these tumors defined by genomic hallmarks and an association with survival and World Health Organization histological subtype. We further demonstrate a marked prevalence of a thymoma-specific mutated oncogene, GTF2I, and explore its biological effects on multi-platform analysis. We further observe enrichment of mutations in HRAS, NRAS, and TP53. Last, we identify a molecular link between thymoma and the autoimmune disease myasthenia gravis, characterized by tumoral overexpression of muscle autoantigens, and increased aneuploidy. Radovich et al. perform multi-platform analyses of thymic epithelial tumors. They identify high prevalence of GTF2I mutations and enrichment of mutations in HRAS, NRAS, and TP53 and link overexpression of muscle autoantigens and increased aneuploidy in thymoma and patients’ risk of having myasthenia gravis
Integrated Molecular Characterization of Testicular Germ Cell Tumors
We studied 137 primary testicular germ cell tumors (TGCTs) using high-dimensional assays of genomic, epigenomic, transcriptomic, and proteomic features. These tumors exhibited high aneuploidy and a paucity of somatic mutations. Somatic mutation of only three genes achieved significance—KIT, KRAS, and NRAS—exclusively in samples with seminoma components. Integrated analyses identified distinct molecular patterns that characterized the major recognized histologic subtypes of TGCT: seminoma, embryonal carcinoma, yolk sac tumor, and teratoma. Striking differences in global DNA methylation and microRNA expression between histology subtypes highlight a likely role of epigenomic processes in determining histologic fates in TGCTs. We also identified a subset of pure seminomas defined by KIT mutations, increased immune infiltration, globally demethylated DNA, and decreased KRAS copy number. We report potential biomarkers for risk stratification, such as miRNA specifically expressed in teratoma, and others with molecular diagnostic potential, such as CpH (CpA/CpC/CpT) methylation identifying embryonal carcinomas. Shen et al. identify molecular characteristics that classify testicular germ cell tumor types, including a separate subset of seminomas defined by KIT mutations. This provides a set of candidate biomarkers for risk stratification and potential therapeutic targeting
Driver Fusions and Their Implications in the Development and Treatment of Human Cancers
Gene fusions represent an important class of somatic alterations in cancer. We systematically investigated fusions in 9,624 tumors across 33 cancer types using multiple fusion calling tools. We identified a total of 25,664 fusions, with a 63% validation rate. Integration of gene expression, copy number, and fusion annotation data revealed that fusions involving oncogenes tend to exhibit increased expression, whereas fusions involving tumor suppressors have the opposite effect. For fusions involving kinases, we found 1,275 with an intact kinase domain, the proportion of which varied significantly across cancer types. Our study suggests that fusions drive the development of 16.5% of cancer cases and function as the sole driver in more than 1% of them. Finally, we identified druggable fusions involving genes such as TMPRSS2, RET, FGFR3, ALK, and ESR1 in 6.0% of cases, and we predicted immunogenic peptides, suggesting that fusions may provide leads for targeted drug and immune therapy