Comparison of TCGA and GENIE genomic datasets for the detection of clinically actionable alterations in breast cancer.
Whole exome sequencing (WES), targeted gene panel sequencing and single nucleotide polymorphism (SNP) arrays are increasingly used for the identification of actionable alterations that are critical to cancer care. Here, we compared The Cancer Genome Atlas (TCGA) and the Genomics Evidence Neoplasia Information Exchange (GENIE) breast cancer genomic datasets (array and next generation sequencing (NGS) data) in detecting genomic alterations in clinically relevant genes. We performed an in silico analysis to determine the concordance in the frequencies of actionable mutations and copy number alterations/aberrations (CNAs) in the two most common breast cancer histologies, invasive lobular and invasive ductal carcinoma. We found that targeted sequencing identified a larger number of mutational hotspots and clinically significant amplifications that would have been missed by WES and SNP arrays in many actionable genes such as PIK3CA, EGFR, AKT3, FGFR1, ERBB2, ERBB3 and ESR1. The striking differences between the number of mutational hotspots and CNAs generated from these platforms highlight a number of factors that should be considered in the interpretation of array and NGS-based genomic data for precision medicine. Targeted panel sequencing was preferable to WES to define the full spectrum of somatic mutations present in a tumor
An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics
For a decade, The Cancer Genome Atlas (TCGA) program collected clinicopathologic annotation data along with multi-platform molecular profiles of more than 11,000 human tumors across 33 different cancer types. TCGA clinical data contain key features representing the democratized nature of the data collection process. To ensure proper use of this large clinical dataset associated with genomic features, we developed a standardized dataset named the TCGA Pan-Cancer Clinical Data Resource (TCGA-CDR), which includes four major clinical outcome endpoints. In addition to detailing major challenges and statistical limitations encountered during the effort of integrating the acquired clinical data, we present a summary that includes endpoint usage recommendations for each cancer type. These TCGA-CDR findings appear to be consistent with cancer genomics studies independent of the TCGA effort and provide opportunities for investigating cancer biology using clinical correlates at an unprecedented scale. Analysis of clinicopathologic annotations for over 11,000 cancer patients in the TCGA program leads to the generation of TCGA Clinical Data Resource, which provides recommendations of clinical outcome endpoint usage for 33 cancer types
Predicting survival from colorectal cancer histology slides using deep learning: A retrospective multicenter study
BACKGROUND: For virtually every patient with colorectal cancer (CRC), hematoxylin-eosin (HE)-stained tissue slides are available. These images contain quantitative information, which is not routinely used to objectively extract prognostic biomarkers. In the present study, we investigated whether deep convolutional neural networks (CNNs) can extract prognosticators directly from these widely available images.
METHODS AND FINDINGS: We hand-delineated single-tissue regions in 86 CRC tissue slides, yielding more than 100,000 HE image patches, and used these to train a CNN by transfer learning, reaching a nine-class accuracy of >94% in an independent data set of 7,180 images from 25 CRC patients. With this tool, we performed automated tissue decomposition of representative multitissue HE images from 862 HE slides in 500 stage I-IV CRC patients in The Cancer Genome Atlas (TCGA) cohort, a large international multicenter collection of CRC tissue. Based on the output neuron activations in the CNN, we calculated a "deep stroma score," which was an independent prognostic factor for overall survival (OS) in a multivariable Cox proportional hazard model (hazard ratio [HR] with 95% confidence interval [CI]: 1.99 [1.27-3.12], p = 0.0028), while in the same cohort, manual quantification of stromal areas and a gene expression signature of cancer-associated fibroblasts (CAFs) were only prognostic in specific tumor stages. We validated these findings in an independent cohort of 409 stage I-IV CRC patients from the "Darmkrebs: Chancen der Verhütung durch Screening" (DACHS) study who were recruited between 2003 and 2007 in multiple institutions in Germany. Again, the score was an independent prognostic factor for OS (HR 1.63 [1.14-2.33], p = 0.008), CRC-specific OS (HR 2.29 [1.5-3.48], p = 0.0004), and relapse-free survival (RFS; HR 1.92 [1.34-2.76], p = 0.0004). A prospective validation is required before this biomarker can be implemented in clinical workflows.
CONCLUSIONS: In our retrospective study, we show that a CNN can assess the human tumor microenvironment and predict prognosis directly from histopathological images
Spatial Organization and Molecular Correlation of Tumor-Infiltrating Lymphocytes Using Deep Learning on Pathology Images
Beyond sample curation and basic pathologic characterization, the digitized H&E-stained images
of TCGA samples remain underutilized. To highlight this resource, we present mappings of tumor-infiltrating lymphocytes (TILs) based on H&E images from 13 TCGA tumor types. These TIL
maps are derived through computational staining using a convolutional neural network trained to
classify patches of images. Affinity propagation revealed local spatial structure in TIL patterns and
correlation with overall survival. TIL map structural patterns were grouped using standard
histopathological parameters. These patterns are enriched in particular T cell subpopulations
derived from molecular measures. TIL densities and spatial structure were differentially enriched
among tumor types, immune subtypes, and tumor molecular subtypes, implying that spatial
infiltrate state could reflect particular tumor cell aberration states. Obtaining spatial lymphocytic
patterns linked to the rich genomic characterization of TCGA samples demonstrates one use for
the TCGA image archives with insights into the tumor-immune microenvironment
High MYC mRNA expression is more clinically relevant than MYC DNA amplification in triple-negative breast cancer
DNA abnormalities are used in inclusion criteria of clinical trials for treatments with specific targeted molecules. MYC is one of the most powerful oncogenes and is known to be associated with triple-negative breast cancer (TNBC). Its DNA amplification is often part of the targeted DNA-sequencing panels under the assumption of reflecting upregulated signaling. However, it remains unclear if MYC DNA amplification is a surrogate of its upregulated signaling. Thus, we investigated the difference between MYC DNA amplification and mRNA high expression in TNBCs utilizing publicly available cohorts. MYC DNA amplified tumors were found to have various mRNA expression levels, suggesting that MYC DNA amplification does not always result in elevated MYC mRNA expression. Compared to other subtypes, both MYC DNA amplification and mRNA high expression were more frequent in the TNBCs. MYC mRNA high expression, but not DNA amplification, was significantly associated with worse overall survival in the TNBCs. The TNBCs with MYC mRNA high expression enriched MYC target genes, cell cycle related genes, and WNT/β-catenin gene sets, whereas none of them were enriched in MYC DNA amplified TNBCs. In conclusion, MYC mRNA high expression, but not DNA amplification, reflects not only its upregulated signaling pathway, but also clinical significance in TNBCs
Comprehensive transcriptomic analysis of cell lines as models of primary tumors across 22 tumor types.
Cancer cell lines are a cornerstone of cancer research but previous studies have shown that not all cell lines are equal in their ability to model primary tumors. Here we present a comprehensive pan-cancer analysis utilizing transcriptomic profiles from The Cancer Genome Atlas and the Cancer Cell Line Encyclopedia to evaluate cell lines as models of primary tumors across 22 tumor types. We perform correlation analysis and gene set enrichment analysis to understand the differences between cell lines and primary tumors. Additionally, we classify cell lines into tumor subtypes in 9 tumor types. We present our pancreatic cancer results as a case study and find that the commonly used cell line MIA PaCa-2 is transcriptionally unrepresentative of primary pancreatic adenocarcinomas. Lastly, we propose a new cell line panel, the TCGA-110-CL, for pan-cancer studies. This study provides a resource to help researchers select more representative cell line models
Robust Identification of Target Genes and Outliers in Triple-negative Breast Cancer Data
Correct classification of breast cancer sub-types is of high importance as it
directly affects the therapeutic options. We focus on triple-negative breast
cancer (TNBC) which has the worst prognosis among breast cancer types. Using
cutting edge methods from the field of robust statistics, we analyze Breast
Invasive Carcinoma (BRCA) transcriptomic data publicly available from The
Cancer Genome Atlas (TCGA) data portal. Our analysis identifies statistical
outliers that may correspond to misdiagnosed patients. Furthermore, it is
illustrated that classical statistical methods may fail in the presence of
these outliers, prompting the need for robust statistics. Using robust sparse
logistic regression we obtain 36 relevant genes, of which ca. 60% have been
previously reported as biologically relevant to TNBC, reinforcing the validity
of the method. The remaining 14 genes identified are new potential biomarkers
for TNBC. Out of these, JAM3, SFT2D2 and PAPSS1 were previously associated with
breast tumors or other types of cancer. The relevance of these genes is
confirmed by the new DetectDeviatingCells (DDC) outlier detection technique. A
comparison of gene networks on the selected genes showed significant
differences between TNBC and non-TNBC data. The individual role of FOXA1 in
TNBC and non-TNBC, and the strong FOXA1-AGR2 connection in TNBC stand out. Not
only will our results contribute to the breast cancer/TNBC understanding and
ultimately its management, they also show that robust regression and outlier
detection constitute key strategies to cope with high-dimensional clinical data
such as omics data
Chromatin dysregulation and DNA methylation at transcription start sites associated with transcriptional repression in cancers.
Although promoter-associated CpG islands have been established as targets of DNA methylation changes in cancer, previous studies suggest that epigenetic dysregulation outside the promoter region may be more closely associated with transcriptional changes. Here we examine DNA methylation, chromatin marks, and transcriptional alterations to define the relationship between transcriptional modulation and spatial changes in chromatin structure. Using human papillomavirus-related oropharyngeal carcinoma as a model, we show aberrant enrichment of repressive H3K9me3 at the transcriptional start site (TSS) with methylation-associated, tumor-specific gene silencing. Further analysis identifies a hypermethylated subtype which shows a functional convergence on MYC targets and association with CREBBP/EP300 mutation. The tumor-specific shift to transcriptional repression associated with DNA methylation at TSSs was confirmed in multiple tumor types. Our data may show a common underlying epigenetic dysregulation in cancer associated with broad enrichment of repressive chromatin marks and aberrant DNA hypermethylation at TSSs in combination with MYC network activation
