114 research outputs found
Genetic determinants of the molecular portraits of epithelial cancers
The ability to characterize and predict tumor phenotypes is crucial to precision medicine. In this study, we present an integrative computational approach using a genome-wide association analysis and an Elastic Net prediction method to analyze the relationship between DNA copy number alterations and an archive of gene expression signatures. Across breast cancers, we are able to quantitatively predict many gene signature levels within individual tumors with high accuracy based upon DNA copy number features alone, including proliferation status and estrogen-signaling pathway activity. We can also predict many other key phenotypes, including intrinsic molecular subtypes, estrogen receptor status, and TP53 mutation. This approach is also applied to TCGA Pan-Cancer data, which identifies repeatedly predictable signatures across tumor types, including immune features in lung squamous and basal-like breast cancers. These Elastic Net DNA predictors could also be called from DNA-based gene panels, thus facilitating their use as biomarkers to guide therapeutic decision making
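As a rough illustration of the Elastic Net idea named in this abstract, the sketch below fits an Elastic Net regression by cyclic coordinate descent on synthetic data. The function name `elastic_net_cd`, the simulated features, and the penalty settings (`alpha`, `l1_ratio`) are assumptions made for illustration; this is not the authors' pipeline and uses no real copy number data.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator that handles the L1 part of the penalty."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def elastic_net_cd(X, y, alpha=0.1, l1_ratio=0.5, n_iter=200):
    """Minimize (1/2n)||y - Xb||^2 + alpha*l1_ratio*||b||_1
    + 0.5*alpha*(1 - l1_ratio)*||b||_2^2 by cyclic coordinate descent."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n   # per-feature mean squared value
    resid = y - X @ b                   # current residual
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the residual that excludes feature j
            rho = X[:, j] @ resid / n + col_sq[j] * b[j]
            new_bj = soft_threshold(rho, alpha * l1_ratio) / (
                col_sq[j] + alpha * (1 - l1_ratio)
            )
            resid += X[:, j] * (b[j] - new_bj)  # keep residual in sync
            b[j] = new_bj
    return b

# Synthetic stand-in: 20 hypothetical features, only the first 3 carry signal
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 20))
true_b = np.zeros(20)
true_b[:3] = [2.0, -1.5, 1.0]
y = X @ true_b + 0.1 * rng.standard_normal(200)

coef = elastic_net_cd(X, y)
```

The L1 term drives most uninformative coefficients to exactly zero, while the L2 term stabilizes the fit when features are correlated, which is the usual motivation for Elastic Net with high-dimensional genomic features.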
Race and Ancestry in Immune Response to Breast Cancer
Martini and colleagues performed genetic ancestry estimation on a unique international triple-negative breast cancer (TNBC) study enriched for participants with African ancestry. They identified gene signatures indicative of ancestry in race-associated TNBC and found ancestry-associated immunologic differences that may contribute to racial disparities in breast cancer
Bidimensional linked matrix factorization for pan-omics pan-cancer analysis
Several modern applications require the integration of multiple large data matrices that have shared rows and/or columns. For example, cancer studies that integrate multiple omics platforms across multiple types of cancer, pan-omics pan-cancer analysis, have extended our knowledge of molecular heterogeneity beyond what was observed in single tumor and single platform studies. However, these studies have been limited by available statistical methodology. We propose a flexible approach to the simultaneous factorization and decomposition of variation across such bidimensionally linked matrices, BIDIFAC+. BIDIFAC+ decomposes variation into a series of low-rank components that may be shared across any number of row sets (e.g., omics platforms) or column sets (e.g., cancer types). This builds on a growing literature for the factorization and decomposition of linked matrices which has primarily focused on multiple matrices that are linked in one dimension (rows or columns) only. Our objective function extends nuclear norm penalization, is motivated by random matrix theory, gives a unique decomposition under relatively mild conditions, and can be shown to give the mode of a Bayesian posterior distribution. We apply BIDIFAC+ to pan-omics pan-cancer data from TCGA, identifying shared and specific modes of variability across four different omics platforms and 29 different cancer types
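To make the nuclear norm penalization mentioned in this abstract more concrete, the sketch below shows the standard building block for such penalties: soft-thresholding the singular values of a single matrix, which solves min_X 0.5*||Y - X||_F^2 + lam*||X||_*. This is only that one-matrix step, not the BIDIFAC+ algorithm. The helper name `svd_soft_threshold`, the simulated data, and the threshold sqrt(m) + sqrt(n) (roughly the largest singular value of an m-by-n standard Gaussian noise matrix) are assumptions for illustration.

```python
import numpy as np

def svd_soft_threshold(Y, lam):
    """Proximal operator of the nuclear norm: shrink each singular value by lam."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    s_shrunk = np.maximum(s - lam, 0.0)   # singular values below lam become zero
    return (U * s_shrunk) @ Vt, s_shrunk

# Simulated rank-2 signal plus unit-variance Gaussian noise
rng = np.random.default_rng(1)
m, n = 50, 40
signal = rng.standard_normal((m, 2)) @ rng.standard_normal((2, n))
Y = signal + rng.standard_normal((m, n))

# Assumed threshold motivated by random matrix theory for unit-variance noise
lam = np.sqrt(m) + np.sqrt(n)
X_hat, kept = svd_soft_threshold(Y, lam)
```

Components whose singular values fall below the threshold are removed entirely, so the penalty selects a low-rank structure without the rank being fixed in advance.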
A hierarchical spike-and-slab model for pan-cancer survival using pan-omic data
Background: Pan-omics, pan-cancer analysis has advanced our understanding of the molecular heterogeneity of cancer. However, such analyses have been limited in their ability to use information from multiple sources of data (e.g., omics platforms) and multiple sample sets (e.g., cancer types) to predict clinical outcomes. We address the issue of prediction across multiple high-dimensional sources of data and sample sets by using molecular patterns identified by BIDIFAC+, a method for integrative dimension reduction of bidimensionally-linked matrices, in a Bayesian hierarchical model. Our model performs variable selection through spike-and-slab priors that borrow information across clustered data. We use this model to predict overall patient survival from The Cancer Genome Atlas with data from 29 cancer types and 4 omics sources and use simulations to characterize the performance of the hierarchical spike-and-slab prior. Results: We found that molecular patterns shared across all or most cancers were largely not predictive of survival. However, our model selected patterns unique to subsets of cancers that differentiate clinical tumor subtypes with markedly different survival outcomes. Some of these subtypes were previously established, such as subtypes of uterine corpus endometrial carcinoma, while others may be novel, such as subtypes within a set of kidney carcinomas. Through simulations, we found that the hierarchical spike-and-slab prior performs best in terms of variable selection accuracy and predictive power when borrowing information is advantageous, but also offers competitive performance when it is not. Conclusions: We address the issue of prediction across multiple sources of data by using results from BIDIFAC+ in a Bayesian hierarchical model for overall patient survival. By incorporating spike-and-slab priors that borrow information across cancers, we identified molecular patterns that distinguish clinical tumor subtypes within a single cancer and within a group of cancers. We also corroborate the flexibility and performance of using spike-and-slab priors as a Bayesian variable selection approach
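The variable selection behavior of a spike-and-slab prior can be sketched for a single coefficient. The example below assumes a deliberately simplified, non-hierarchical setup: an estimate `beta_hat` with known standard error `se`, a point-mass spike at zero, and a zero-mean normal slab. The function names, `slab_sd`, and `prior_incl` are hypothetical and chosen for illustration; the hierarchical prior described in the abstract additionally shares information across cancer types, which this sketch does not attempt.

```python
import math

def normal_pdf(x, var):
    """Density of a zero-mean normal distribution with variance var."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2 * math.pi * var)

def inclusion_probability(beta_hat, se, slab_sd=1.0, prior_incl=0.5):
    """Posterior probability that the coefficient comes from the slab.

    Spike: beta = 0 exactly, so beta_hat ~ N(0, se^2).
    Slab:  beta ~ N(0, slab_sd^2), so beta_hat ~ N(0, se^2 + slab_sd^2).
    """
    slab = prior_incl * normal_pdf(beta_hat, se ** 2 + slab_sd ** 2)
    spike = (1 - prior_incl) * normal_pdf(beta_hat, se ** 2)
    return slab / (slab + spike)

p_null = inclusion_probability(0.0, se=0.1)    # estimate at zero
p_signal = inclusion_probability(1.0, se=0.1)  # estimate ten standard errors from zero
```

An estimate near zero is attributed mostly to the spike and is effectively dropped, while an estimate far from zero relative to its standard error is attributed to the slab and retained.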
A Pan-Cancer and Polygenic Bayesian Hierarchical Model for the Effect of Somatic Mutations on Survival
We built a novel Bayesian hierarchical survival model based on the somatic mutation profile of patients across 50 genes and 27 cancer types. The pan-cancer quality allows for the model to “borrow” information across cancer types, motivated by the assumption that similar mutation profiles may have similar (but not necessarily identical) effects on survival across different tissues of origin or tumor types. The effect of a mutation at each gene was allowed to vary by cancer type, whereas the mean effect of each gene was shared across cancers. Within this framework, we considered 4 parametric survival models (normal, log-normal, exponential, and Weibull), and we compared their performance via a cross-validation approach in which we fit each model on training data and estimate the log-posterior predictive likelihood on test data. The log-normal model gave the best fit, and we investigated the partial effect of each gene on survival via a forward selection procedure. Through this we determined that mutations at TP53 and FAT4 were together the most useful for predicting patient survival. We validated the model via simulation to ensure that our algorithm for posterior computation gave nominal coverage rates. The code used for this analysis can be found at https://github.com/sarahsamorodnitsky/Pan-Cancer-Survival-Modeling.git, and the results are summarized at http://ericfrazerlock.com/surv_figs/SurvivalDisplay.html
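The model comparison described in this abstract (fit each parametric model on training data, then score it by predictive log-likelihood on test data) can be sketched in a much simplified form. The example below assumes fully observed (uncensored) simulated survival times, compares only two of the four families, and uses plug-in maximum likelihood estimates rather than a Bayesian posterior predictive likelihood. All function names and the simulated data are assumptions for illustration, not the authors' code.

```python
import math
import random

def fit_exponential(times):
    """Maximum likelihood estimate of the exponential rate."""
    return len(times) / sum(times)

def loglik_exponential(times, rate):
    return sum(math.log(rate) - rate * t for t in times)

def fit_lognormal(times):
    """Maximum likelihood estimates of the mean and sd of log(time)."""
    logs = [math.log(t) for t in times]
    mu = sum(logs) / len(logs)
    var = sum((x - mu) ** 2 for x in logs) / len(logs)
    return mu, math.sqrt(var)

def loglik_lognormal(times, mu, sigma):
    return sum(
        -math.log(t * sigma * math.sqrt(2 * math.pi))
        - (math.log(t) - mu) ** 2 / (2 * sigma ** 2)
        for t in times
    )

# Simulated, uncensored survival times drawn from a log-normal distribution
rng = random.Random(0)
times = [rng.lognormvariate(1.0, 0.5) for _ in range(600)]
train, test = times[:400], times[400:]

# Fit on the training split, score on the held-out split
rate = fit_exponential(train)
mu, sigma = fit_lognormal(train)
scores = {
    "exponential": loglik_exponential(test, rate),
    "log-normal": loglik_lognormal(test, mu, sigma),
}
best = max(scores, key=scores.get)
```

Scoring on held-out data rather than training data penalizes models that fit noise, which is why a held-out predictive likelihood is a reasonable basis for choosing among survival distributions. A realistic analysis would also need to handle censored observations.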
Amplification of SOX4 promotes PI3K/Akt signaling in human breast cancer
Purpose: The PI3K/Akt signaling axis contributes to the dysregulation of many dominant features in breast cancer including cell proliferation, survival, metabolism, motility, and genomic instability. While multiple studies have demonstrated that basal-like or triple-negative breast tumors have uniformly high PI3K/Akt activity, genomic alterations that mediate dysregulation of this pathway in this subset of highly aggressive breast tumors remain to be determined. Methods: In this study, we present an integrated genomic analysis based on the use of a PI3K gene expression signature as a framework to analyze orthogonal genomic data from human breast tumors, including RNA expression, DNA copy number alterations, and protein expression. In combination with data from a genome-wide RNA-mediated interference screen in human breast cancer cell lines, we identified essential genetic drivers of PI3K/Akt signaling. Results: Our in silico analyses identified SOX4 amplification as a novel modulator of PI3K/Akt signaling in breast cancers and in vitro studies confirmed its role in regulating Akt phosphorylation. Conclusions: Taken together, these data establish a role for SOX4-mediated PI3K/Akt signaling in breast cancer and suggest that SOX4 may represent a novel therapeutic target and/or biomarker for current PI3K family therapies
Virus expression detection reveals RNA-sequencing contamination in TCGA
Background: Contamination of reagents and cross contamination across samples is a long-recognized issue in molecular biology laboratories. While often innocuous, contamination can lead to inaccurate results. Cantalupo et al., for example, found HeLa-derived human papillomavirus 18 (H-HPV18) in several of The Cancer Genome Atlas (TCGA) RNA-sequencing samples. This work motivated us to assess a greater number of samples and determine the origin of possible contaminations using viral sequences. To detect viruses with high specificity, we developed the publicly available workflow, VirDetect, that detects virus and laboratory vector sequences in RNA-seq samples. We applied VirDetect to 9143 RNA-seq samples sequenced at one TCGA sequencing center (28/33 cancer types) over 5 years. Results: We confirmed that H-HPV18 was present in many samples and determined that viral transcripts from H-HPV18 significantly co-occurred with those from xenotropic mouse leukemia virus-related virus (XMRV). Using laboratory metadata and viral transcription, we determined that the likely contaminant was a pool of cell lines known as the "common reference", which was sequenced alongside TCGA RNA-seq samples as a control to monitor quality across technology transitions (i.e., microarray to GAII to HiSeq), and to link RNA-seq to previous generation microarrays that standardly used the "common reference". One of the cell lines in the pool was a laboratory isolate of MCF-7, which we discovered was infected with XMRV; another constituent of the pool was likely HeLa cells. Conclusions: Altogether, this indicates a multi-step contamination process. First, MCF-7 was infected with an XMRV. Second, this infected cell line was added to a pool of cell lines, which contained HeLa. Finally, RNA from this pool of cell lines contaminated several TCGA tumor samples, most likely during library construction. Thus, these human tumors with H-HPV18 or XMRV reads were likely not infected with H-HPV18 or XMRV
Prognostic value of B cells in cutaneous melanoma
Background: Measures of the adaptive immune response have prognostic and predictive associations in melanoma and other cancer types. Specifically, intratumoral T cell density and function have considerable prognostic and predictive value in skin cutaneous melanoma (SKCM). Less is known about the significance of tumor-infiltrating B cells in SKCM. Our goal was to understand the prognostic and predictive value of B cell phenotypic subsets in SKCM using RNA sequencing. Methods: We used our previously published algorithm, V'DJer, to assemble B cell receptor (BCR) repertoires and estimate diversity from short-read RNA sequencing (RNA-seq). We applied machine learning-based cellular phenotype classifiers to measure relative similarity of bulk tumor sample gene expression profiles and different B cell phenotypes. We assessed these aspects of B cell biology in 473 SKCM from the Cancer Genome Atlas Project (TCGA) as well as in RNA-seq data corresponding to tumor samples procured from patients who received CTLA-4 and PD-1 inhibitors for metastatic SKCM. Results: We found that the BCR repertoire was associated with different clinical factors, such as tumor tissue site and sex. However, increased clonality of the BCR repertoire was favorably prognostic in SKCM and remained prognostic even after first conditioning on various clinical factors. Mutation burden was not correlated with any BCR measurement, and no specific mutation was associated with an altered BCR repertoire. Lack of an assembled BCR in pre-treatment tumor tissues was associated with a lack of anti-tumor response to a CTLA-4 inhibitor in metastatic SKCM. Conclusions: These findings suggest an important prognostic and predictive role for B cell characteristics in SKCM. This has implications for melanoma immunobiology and potential development of immunogenomics features to predict survival and response to immunotherapy
Age at diagnosis, obesity, smoking, and molecular subtypes in muscle-invasive bladder cancer
Background: Heterogeneity of muscle-invasive bladder cancer (MIBC) has been characterized using whole-genome mRNA expression data, showing distinct molecular and clinicopathological characteristics by subtypes. However, associations between risk factors and molecular subtypes have not been reported. Methods: Four previously published schemes were used to categorize molecular subtypes in 372 MIBC patients from The Cancer Genome Atlas (TCGA). Data on gene expression (RNA-seq), demographic, and clinicopathological characteristics were retrieved through the TCGA data portal. Polytomous logistic regression was used to estimate the associations of subtypes by different schemes with age at diagnosis, obesity, and smoking. Results: While some quantitative variation was evident, distinct molecular subtype schemes showed considerable consistency in the association with the risk factors. Generally, compared to patients with luminal-like tumors, patients with basal-like subtypes were more likely to be older (OR for 75+ years vs. <60 years, range = 1.32–2.89), obese (OR for obese vs. normal, range = 1.30–3.05), and to start smoking at an early age (OR for <18 years vs. 25+ years, range = 1.11–4.57). Conclusions: Different molecular subtypes of MIBC may have distinct risk profiles. Large population-based studies with detailed information on bladder cancer risk factors are needed to further define etiologic heterogeneity for bladder cancer
FOXM1 Deubiquitination by USP21 Regulates Cell Cycle Progression and Paclitaxel Sensitivity in Basal-like Breast Cancer
The cell cycle transcription factor FOXM1 is activated in basal-like breast cancer (BLBC) and associated with therapeutic resistance and poor patient outcomes. Arceci et al. show USP21 antagonizes FOXM1 degradation, thereby promoting proliferation and paclitaxel resistance. USP21 is catalytically active and recurrently overexpressed in BLBC, representing a potential therapeutic target. © 2019 The Author(s). The transcription factor FOXM1 contributes to cell cycle progression and is significantly upregulated in basal-like breast cancer (BLBC). Despite its importance in normal and cancer cell cycles, we lack a complete understanding of mechanisms that regulate FOXM1. We identified USP21 in an RNAi-based screen for deubiquitinases that control FOXM1 abundance. USP21 increases the stability of FOXM1, and USP21 binds and deubiquitinates FOXM1 in vivo and in vitro, indicating a direct enzyme-substrate relationship. Depleting USP21 downregulates the FOXM1 transcriptional network and causes a significant delay in cell cycle progression. Significantly, USP21 depletion sensitized BLBC cell lines and mouse xenograft tumors to paclitaxel, an anti-mitotic, frontline therapy in BLBC treatment. USP21 is the most frequently amplified deubiquitinase in BLBC patient tumors, and its amplification co-occurs with the upregulation of FOXM1 protein. Altogether, these data suggest a role for USP21 in the proliferation and potentially treatment of FOXM1-high, USP21-high BLBC