177 research outputs found

    Improved indel detection in DNA and RNA via realignment with ABRA2

    Motivation: Genomic variant detection from next-generation sequencing has become established as an extremely important component of research and clinical diagnoses in both cancer and Mendelian disorders. Insertions and deletions (indels) are a common source of variation and can frequently impact functionality, thus making their detection vitally important. While substantial effort has gone into detecting indels from DNA, there is still opportunity for improvement. Further, detection of indels from RNA-Seq data has largely been an afterthought and offers another critical area for variant detection. Results: We present here ABRA2, a redesign of the original ABRA implementation that offers support for realignment of both RNA and DNA short reads. The process results in improved accuracy and scalability including support for human whole genomes. Results demonstrate substantial improvement in indel detection for a variety of data types, including those that were not previously supported by ABRA. Further, ABRA2 results in broad improvements to variant calling accuracy across a wide range of post-processing workflows including whole genomes, targeted exomes and transcriptome sequencing

    Genetic determinants of the molecular portraits of epithelial cancers

    The ability to characterize and predict tumor phenotypes is crucial to precision medicine. In this study, we present an integrative computational approach using a genome-wide association analysis and an Elastic Net prediction method to analyze the relationship between DNA copy number alterations and an archive of gene expression signatures. Across breast cancers, we are able to quantitatively predict many gene signature levels within individual tumors with high accuracy based upon DNA copy number features alone, including proliferation status and estrogen-signaling pathway activity. We can also predict many other key phenotypes, including intrinsic molecular subtypes, estrogen receptor status, and TP53 mutation. This approach is also applied to TCGA Pan-Cancer, which identifies repeatedly predictable signatures across tumor types including immune features in lung squamous and basal-like breast cancers. These Elastic Net DNA predictors could also be called from DNA-based gene panels, thus facilitating their use as biomarkers to guide therapeutic decision making

    High reproducibility using sodium hydroxide-stripped long oligonucleotide DNA microarrays

    Recently, long oligonucleotide (60- to 70-mer) microarrays for two-color experiments have been developed and are gaining widespread use. In addition, when there is limited availability of mRNA from tissue sources, RNA amplification can be, and is being, used to produce sufficient quantities of cRNA for microarray hybridization. Taking advantage of the selective degradation of RNA under alkaline conditions, we have developed a method to "strip" glass-based oligonucleotide microarrays that use fluorescent RNA in the hybridization, while leaving the DNA oligonucleotide probes intact and usable for a second experiment. Replicate microarray experiments conducted using stripped arrays showed high reproducibility; however, we found that arrays could only be stripped and reused once without compromising data quality. The intraclass correlation (ICC) between a virgin array and a stripped array hybridized with the same sample showed a range of 0.90-0.98, which is comparable to the ICC of two virgin arrays hybridized with the same sample. Using this method, once-stripped oligonucleotide microarrays are usable, reliable, and help to reduce costs
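
    The intraclass correlation mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say which ICC form was used, so the one-way random-effects form, ICC(1,1), is assumed here, and the function name `icc_oneway` is hypothetical. Each row holds the measurements of one probe across the arrays being compared.

```python
from typing import Sequence


def icc_oneway(measurements: Sequence[Sequence[float]]) -> float:
    """One-way random-effects ICC(1,1); an assumed form, not taken from the paper.

    measurements[i][j] is the value for probe i on array j.
    """
    n = len(measurements)        # number of probes (targets)
    k = len(measurements[0])     # number of arrays per probe
    grand = sum(sum(row) for row in measurements) / (n * k)
    row_means = [sum(row) / k for row in measurements]
    # Mean square between probes.
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    # Mean square within probes (disagreement between arrays).
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(measurements, row_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Two arrays that agree perfectly on three probes give an ICC of 1.0.
perfect = icc_oneway([(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)])
```

    Values near 1 mean that most of the variance lies between probes rather than between arrays, which is the sense in which the abstract describes a stripped array as comparable to a virgin one.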

    Amplification of SOX4 promotes PI3K/Akt signaling in human breast cancer

    Purpose: The PI3K/Akt signaling axis contributes to the dysregulation of many dominant features in breast cancer including cell proliferation, survival, metabolism, motility, and genomic instability. While multiple studies have demonstrated that basal-like or triple-negative breast tumors have uniformly high PI3K/Akt activity, genomic alterations that mediate dysregulation of this pathway in this subset of highly aggressive breast tumors remain to be determined. Methods: In this study, we present an integrated genomic analysis based on the use of a PI3K gene expression signature as a framework to analyze orthogonal genomic data from human breast tumors, including RNA expression, DNA copy number alterations, and protein expression. In combination with data from a genome-wide RNA-mediated interference screen in human breast cancer cell lines, we identified essential genetic drivers of PI3K/Akt signaling. Results: Our in silico analyses identified SOX4 amplification as a novel modulator of PI3K/Akt signaling in breast cancers and in vitro studies confirmed its role in regulating Akt phosphorylation. Conclusions: Taken together, these data establish a role for SOX4-mediated PI3K/Akt signaling in breast cancer and suggest that SOX4 may represent a novel therapeutic target and/or biomarker for current PI3K family therapies

    A pan-cancer analysis of the frequency of DNA alterations across cell cycle activity levels

    Pan-cancer genomic analyses based on the magnitude of pathway activity are currently lacking. Focusing on the cell cycle, we examined the DNA mutations and chromosome arm-level aneuploidy within tumours with low, intermediate and high cell-cycle activity in 9515 pan-cancer patients with 32 different tumour types. Boxplots showed that cell-cycle activity varied broadly across and within all cancers. TP53 and PIK3CA mutations were common in all cell cycle score (CCS) tertiles but with increasing frequency as cell-cycle activity levels increased (P < 0.001). Mutations in BRAF and gains in 16p were less frequent in CCS High tumours (P < 0.001). In Kaplan–Meier analysis, patients whose tumours were CCS Low had a longer Progression Free Interval (PFI) relative to Intermediate or High (P < 0.001) and this significance remained in multivariable analysis (CCS Intermediate: HR = 1.37; 95% CI 1.17–1.60, CCS High: 1.54; 1.29–1.84, CCS Low = Ref). These results demonstrate that whilst similar DNA alterations can be found at all cell-cycle activity levels, some notable exceptions exist. Moreover, independent prognostic information can be derived on a pan-cancer level from a simple measure of cell-cycle activity
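
    The split into cell cycle score (CCS) tertiles described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the score is computed or how ties are handled, so the function name `assign_tertiles`, the rank-based cut, and the sample identifiers are all hypothetical.

```python
from typing import Dict


def assign_tertiles(scores: Dict[str, float]) -> Dict[str, str]:
    """Label each sample Low / Intermediate / High by rank-based tertile.

    Assumed scheme: samples are ranked by score and cut into three
    groups of (nearly) equal size; ties keep their input order.
    """
    ordered = sorted(scores, key=scores.get)
    n = len(ordered)
    labels: Dict[str, str] = {}
    for rank, sample in enumerate(ordered):
        if rank < n / 3:
            labels[sample] = "Low"
        elif rank < 2 * n / 3:
            labels[sample] = "Intermediate"
        else:
            labels[sample] = "High"
    return labels


# Hypothetical scores for six tumours.
example = assign_tertiles(
    {"t1": 0.1, "t2": 0.9, "t3": 0.4, "t4": 0.5, "t5": 0.2, "t6": 0.8}
)
```

    With labels of this kind, each tertile can be compared for mutation frequency or used as a categorical covariate in a survival model, with CCS Low as the reference group as in the abstract.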

    Anti-PD-1 Checkpoint Therapy Can Promote the Function and Survival of Regulatory T Cells

    We have previously shown in a model of claudin-low breast cancer that regulatory T cells (Tregs) are increased in the tumor microenvironment (TME) and express high levels of PD-1. In mouse models and patients with triple-negative breast cancer, it is postulated that one cause for the lack of activity of anti-PD-1 therapy is the activation of PD-1-expressing Tregs in the TME. We hypothesized that the expression of PD-1 on Tregs would lead to enhanced suppressive function of Tregs and worsen antitumor immunity during PD-1 blockade. To evaluate this, we isolated Tregs from claudin-low tumors and functionally evaluated them ex vivo. We compared transcriptional profiles of Tregs isolated from tumor-bearing mice with or without anti-PD-1 therapy using RNA sequencing. We found several genes associated with survival and proliferation pathways; for example, Jun, Fos, and Bcl2 were significantly upregulated in Tregs exposed to anti-PD-1 treatment. Based on these data, we hypothesized that anti-PD-1 treatment on Tregs results in a prosurvival phenotype. Indeed, Tregs exposed to PD-1 blockade had significantly higher levels of Bcl-2 expression, and this led to increased protection from glucocorticoid-induced apoptosis. In addition, we found in vitro and in vivo that Tregs in the presence of anti-PD-1 proliferated more than control Tregs. PD-1 blockade significantly increased the suppressive activity of Tregs at biologically relevant Treg/Tnaive cell ratios. Altogether, we show that this immunotherapy blockade increases proliferation, protection from apoptosis, and suppressive capabilities of Tregs, thus leading to enhanced immunosuppression in the TME

    Assembly-based inference of B-cell receptor repertoires from short read RNA sequencing data with V'DJer

    Motivation: B-cell receptor (BCR) repertoire profiling is an important tool for understanding the biology of diverse immunologic processes. Current methods for analyzing adaptive immune receptor repertoires depend upon PCR amplification of VDJ rearrangements followed by long read amplicon sequencing spanning the VDJ junctions. While this approach has proven to be effective, it is frequently not feasible due to cost or limited sample material. Additionally, there are many existing datasets where short-read RNA sequencing data are available but PCR amplified BCR data are not. Results: We present here V'DJer, an assembly-based method that reconstructs adaptive immune receptor repertoires from short-read RNA sequencing data. This method captures expressed BCR loci from a standard RNA-seq assay. We applied this method to 473 Melanoma samples from The Cancer Genome Atlas and demonstrate V'DJer's ability to accurately reconstruct BCR repertoires from short read mRNA-seq data

    Virus expression detection reveals RNA-sequencing contamination in TCGA

    Background: Contamination of reagents and cross contamination across samples is a long-recognized issue in molecular biology laboratories. While often innocuous, contamination can lead to inaccurate results. Cantalupo et al., for example, found HeLa-derived human papillomavirus 18 (H-HPV18) in several of The Cancer Genome Atlas (TCGA) RNA-sequencing samples. This work motivated us to assess a greater number of samples and determine the origin of possible contaminations using viral sequences. To detect viruses with high specificity, we developed the publicly available workflow, VirDetect, that detects virus and laboratory vector sequences in RNA-seq samples. We applied VirDetect to 9143 RNA-seq samples sequenced at one TCGA sequencing center (28/33 cancer types) over 5 years. Results: We confirmed that H-HPV18 was present in many samples and determined that viral transcripts from H-HPV18 significantly co-occurred with those from xenotropic mouse leukemia virus-related virus (XMRV). Using laboratory metadata and viral transcription, we determined that the likely contaminant was a pool of cell lines known as the "common reference", which was sequenced alongside TCGA RNA-seq samples as a control to monitor quality across technology transitions (i.e. microarray to GAII to HiSeq), and to link RNA-seq to previous generation microarrays that standardly used the "common reference". One of the cell lines in the pool was a laboratory isolate of MCF-7, which we discovered was infected with XMRV; another constituent of the pool was likely HeLa cells. Conclusions: Altogether, this indicates a multi-step contamination process. First, MCF-7 was infected with an XMRV. Second, this infected cell line was added to a pool of cell lines, which contained HeLa. Finally, RNA from this pool of cell lines contaminated several TCGA tumor samples, most likely during library construction. Thus, these human tumors with H-HPV or XMRV reads were likely not infected with H-HPV18 or XMRV

    Alterations in Wnt- and/or STAT3 signaling pathways and the immune microenvironment during metastatic progression

    Metastatic breast cancer is an extremely complex disease with limited treatment options due to the lack of information about the major characteristics of metastatic disease. There is an urgent need, therefore, to understand the changes in cellular complexity and dynamics that occur during metastatic progression. In the current study, we analyzed the cellular and molecular differences between primary tumors and paired lung metastases using a syngeneic p53-null mammary tumor model of basal-like breast cancer. Distinct subpopulations driven by the Wnt- and/or STAT3 signaling pathways were detected in vivo using a lentiviral Wnt- and STAT3 signaling reporter system. A significant increase in the overlapping populations driven by both the Wnt- and STAT3 signaling pathways was observed in the lung metastases as compared to the primary tumors. Furthermore, the overlapping populations showed a higher metastatic potential relative to the other populations and pharmacological inhibition of both signaling pathways was shown to markedly reduce the metastatic lesions in established lung metastases. An analysis of the unique molecular features of the lung metastases revealed a significant association with immune response signatures. Specifically, Foxp3 gene expression was markedly increased and elevated levels of Foxp3+ Treg cells were detected in close proximity to lung metastases. Collectively, these studies illustrate the importance of analyzing intratumoral heterogeneity, changes in population dynamics, and the immune microenvironment during metastatic progression

    A framework for transcriptome-wide association studies in breast cancer in diverse study populations

    Background: The relationship between germline genetic variation and breast cancer survival is largely unknown, especially in understudied minority populations who often have poorer survival. Genome-wide association studies (GWAS) have interrogated breast cancer survival but often are underpowered due to subtype heterogeneity and clinical covariates and detect loci in non-coding regions that are difficult to interpret. Transcriptome-wide association studies (TWAS) show increased power in detecting functionally relevant loci by leveraging expression quantitative trait loci (eQTLs) from external reference panels in relevant tissues. However, ancestry- or race-specific reference panels may be needed to draw correct inference in ancestrally diverse cohorts. Such panels for breast cancer are lacking. Results: We provide a framework for TWAS for breast cancer in diverse populations, using data from the Carolina Breast Cancer Study (CBCS), a population-based cohort that oversampled black women. We perform eQTL analysis for 406 breast cancer-related genes to train race-stratified predictive models of tumor expression from germline genotypes. Using these models, we impute expression in independent data from CBCS and TCGA, accounting for sampling variability in assessing performance. These models are not applicable across race, and their predictive performance varies across tumor subtype. Within CBCS (N = 3,828), at a false discovery-adjusted significance of 0.10 and stratifying for race, we identify associations in black women near AURKA, CAPN13, PIK3CA, and SERPINB5 via TWAS that are underpowered in GWAS. Conclusions: We show that carefully implemented and thoroughly validated TWAS is an efficient approach for understanding the genetics underpinning breast cancer outcomes in diverse populations