Targeted long-read sequencing reveals clonally expanded HBV-associated chromosomal translocations in patients with chronic hepatitis B
Chronic HBV; Clonal expansion; Targeted sequencing
Background & Aims
HBV infects over 257 million people worldwide and is associated with the development of hepatocellular carcinoma (HCC). Integration of HBV DNA into the host genome is likely a key driver of HCC oncogenesis. Here, we utilise targeted long-read sequencing to determine the structure of HBV DNA integrations as well as full isoform information of HBV mRNA with more accurate quantification than traditional next generation sequencing platforms.
Methods
DNA and RNA were isolated from fresh frozen liver biopsies collected within the GS-US-174-0149 clinical trial. A pan-genotypic panel of biotinylated oligos was developed to enrich for HBV sequences from sheared genomic DNA (∼7 kb) and full-length cDNA libraries from poly-adenylated RNA. Samples were sequenced on the PacBio long-read platform and analysed using a custom bioinformatic pipeline.
Results
HBV-targeted long-read DNA sequencing generated high coverage data spanning entire integrations. Strikingly, in 13 of 42 samples (31%) we were able to detect HBV sequences flanked by 2 different chromosomes, indicating a chromosomal translocation associated with HBV integration. Chromosomal translocations were unique to each biopsy sample, suggesting that each originated randomly, and in some cases had evidence of clonal expansion. Using targeted long-read RNA sequencing, we determined that upwards of 95% of all HBV transcripts in patients who are HBeAg-positive originate from cccDNA. In contrast, patients who are HBeAg-negative expressed mostly HBsAg from integrations.
Conclusions
Targeted Iso-Seq allowed for accurate quantitation of the HBV transcriptome and assignment of transcripts to either cccDNA or integration origins. The existence of multiple unique HBV-associated inter-chromosomal translocations in non-HCC CHB patient liver biopsies suggests a novel mechanism with mutagenic potential that may contribute to progression to HCC.
Nanopore native RNA sequencing of a human poly(A) transcriptome
High-throughput complementary DNA sequencing technologies have advanced our understanding of transcriptome complexity and regulation. However, these methods lose information contained in biological RNA because the copied reads are often short and modifications are not retained. We address these limitations using a native poly(A) RNA sequencing strategy developed by Oxford Nanopore Technologies. Our study generated 9.9 million aligned sequence reads for the human cell line GM12878, using thirty MinION flow cells at six institutions. These native RNA reads had a median length of 771 bases, and a maximum aligned length of over 21,000 bases. Mitochondrial poly(A) reads provided an internal measure of read-length quality. We combined these long nanopore reads with higher accuracy short-reads and annotated GM12878 promoter regions to identify 33,984 plausible RNA isoforms. We describe strategies for assessing 3′ poly(A) tail length, base modifications and transcript haplotypes
Genomic basis for RNA alterations in cancer.
Transcript alterations often result from somatic changes in cancer genomes [1]. Various forms of RNA alterations have been described in cancer, including overexpression [2], altered splicing [3] and gene fusions [4]; however, it is difficult to attribute these to underlying genomic changes owing to heterogeneity among patients and tumour types, and the relatively small cohorts of patients for whom samples have been analysed by both transcriptome and whole-genome sequencing. Here we present, to our knowledge, the most comprehensive catalogue of cancer-associated gene alterations to date, obtained by characterizing tumour transcriptomes from 1,188 donors of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA) [5]. Using matched whole-genome sequencing data, we associated several categories of RNA alterations with germline and somatic DNA alterations, and identified probable genetic mechanisms. Somatic copy-number alterations were the major drivers of variations in total gene and allele-specific expression. We identified 649 associations of somatic single-nucleotide variants with gene expression in cis, of which 68.4% involved associations with flanking non-coding regions of the gene. We found 1,900 splicing alterations associated with somatic mutations, including the formation of exons within introns in proximity to Alu elements. In addition, 82% of gene fusions were associated with structural variants, including 75 of a new class, termed 'bridged' fusions, in which a third genomic location bridges two genes. We observed transcriptomic alteration signatures that differ between cancer types and have associations with variations in DNA mutational signatures. This compendium of RNA alterations in the genomic context provides a rich resource for identifying genes and mechanisms that are functionally implicated in cancer.
High-coverage whole-genome analysis of 1220 cancers reveals hundreds of genes deregulated by rearrangement-mediated cis-regulatory alterations.
The impact of somatic structural variants (SVs) on gene expression in cancer is largely unknown. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole-genome sequencing data and RNA sequencing from a common set of 1220 cancer cases, we report hundreds of genes for which the presence within 100 kb of an SV breakpoint associates with altered expression. For the majority of these genes, expression increases rather than decreases with corresponding breakpoint events. Up-regulated cancer-associated genes impacted by this phenomenon include TERT, MDM2, CDK4, ERBB2, CD274, PDCD1LG2, and IGF2. TERT-associated breakpoints involve ~3% of cases, most frequently in liver biliary, melanoma, sarcoma, stomach, and kidney cancers. SVs associated with up-regulation of PD1 and PDL1 genes involve ~1% of non-amplified cases. For many genes, SVs are significantly associated with increased numbers or greater proximity of enhancer regulatory elements near the gene. DNA methylation near the promoter is often increased with nearby SV breakpoint, which may involve inactivation of repressor elements
Retrospective evaluation of whole exome and genome mutation calls in 746 cancer samples
Funder: NCI U24CA211006
Abstract: The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) curated consensus somatic mutation calls using whole exome sequencing (WES) and whole genome sequencing (WGS), respectively. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2,658 cancers across 38 tumour types, we compare WES and WGS side-by-side from 746 TCGA samples, finding that ~80% of mutations overlap in covered exonic regions. We estimate that low variant allele fraction (VAF < 15%) and clonal heterogeneity contribute up to 68% of private WGS mutations and 71% of private WES mutations. We observe that ~30% of private WGS mutations trace to mutations identified by a single variant caller in WES consensus efforts. WGS captures both ~50% more variation in exonic regions and un-observed mutations in loci with variable GC-content. Together, our analysis highlights technological divergences between two reproducible somatic variant detection efforts.
Investigating cancer-associated pre-mRNA splicing alterations using short and long-read sequencing technologies
Pre-mRNA splicing is a highly regulated step during gene expression and has been shown to be commonly altered across cancers. The basis for splicing alterations and the functional importance of cancer-associated spliced products remain largely unexplored. This work aims to better understand the basis for cancer-associated splicing alterations and their functional importance. We first focus on establishing the genetic basis for cancer-associated splicing alterations. As part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) consortium, we demonstrate the impact of non-coding intronic mutations by using matched whole-genome and RNA-sequencing data across 1,209 primary tumor samples spanning 27 cancer types. We identify intronic sites beyond canonical acceptor and donor dinucleotides that are sensitive to mutations, including the branchpoint consensus sequences, which are typically missed in exome-sequencing-based tumor genotyping. We identify tumor suppressor genes and oncogenes with intronic mutations associated with substantial changes in splicing, and identify previously described alterations in the oncogene EZH2, as well as uncharacterized changes in oncogenes MET and HRAS. Altogether, this work provides the first estimates of the extent to which intronic mutations missed by exome-based genotyping contribute to splicing changes in cancer. The second half of my work reveals the fate and function of spliced products associated with lung adenocarcinoma mutations in the splicing factor U2AF1. We conduct high-throughput long-read cDNA sequencing in isogenic human bronchial epithelial cells with and without the U2AF1 S34F mutation. We demonstrate the utility of our long-read approach for transcriptome studies by identifying 49,366 novel isoforms exclusive to our approach. We show that our long-read data is robust for capturing mutant U2AF1-associated transcriptome alterations by comparing event-level alternative splicing changes with a short-read approach.
We identify isoform-level expression changes in 198 isoforms, including a novel lncRNA, and immune-related genes. Last, we hypothesize a mechanism by which U2AF1 S34F alters translational control of genes through modulating isoform diversity.
RBM25 is a global splicing factor promoting inclusion of alternatively spliced exons and is itself regulated by lysine mono-methylation
In eukaryotes, precursor mRNA (pre-mRNA) splicing removes non-coding intron sequences to produce mature mRNA. This removal is controlled in part by RNA-binding proteins that regulate alternative splicing decisions through interactions with the splicing machinery. RNA binding motif protein 25 (RBM25) is a putative splicing factor strongly conserved across eukaryotic lineages. However, the role of RBM25 in global splicing regulation and its cellular functions are unknown. Here we show that RBM25 is required for the viability of multiple human cell lines, suggesting that it could play a key role in pre-mRNA splicing. Indeed, transcriptome-wide analysis of splicing events demonstrated that RBM25 regulates a large fraction of alternatively spliced exons throughout the human genome. Moreover, proteomic analysis indicated that RBM25 interacts with components of the early spliceosome and regulators of alternative splicing. Previously, we identified an RBM25 species that is mono-methylated at lysine 77 (RBM25K77me1), and here we used quantitative mass spectrometry to show that RBM25K77me1 is abundant in multiple human cell lines. We also identified a region of RBM25 spanning Lys-77 that binds with high affinity to serine- and arginine-rich splicing factor 2 (SRSF2), a crucial protein in exon definition, but only when Lys-77 is unmethylated. Together, our findings uncover a pivotal role for RBM25 as an essential regulator of alternative splicing and reveal a new potential mechanism for regulation of pre-mRNA splicing by lysine methylation of a splicing factor