122 research outputs found

    Somatic retrotransposition in human cancer revealed by whole-genome and exome sequencing

    Retrotransposons constitute a major source of genetic variation, and somatic retrotransposon insertions have been reported in cancer. Here, we applied TranspoSeq, a computational framework that identifies retrotransposon insertions from sequencing data, to whole genomes from 200 tumor/normal pairs across 11 tumor types as part of The Cancer Genome Atlas (TCGA) Pan-Cancer Project. In addition to novel germline polymorphisms, we find 810 somatic retrotransposon insertions primarily in lung squamous, head and neck, colorectal, and endometrial carcinomas. Many somatic retrotransposon insertions occur in known cancer genes. We find that high somatic retrotransposition rates in tumors are associated with high rates of genomic rearrangement and somatic mutation. Finally, we developed TranspoSeq-Exome to interrogate an additional 767 tumor samples with hybrid-capture exome data and discovered 35 novel somatic retrotransposon insertions into exonic regions, including an insertion into an exon of the PTEN tumor suppressor gene. The results of this large-scale, comprehensive analysis of retrotransposon movement across tumor types suggest that somatic retrotransposon insertions may represent an important class of structural variation in cancer. National Cancer Institute (U.S.) (grant U24CA143867); National Cancer Institute (U.S.) (grant U24CA126546)

    Whole-genome and multisector exome sequencing of primary and post-treatment glioblastoma reveals patterns of tumor evolution

    Glioblastoma (GBM) is a prototypical heterogeneous brain tumor refractory to conventional therapy. A small residual population of cells escapes surgery and chemoradiation, resulting in a typically fatal tumor recurrence ~7 months after diagnosis. Understanding the molecular architecture of this residual population is critical for the development of successful therapies. We used whole-genome sequencing and whole-exome sequencing of multiple sectors from primary and paired recurrent GBM tumors to reconstruct the genomic profile of residual, therapy-resistant tumor-initiating cells. We found that genetic alteration of the p53 pathway is a primary molecular event predictive of a high number of subclonal mutations in glioblastoma. The genomic road leading to recurrence is highly idiosyncratic but can be broadly classified into linear recurrences, which share extensive genetic similarity with the primary tumor and can be directly traced to one of its specific sectors, and divergent recurrences, which share few genetic alterations with the primary tumor and originate from cells that branched off early during tumorigenesis. Our study provides mechanistic insights into how genetic alterations in primary tumors impact the ensuing evolution of tumor cells and the emergence of subclonal heterogeneity.

    Colon cancer-derived oncogenic EGFR G724S mutant identified by whole genome sequence analysis is dependent on asymmetric dimerization and sensitive to cetuximab

    Background: Inhibition of the activated epidermal growth factor receptor (EGFR) with either enzymatic kinase inhibitors or anti-EGFR antibodies such as cetuximab is an effective modality of treatment for multiple human cancers. Enzymatic EGFR inhibitors are effective for lung adenocarcinomas with somatic kinase domain EGFR mutations while, paradoxically, anti-EGFR antibodies are more effective in colon and head and neck cancers, where EGFR mutations occur less frequently. In colorectal cancer, anti-EGFR antibodies are routinely used as second-line therapy of KRAS wild-type tumors. However, detailed mechanisms and genomic predictors for pharmacological response to these antibodies in colon cancer remain unclear. Findings: We describe a case of colorectal adenocarcinoma that was found, through whole genome sequencing, to harbor a kinase domain mutation, G724S, in EGFR. We show that G724S mutant EGFR is oncogenic and that it differs from classic lung cancer-derived EGFR mutants in that it is cetuximab responsive in vitro, yet relatively insensitive to small-molecule kinase inhibitors. Through biochemical and cellular pharmacologic studies, we have determined that cells harboring the colon cancer-derived G719S and G724S mutants are responsive to cetuximab therapy in vitro and found that the requirement for asymmetric dimerization of these mutant EGFR to promote cellular transformation may explain their greater inhibition by cetuximab than by small-molecule kinase inhibitors. Conclusion: The colon cancer-derived G719S and G724S mutants are oncogenic and sensitive in vitro to cetuximab. These data suggest that patients with these mutations may benefit from the use of anti-EGFR antibodies as part of the first-line therapy.

    Genetic Mapping and Exome Sequencing Identify Variants Associated with Five Novel Diseases

    The Clinic for Special Children (CSC) has integrated biochemical and molecular methods into a rural pediatric practice serving Old Order Amish and Mennonite (Plain) children. Among the Plain people, we have used single nucleotide polymorphism (SNP) microarrays to genetically map recessive disorders to large autozygous haplotype blocks (mean = 4.4 Mb) that contain many genes (mean = 79). For some, uninformative mapping or large gene lists preclude disease-gene identification by Sanger sequencing. Seven such conditions were selected for exome sequencing at the Broad Institute; all had been previously mapped at the CSC using low-density SNP microarrays coupled with autozygosity and linkage analyses. Using between 1 and 5 patient samples per disorder, we identified sequence variants in the known disease-causing genes SLC6A3 and FLVCR1, and present evidence to strongly support the pathogenicity of variants identified in TUBGCP6, BRAT1, SNIP1, CRADD, and HARS. Our results reveal the power of coupling new genotyping technologies to population-specific genetic knowledge and robust clinical data.

    The functional spectrum of low-frequency coding variation

    Background: Rare coding variants constitute an important class of human genetic variation, but are underrepresented in current databases that are based on small population samples. Recent studies show that variants altering amino acid sequence and protein function are enriched at low variant allele frequency, 2 to 5%, but because of insufficient sample size it is not clear if the same trend holds for rare variants below 1% allele frequency. Results: The 1000 Genomes Exon Pilot Project has collected deep-coverage exon-capture data in roughly 1,000 human genes, for nearly 700 samples. Although medical whole-exome projects are currently afoot, this is still the deepest reported sampling of a large number of human genes with next-generation technologies. According to the goals of the 1000 Genomes Project, we created effective informatics pipelines to process and analyze the data, and discovered 12,758 exonic SNPs, 70% of them novel, and 74% below 1% allele frequency in the seven population samples we examined. Our analysis confirms that coding variants below 1% allele frequency show increased population-specificity and are enriched for functional variants. Conclusions: This study represents a large step toward detecting and interpreting low frequency coding variation, clearly lays out technical steps for effective analysis of DNA capture data, and articulates functional and population properties of this important class of genetic variation.

    Mutations causing medullary cystic kidney disease type 1 (MCKD1) lie in a large VNTR in MUC1 missed by massively parallel sequencing

    While genetic lesions responsible for some Mendelian disorders can be rapidly discovered through massively parallel sequencing (MPS) of whole genomes or exomes, not all diseases readily yield to such efforts. We describe the illustrative case of the simple Mendelian disorder medullary cystic kidney disease type 1 (MCKD1), mapped more than a decade ago to a 2-Mb region on chromosome 1. Ultimately, only by cloning, capillary sequencing, and de novo assembly did we find that each of six MCKD1 families harbors an equivalent, but apparently independently arising, mutation in sequence dramatically underrepresented in MPS data: the insertion of a single C in one copy (but a different copy in each family) of the repeat unit comprising the extremely long (~1.5-5 kb), GC-rich (>80%), coding VNTR in the mucin 1 gene. The results provide a cautionary tale about the challenges in identifying genes responsible for Mendelian, let alone more complex, disorders through MPS.