48 research outputs found

    Characteristics of de novo structural changes in the human genome

    Small insertions and deletions (indels) and large structural variations (SVs) are major contributors to human genetic diversity and disease. However, mutation rates and characteristics of de novo indels and SVs in the general population have remained largely unexplored. We report 332 validated de novo structural changes identified in whole genomes of 250 families, including complex indels, retrotransposon insertions, and interchromosomal events. These data indicate a mutation rate of 2.94 indels (1–20 bp) and 0.16 SVs (>20 bp) per generation. De novo structural changes affect on average 4.1 kbp of genomic sequence and 29 coding bases per generation, which is 91 and 52 times more nucleotides than de novo substitutions, respectively. This contrasts with the equal genomic footprint of inherited SVs and substitutions. An excess of structural changes originated on paternal haplotypes. Additionally, we observed a nonuniform distribution of de novo SVs across offspring. These results reveal the importance of different mutational mechanisms to changes in human genome structure across generations

    Chromothripsis in healthy individuals affects multiple protein-coding genes and can result in severe congenital abnormalities in offspring

    Chromothripsis represents an extreme class of complex chromosome rearrangements (CCRs) with major effects on chromosomal architecture. Although recent studies have associated chromothripsis with congenital abnormalities, the incidence and pathogenic effects of this phenomenon require further investigation. Here, we analyzed the genomes of three families in which chromothripsis rearrangements were transmitted from a mother to her child. The chromothripsis in the mothers resulted in completely balanced rearrangements involving 8-23 breakpoint junctions across three to five chromosomes. Two mothers did not show any phenotypic abnormalities, although 3-13 protein-coding genes were affected by breakpoints. Unbalanced but stable transmission of a subset of the derivative chromosomes caused apparently de novo complex copy-number changes in two children. This resulted in gene-dosage changes, which are probably responsible for the severe congenital phenotypes of these two children. In contrast, the third child, who has a severe congenital disease, harbored all three chromothripsis chromosomes from his healthy mother, but one of the chromosomes acquired de novo rearrangements leading to copy-number changes. These results show that the human genome can tolerate extreme reshuffling of chromosomal architecture, including breakage of multiple protein-coding genes, without noticeable phenotypic effects. The presence of chromothripsis in healthy individuals affects reproduction and is expected to substantially increase the risk of miscarriages, abortions, and severe congenital disease. © 2015 The American Society of Human Genetics

    Partner-independent fusion gene detection by multiplexed CRISPR/Cas9 enrichment and long-read Nanopore sequencing

    Fusion genes are hallmarks of various cancer types and important determinants for diagnosis, prognosis and treatment possibilities. The promiscuity of fusion genes with respect to partner choice and exact breakpoint-positions restricts their detection in the diagnostic setting, even for known and recurrent fusion gene configurations. To accurately identify these gene fusions in an unbiased manner, we developed FUDGE: a FUsion gene Detection assay from Gene Enrichment. FUDGE couples target-selected and strand-specific CRISPR/Cas9 activity for enrichment and detection of fusion gene drivers (e.g. BRAF, EWSR1, KMT2A/MLL) - without prior knowledge of fusion partner or breakpoint-location - to long-read Nanopore sequencing. FUDGE encompasses a dedicated bioinformatics approach (NanoFG) to detect fusion genes from Nanopore sequencing data. Our strategy is flexible with respect to target choice and enables multiplexed enrichment for simultaneous analysis of several genes in multiple samples in a single sequencing run. We observe on average a 508 fold on-target enrichment and identify fusion breakpoints at nucleotide resolution - all within two days. We demonstrate that FUDGE effectively identifies fusion genes in cancer cell lines, tumor samples and on whole genome amplified DNA irrespective of partner gene or breakpoint-position in 100% of cases. Furthermore, we show that FUDGE is superior to routine diagnostic methods for fusion gene detection. In summary, we have developed a rapid and versatile fusion gene detection assay, providing an unparalleled opportunity for pan-cancer detection of fusion genes in routine diagnostics

    Partner independent fusion gene detection by multiplexed CRISPR-Cas9 enrichment and long read nanopore sequencing

    Fusion genes are hallmarks of various cancer types and important determinants for diagnosis, prognosis and treatment. Fusion gene partner choice and breakpoint-position promiscuity restricts diagnostic detection, even for known and recurrent configurations. Here, we develop FUDGE (FUsion Detection from Gene Enrichment) to accurately and impartially identify fusions. FUDGE couples target-selected and strand-specific CRISPR-Cas9 activity for fusion gene driver enrichment - without prior knowledge of fusion partner or breakpoint-location - to long read nanopore sequencing with the bioinformatics pipeline NanoFG. FUDGE has flexible target-loci choices and enables multiplexed enrichment for simultaneous analysis of several genes in multiple samples in one sequencing run. We observe on-average 665 fold breakpoint-site enrichment and identify nucleotide resolution fusion breakpoints within 2 days. The assay identifies cancer cell line and tumor sample fusions irrespective of partner gene or breakpoint-position. FUDGE is a rapid and versatile fusion detection assay for diagnostic pan-cancer fusion detection

    A framework for the detection of de novo mutations in family-based sequencing data

    Germline mutation detection from human DNA sequence data is challenging due to the rarity of such events relative to the intrinsic error rates of sequencing technologies and the uneven coverage across the genome. We developed PhaseByTransmission (PBT) to identify de novo single nucleotide variants and short insertions and deletions (indels) from sequence data collected in parent-offspring trios. We compute the joint probability of the data given the genotype likelihoods in the individual family members, the known familial relationships and a prior probability for the mutation rate. Candidate de novo mutations (DNMs) are reported along with their posterior probability, providing a systematic way to prioritize them for validation. Our tool is integrated in the Genome Analysis Toolkit and can be used together with the ReadBackedPhasing module to infer the parental origin of DNMs based on phase-informative reads. Using simulated data, we show that PBT outperforms existing tools, especially in low coverage data and on the X chromosome. We further show that PBT displays high validation rates on empirical parent-offspring sequencing data for whole-exome data from 104 trios and X-chromosome data from 249 parent-offspring families. Finally, we demonstrate an association between father's age at conception and the number of DNMs in female offspring's X chromosome, consistent with previous literature reports
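The joint-probability idea in the abstract above can be illustrated with a minimal sketch. This is not PhaseByTransmission's actual implementation: the function names, the flat prior on parental genotypes, the symmetric per-allele mutation model, and the default `mu` are all assumptions made here for illustration. Given per-sample genotype likelihoods for a biallelic site, it sums over all trio genotype configurations and reports the posterior mass on configurations that are Mendelian-inconsistent.

```python
from itertools import product

GENOTYPES = (0, 1, 2)  # alternate-allele count: hom-ref, het, hom-alt


def child_given_parents(child, mother, father, mu):
    """P(child genotype | parental genotypes) with per-allele mutation prob mu.

    Hypothetical simplified model: each parent transmits one allele at random,
    and the transmitted allele flips (ref<->alt) with probability mu.
    """
    def alt_prob(parent):
        inherited = parent / 2.0  # chance the transmitted allele is alt
        return inherited * (1.0 - mu) + (1.0 - inherited) * mu

    m, f = alt_prob(mother), alt_prob(father)
    return ((1 - m) * (1 - f), m * (1 - f) + (1 - m) * f, m * f)[child]


def de_novo_posterior(lik_mother, lik_father, lik_child, mu=1e-8):
    """Posterior probability that the trio carries a de novo mutation.

    Each lik_* is a sequence of three genotype likelihoods P(reads | genotype).
    Parental genotypes get a flat prior (an assumption of this sketch).
    """
    de_novo = total = 0.0
    for gm, gf, gc in product(GENOTYPES, repeat=3):
        weight = (lik_mother[gm] * lik_father[gf] * lik_child[gc]
                  * child_given_parents(gc, gm, gf, mu))
        total += weight
        # A configuration is de novo if it is impossible without mutation.
        if child_given_parents(gc, gm, gf, 0.0) == 0.0:
            de_novo += weight
    return de_novo / total
```

With confidently hom-ref parents and a confidently heterozygous child, the posterior approaches 1; when the parental likelihoods leave room for an inherited allele, the small mutation prior pulls the posterior toward 0, which is why ranking candidates by posterior helps prioritize them for validation.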

    Confirmation of a metastasis-specific microRNA signature in primary colon cancer

    The identification of patients with high-risk stage II colon cancer who may benefit from adjuvant therapy may allow the clinical approach to be tailored for these patients based on an understanding of tumour biology. MicroRNAs have been proposed as markers of the prognosis or treatment response in colorectal cancer. Recently, a 2-microRNA signature (let-7i and miR-10b) was proposed to identify colorectal cancer patients at risk of developing distant metastasis. We assessed the prognostic value of this signature and additional candidate microRNAs in an independent, clinically well-defined, prospectively collected cohort of primary colon cancer patients including stage I-II colon cancer without and stage III colon cancer with adjuvant treatment. The 2-microRNA signature specifically predicted hepatic recurrence in the stage I-II group, but not the overall ability to develop distant metastasis. The addition of miR-30b to the 2-microRNA signature allowed the prediction of both distant metastasis and hepatic recurrence in patients with stage I-II colon cancer who did not receive adjuvant chemotherapy. Available gene expression data allowed us to associate miR-30b expression with axon guidance and let-7i expression with cell adhesion, migration, and motility.

    A systematic analysis of oncogenic gene fusions in primary colon cancer

    Genomic rearrangements that give rise to oncogenic gene fusions can offer actionable targets for cancer therapy. Here we present a systematic analysis of oncogenic gene fusions among a clinically well-characterized, prospectively collected set of 278 primary colon cancers spanning diverse tumor stages and clinical outcomes. Gene fusions and somatic genetic variations were identified in fresh frozen clinical specimens by Illumina RNA-sequencing, the STAR fusion gene detection pipeline, and GATK RNA-seq variant calling. We considered gene fusions to be pathogenically relevant when recurrent, producing divergent gene expression (outlier analysis), or as functionally important (e.g., kinase fusions). Overall, 2.5% of all specimens were defined as harboring a relevant gene fusion (kinase fusions 1.8%). Novel configurations of BRAF, NTRK3, and RET gene fusions resulting from chromosomal translocations were identified. An R-spondin fusion was found in only one tumor (0.35%), much less than an earlier reported frequency of 10% in colorectal cancers. We also found a novel fusion involving USP9X-ERAS formed by chromothripsis and leading to high expression of ERAS, a constitutively active RAS protein normally expressed only in embryonic stem cells. This USP9X–ERAS fusion appeared highly oncogenic on the basis of its ability to activate AKT signaling. Oncogenic fusions were identified only in lymph node–negative tumors that lacked BRAF or KRAS mutations. In summary, we identified several novel oncogenic gene fusions in colorectal cancer that may drive malignant development and offer new targets for personalized therapy

    A high-quality human reference panel reveals the complexity and distribution of genomic structural variants

    Structural variation (SV) represents a major source of differences between individual human genomes and has been linked to disease phenotypes. However, most studies neither provide a global view of the full spectrum of these variants nor integrate them into reference panels of genetic variation. Here, we analyse whole genome sequencing data of 769 individuals from 250 Dutch families, and provide a haplotype-resolved map of 1.9 million genome variants across 9 different variant classes, including novel forms of complex indels, and retrotransposition-mediated insertions of mobile elements and processed RNAs. A large proportion are previously underreported variants sized between 21 and 100 bp. We detect 4 megabases of novel sequence, encoding 11 new transcripts. Finally, we show 191 known, trait-associated SNPs to be in strong linkage disequilibrium with SVs and demonstrate that our panel facilitates accurate imputation of SVs in unrelated individuals.

    WGS-based telomere length analysis in Dutch family trios implicates stronger maternal inheritance and a role for RRM1 gene

    Telomere length (TL) regulation is an important factor in ageing, reproduction and cancer development. Genetic, hereditary and environmental factors regulating TL are currently widely investigated; however, their relative contribution to TL variability is still understudied. We have used whole genome sequencing data of 250 family trios from the Genome of the Netherlands project to perform computational measurement of TL and a series of regression and genome-wide association analyses to reveal TL inheritance patterns and associated genetic factors. Our results confirm that TL is a largely heritable trait, with the mother’s TL and, to a lesser extent, the father’s TL having the strongest influence on the offspring. In this cohort, the mother’s, but not the father’s, age at conception was positively linked to offspring TL. Age-related TL attrition of 40 bp/year had a relatively small influence on TL variability. Finally, we have identified TL-associated variations in ribonucleotide reductase catalytic subunit M1 (RRM1 gene), which is known to regulate telomere maintenance in yeast. We also highlight the importance of a multivariate approach and the limitations of existing tools for the analysis of TL as a polygenic heritable quantitative trait.