Hereditary cancer genes are highly susceptible to splicing mutations
<div><p>Substitutions that disrupt pre-mRNA splicing are a common cause of genetic disease. On average, 13.4% of all hereditary disease alleles are classified as splicing mutations mapping to the canonical 5′ and 3′ splice sites. However, splicing mutations present in exons and deeper intronic positions are vastly underreported. A recent re-analysis of coding mutations in exon 10 of the Lynch Syndrome gene, <i>MLH1</i>, revealed an extremely high rate (77%) of mutations that lead to defective splicing. This finding is confirmed by extending the sampling to five other exons in the <i>MLH1</i> gene. Further analysis suggests a more general phenomenon of defective splicing driving Lynch Syndrome. Of the 36 mutations tested, 11 disrupted splicing. Furthermore, an analysis of past reports suggests that <i>MLH1</i> mutations in canonical splice sites also occupy a much higher fraction (36%) of total mutations than expected. When performing a comprehensive analysis of splicing mutations in human disease genes, we found that three main causal genes of Lynch Syndrome, <i>MLH1</i>, <i>MSH2</i>, and <i>PMS2</i>, belonged to a class of 86 disease genes which are enriched for splicing mutations. Other cancer genes were also enriched in the 86 susceptible genes. The enrichment of splicing mutations in hereditary cancers strongly argues for additional priority in interpreting clinical sequencing data in relation to cancer and splicing.</p></div>
Random forest classification and prediction of SSM-prone genes.
<p><b>A.</b> The order of variable importance by mean decrease in accuracy for SSM-prone genes versus genes with an expected number of SSM. The directions that associate with SSM-prone genes are indicated: positive directions are green and negative directions are red. <b>B.</b> Classification performance of the random forest models and the logistic regression models was calculated as the area under the curve (AUC) in receiver operating characteristic (ROC) analysis. <b>C.</b> Scheme of random forest classification on all genomic genes. <b>D.</b> Average proportion of low frequency ExAC splice-site variants per splice-site in predicted SSM-prone genes (probability: 0.60–0.86) versus genes not predicted to be SSM-prone (<i>P</i> = 6.1043e-18, Mann-Whitney). <b>E.</b> Common variants are depleted from the category of variants that cause loss of splice-site signal at the 5′ splice-site (upper plot). Rare variants are enriched in the range of the splice site signal scores that abolish 5′ splice-site recognition (lower plot).</p>
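As an illustration of the AUC metric named in panel B, the following is a minimal sketch of how an ROC AUC can be computed from classifier scores. It uses the rank-based (Mann-Whitney) formulation of AUC: the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as half. The function name `roc_auc`, the label encoding (1 for SSM-prone, 0 otherwise), and the example scores are assumptions made for illustration; they are not the authors' code or data.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC for binary labels (1 = positive, 0 = negative).

    Hypothetical helper: computes AUC as the probability that a randomly
    chosen positive example scores higher than a randomly chosen negative
    example, counting ties as 0.5.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    # Compare every positive score against every negative score.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up predicted probabilities for four genes (illustrative only).
example_labels = [1, 1, 0, 0]
example_scores = [0.9, 0.4, 0.5, 0.1]
example_auc = roc_auc(example_labels, example_scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the two model families in panel B would be compared.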
Non-uniform distribution of splicing mutations across disease genes.
<p><b>A.</b> SSM versus all exonic mutations in the HGMD with regions of 99.9% confidence interval shown in gray. Genes with more, expected, and less SSM are shown in red (Upper), blue (Expected), and green (Lower), respectively. Locations of <i>MLH1</i>, <i>MSH2</i>, and <i>PMS2</i> are highlighted and labeled. <b>B.</b> Percent ESM of total mutations tested using MaPSy in each category. <b>C.</b> Due to the inability of MaPSy to observe mutant-specific exon skipping events (as a result of the identical flanking exons), ESMs found in <i>MLH1</i>, <i>BRCA1</i>, and <i>OPA1</i> were validated as individual wildtype and mutant minigene constructs. All three mutant constructs showed exon skipping events, which were not observed in wildtype constructs.</p>
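The Upper/Expected/Lower grouping in panel A can be sketched as a test of each gene's SSM count against a background rate. The caption does not state how the 99.9% interval was computed, so the sketch below assumes a normal approximation to the binomial distribution, treats the gene's total as SSM plus exonic mutations, and takes the background rate as a user-supplied parameter. The function name `classify_gene` and the example counts are hypothetical.

```python
from statistics import NormalDist


def classify_gene(ssm, exonic, background_rate, confidence=0.999):
    """Label a gene "Upper", "Expected", or "Lower" by its SSM count.

    Assumption: the SSM count is binomial with n = ssm + exonic and
    p = background_rate, and the interval uses the normal approximation.
    This is an illustrative stand-in, not the method used for the figure.
    """
    total = ssm + exonic
    if total == 0:
        raise ValueError("gene has no mutations to classify")

    # Two-sided critical value, e.g. about 3.29 for a 99.9% interval.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    expected = total * background_rate
    spread = z * (total * background_rate * (1 - background_rate)) ** 0.5

    if ssm > expected + spread:
        return "Upper"
    if ssm < expected - spread:
        return "Lower"
    return "Expected"


# Made-up counts for three hypothetical genes, background rate 13.4%.
upper_example = classify_gene(60, 140, 0.134)
expected_example = classify_gene(27, 173, 0.134)
lower_example = classify_gene(2, 198, 0.134)
```

For genes with few reported mutations the normal approximation is poor, so an exact binomial interval would be the more careful choice in practice.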
Enrichment of cancer genes in SSM-prone genes.
<p><b>A.</b> SSM versus all exonic mutations in the HGMD with regions of 99.9% confidence interval shown in gray. COSMIC cancer genes are highlighted in red. <i>MLH1</i>, <i>BRCA1</i>, <i>BRCA2</i>, and <i>NF1</i> are highlighted and labeled. <b>B-C.</b> Average percent of SSM or ESM in cancer genes versus non-cancer genes reported in HGMD. <b>D.</b> Average HI score of cancer genes in Upper, Expected, and Lower categories of genes.</p>
<i>MLH1</i> ESM affect different stages of spliceosome assembly.
<p>The percentages of mutant mRNA retained in each stage of the assembly relative to wildtype mRNA are shown for all ESM that were identified in <i>MLH1</i> exons 8 and 15. The majority of ESM were blocked in the transition from A to B complex. Two of the ESM (CM082944 and CM04546) in exon 8 also slowed down the final transesterification reactions to yield spliced mRNA and the lariat.</p>
<i>MLH1</i> is frequently disrupted by splicing mutations.
<p><b>A.</b> Disease coding mutations in exons 4, 5, 7, 8 and 15 of <i>MLH1</i> were analyzed with MaPSy. While none of the mutations in exons 4, 5 and 7 (blue bars) were found to disrupt splicing, almost all of the mutations tested in exons 8 and 15 (red bars) significantly altered splicing (100% and 71%, respectively). <b>B.</b> Splicing efficiency of wildtype (blue) and mutant (red) alleles that were tested with MaPSy in exons 8 and 15 of <i>MLH1</i>.</p>