9 research outputs found

    Comparison of in silico strategies to prioritize rare genomic variants impacting RNA splicing for the diagnosis of genomic disorders

    From Springer Nature via Jisc Publications Router. History: received 2021-03-09, accepted 2021-09-13, registration 2021-10-01, online 2021-10-18, pub-electronic 2021-10-18, collection 2021-12. Publication status: Published. Funder: Wellcome Trust; Grant(s): RP-2016-07-011, 200990/Z/16/Z. Funder: Health Education England.
    Abstract: The development of computational methods to assess pathogenicity of pre-messenger RNA splicing variants is critical for diagnosis of human disease. We assessed the capability of eight algorithms, and a consensus approach, to prioritize 249 variants of uncertain significance (VUSs) that underwent splicing functional analyses. The capability of algorithms to differentiate VUSs away from the immediate splice site as being ‘pathogenic’ or ‘benign’ is likely to have substantial impact on diagnostic testing. We show that SpliceAI is the best single strategy in this regard, but that combined usage of tools using a weighted approach can increase accuracy further. We incorporated prioritization strategies alongside diagnostic testing for rare disorders. We show that 15% of 2783 referred individuals carry rare variants expected to impact splicing that were not initially identified as ‘pathogenic’ or ‘likely pathogenic’; one in five of these cases could lead to new or refined diagnoses.

    Machine Learning Approaches for the Prioritization of Genomic Variants Impacting Pre-mRNA Splicing

    Defects in pre-mRNA splicing are frequently a cause of Mendelian disease. Despite the advent of next-generation sequencing, allowing a deeper insight into a patient’s variant landscape, the ability to characterize variants causing splicing defects has not progressed with the same speed. To address this, recent years have seen a sharp spike in the number of splice prediction tools leveraging machine learning approaches, leaving clinical geneticists with a plethora of choices for in silico analysis. In this review, some basic principles of machine learning are introduced in the context of genomics and splicing analysis. A critical comparative approach is then used to describe seven recent machine learning-based splice prediction tools, revealing highly diverse approaches and common caveats. We find that, although great progress has been made in producing specific and sensitive tools, there is still much scope for personalized approaches to prediction of variant impact on splicing. Such approaches may increase diagnostic yields and underpin improvements to patient care

    Modelling the developmental spliceosomal craniofacial disorder Burn-McKeown syndrome using induced pluripotent stem cells

    The craniofacial developmental disorder Burn-McKeown Syndrome (BMKS) is caused by biallelic variants in the pre-messenger RNA splicing factor gene TXNL4A/DIB1. The majority of affected individuals with BMKS have a 34 base pair deletion in the promoter region of one allele of TXNL4A combined with a loss-of-function variant on the other allele, resulting in reduced TXNL4A expression. However, it is unclear how reduced expression of this ubiquitously expressed spliceosome protein results in craniofacial defects during development. Here we reprogrammed peripheral mononuclear blood cells from a BMKS patient and her unaffected mother into induced pluripotent stem cells (iPSCs) and differentiated the iPSCs into induced neural crest cells (iNCCs), the key cell type required for correct craniofacial development. BMKS patient-derived iPSCs proliferated more slowly than both mother- and unrelated control-derived iPSCs, and RNA-Seq analysis revealed significant differences in gene expression and alternative splicing. Patient iPSCs displayed defective differentiation into iNCCs compared to maternal and unrelated control iPSCs, in particular a delay in undergoing an epithelial-to-mesenchymal transition (EMT). RNA-Seq analysis of differentiated iNCCs revealed widespread gene expression changes and mis-splicing in genes relevant to craniofacial and embryonic development that highlight a dampened response to WNT signalling, the key pathway activated during iNCC differentiation. Furthermore, we identified the mis-splicing of TCF7L2 exon 4, a key gene in the WNT pathway, as a potential cause of the downregulated WNT response in patient cells. Additionally, mis-spliced genes shared common sequence properties such as length, branch point to 3' splice site (BPS-3'SS) distance and splice site strengths, suggesting that splicing of particular subsets of genes is particularly sensitive to changes in TXNL4A expression. Together, these data provide the first insight into how reduced TXNL4A expression in BMKS patients might compromise splicing and NCC function, resulting in defective craniofacial development in the embryo.

    Re-evaluation of Missense Variant Classifications in NF2

    Missense variants in the NF2 gene result in variable NF2 disease presentation. Clinical classification of missense variants often represents a challenge, due to lack of evidence for pathogenicity and function. This study provides a summary of NF2 missense variants, with variant classifications based on currently available evidence. NF2 missense variants were collated from pathology‐associated databases and existing literature. Association for Clinical Genomic Sciences Best Practice Guidelines (2020) were followed in the application of evidence for variant interpretation and classification. The majority of NF2 missense variants remain classified as variants of uncertain significance. However, NF2 missense variants identified in gnomAD occurred at a consistent rate across the gene, while variants compiled from pathology‐associated databases displayed differing rates of variation by exon of NF2. The highest rate of NF2 disease‐associated variants was observed in exon 7, while lower rates were observed toward the C‐terminus of the NF2 protein, merlin. Further phenotypic information associated with variants, alongside variant‐specific functional analysis, is necessary for more definitive variant interpretation. Our data identified differences in frequency of NF2 missense variants by exon between gnomAD population data and NF2 disease‐associated variants, suggesting a potential genotype‐phenotype correlation; further work is necessary to substantiate this

    Utility of polygenic risk scores in UK cancer screening: a modelling analysis

    Funding: This work was funded by a Wellcome Trust Clinical Research Fellowship. CH is supported by a Wellcome Trust Clinical Research Training Fellowship (203924/Z/16/Z). RSH acknowledges grant support from Cancer Research UK (C1298/A8362) and the Wellcome Trust (214388). AS is in receipt of a National Institute for Health Research (NIHR) Academic Clinical Lectureship, funding from the Royal Marsden Biomedical Research Centre, and is recipient of the Whitney-Wood Scholarship from the Royal College of Physicians. CT, CFR, HH, KS, RW, and BT acknowledge grant support from Cancer Research UK (C8620/A8372). MEJ receives funding from Breast Cancer Now.
    Abstract: It is proposed that, through restriction to individuals delineated as high risk, polygenic risk scores (PRSs) might enable more efficient targeting of existing cancer screening programmes and enable extension into new age ranges and disease types. To address this proposition, we present an overview of the performance of PRS tools (ie, models and sets of single nucleotide polymorphisms) alongside harms and benefits of PRS-stratified cancer screening for eight example cancers (breast, prostate, colorectal, pancreas, ovary, kidney, lung, and testicular cancer). For this modelling analysis, we used age-stratified cancer incidences for the UK population from the National Cancer Registration Dataset (2016-18) and published estimates of the area under the receiver operating characteristic curve for current, future, and optimised PRS for each of the eight cancer types. For each of five PRS-defined high-risk quantiles (ie, the top 50%, 20%, 10%, 5%, and 1%) and according to each of the three PRS tools (ie, current, future, and optimised) for the eight cancers, we calculated the relative proportion of cancers arising, the odds ratios of a cancer arising compared with the UK population average, and the lifetime cancer risk. We examined maximal attainable rates of cancer detection by age stratum from combining PRS-based stratification with cancer screening tools and modelled the maximal impact on cancer-specific survival of hypothetical new UK programmes of PRS-stratified screening. The PRS-defined high-risk quintile (20%) of the population was estimated to capture 37% of breast cancer cases, 46% of prostate cancer cases, 34% of colorectal cancer cases, 29% of pancreatic cancer cases, 26% of ovarian cancer cases, 22% of renal cancer cases, 26% of lung cancer cases, and 47% of testicular cancer cases. Extending UK screening programmes to a PRS-defined high-risk quintile including people aged 40-49 years for breast cancer, 50-59 years for colorectal cancer, and 60-69 years for prostate cancer has the potential to avert, respectively, a maximum of 102, 188, and 158 deaths annually. Unstratified screening of the full population aged 48-49 years for breast cancer, 58-59 years for colorectal cancer, and 68-69 years for prostate cancer would use equivalent resources and avert, respectively, an estimated maximum of 80, 155, and 95 deaths annually. These maximal modelled numbers will be substantially attenuated by incomplete population uptake of PRS profiling and cancer screening, interval cancers, non-European ancestry, and other factors. Under favourable assumptions, our modelling suggests modest potential efficiency gain in cancer case detection and deaths averted for hypothetical new PRS-stratified screening programmes for breast, prostate, and colorectal cancer. Restriction of screening to high-risk quantiles means many or most incident cancers will arise in those assigned as being low-risk. To quantify real-world clinical impact, costs, and harms, UK-specific cluster-randomised trials are required.
    Publisher PDF. Peer reviewed.
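As a rough illustration of how an AUC can translate into the share of cases falling in a PRS-defined high-risk group, the sketch below assumes an equal-variance binormal model (PRS normally distributed in cases and in non-cases, with the same variance) and a cancer rare enough that the population threshold can be approximated from the non-case distribution. These assumptions and the function name `cases_captured` are illustrative only; the abstract does not describe the study's exact calculation.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def cases_captured(auc: float, top_fraction: float) -> float:
    """Approximate fraction of cases in the top `top_fraction` of PRS.

    Assumptions (not from the source): equal-variance binormal PRS
    distributions, and a rare disease so that the population quantile
    is approximated by the non-case quantile.
    """
    if not 0.5 <= auc < 1.0:
        raise ValueError("auc must be in [0.5, 1)")
    if not 0.0 < top_fraction < 1.0:
        raise ValueError("top_fraction must be in (0, 1)")
    # Under the binormal model, AUC = Phi(d / sqrt(2)), where d is the
    # separation between case and non-case means in sd units.
    d = sqrt(2.0) * STD_NORMAL.inv_cdf(auc)
    # PRS threshold (in non-case sd units) that defines the top fraction.
    threshold = STD_NORMAL.inv_cdf(1.0 - top_fraction)
    # Cases are shifted upwards by d, so more of them exceed the threshold.
    return 1.0 - STD_NORMAL.cdf(threshold - d)


if __name__ == "__main__":
    for auc in (0.55, 0.65, 0.75):
        print(auc, round(cases_captured(auc, 0.20), 3))
```

Under these assumptions an AUC of 0.5 returns the top fraction itself (no enrichment), and the captured share rises as the AUC increases.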

    MRSD: a quantitative approach for assessing suitability of RNA-seq in the investigation of mis-splicing in Mendelian disease

    Background: RNA-sequencing of patient biosamples is a promising approach to delineate the impact of genomic variants on splicing, but variable gene expression between tissues complicates selection of appropriate tissues. Relative expression level is often used as a metric to predict RNA-sequencing utility. Here, we describe a gene- and tissue-specific metric to inform the feasibility of RNA-sequencing, overcoming some issues with using expression values alone.
    Results: We derive a novel metric, Minimum Required Sequencing Depth (MRSD), for all genes across three human biosamples (whole blood, lymphoblastoid cell lines (LCLs) and skeletal muscle). MRSD estimates the depth of sequencing required from RNA-sequencing to achieve user-specified sequencing coverage of a gene, transcript or group of genes of interest. MRSD predicts levels of splice junction coverage with high precision (90.1-98.2%) and overcomes transcript region-specific sequencing biases. Applying MRSD scoring to established disease gene panels shows that LCLs are the optimum source of RNA, of the three investigated biosamples, for 69.3% of gene panels. Our approach demonstrates that up to 59.4% of variants of uncertain significance in ClinVar predicted to impact splicing could be functionally assayed by RNA-sequencing in at least one of the investigated biosamples.
    Conclusions: We demonstrate the power of MRSD as a metric to inform choice of appropriate biosamples for the functional assessment of splicing aberrations. We apply MRSD in the context of Mendelian genetic disorders and illustrate its benefits over expression-based approaches. We anticipate that the integration of MRSD into clinical pipelines will improve variant interpretation and, ultimately, diagnostic yield.
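The general idea of converting observed splice junction coverage into a required sequencing depth can be sketched with simple linear scaling. This is not the published MRSD algorithm: the function name, its parameters, and the assumption that junction read support scales proportionally with library size are hypothetical and chosen only for illustration.

```python
import math
from typing import Sequence


def required_depth_millions(
    junction_counts: Sequence[int],
    control_depth_millions: float,
    desired_reads: int,
    fraction_of_junctions: float = 0.75,
) -> float:
    """Estimate the depth (millions of reads) needed for a gene.

    junction_counts: reads supporting each splice junction of the gene
        in one control sample (hypothetical input).
    control_depth_millions: sequencing depth of that control sample.
    desired_reads: read support wanted per junction.
    fraction_of_junctions: share of junctions that must reach it.

    Assumes read support scales linearly with depth (a simplification).
    """
    if not junction_counts:
        raise ValueError("junction_counts must not be empty")
    if not 0.0 < fraction_of_junctions <= 1.0:
        raise ValueError("fraction_of_junctions must be in (0, 1]")
    # Depth at which each junction alone would reach desired_reads;
    # junctions with no reads can never reach it under linear scaling.
    per_junction = sorted(
        desired_reads * control_depth_millions / count if count > 0 else math.inf
        for count in junction_counts
    )
    # Smallest depth at which the requested share of junctions is covered.
    n_needed = math.ceil(fraction_of_junctions * len(per_junction))
    return per_junction[n_needed - 1]


if __name__ == "__main__":
    print(required_depth_millions([10, 20, 40, 0], 50.0, 8))
```

A result of `math.inf` signals that the requested share of junctions cannot be covered in that biosample at any depth under this simplified model, which loosely mirrors the motivation for comparing biosamples.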