Methods to identify novel disease genes and uplift diagnosis rates in rare diseases

Abstract

Since the advent of next generation sequencing technologies, the ability to diagnose rare diseases has improved considerably. Yet despite these advances, most rare diseases remain undiagnosed. In part, this reflects the need for more efficient methods to interpret genomic sequencing data, as well as the need to establish the phenotypic consequences of variants in genes not yet associated with disease. This thesis describes the development and testing of novel methods to improve diagnostic efficiency in patients with rare diseases, together with the discovery of novel disease-gene relationships. It describes the DeNovoLOEUF method, which identifies putatively pathogenic de novo loss-of-function variants in both known and putative disease genes. It further describes the gene-agnostic HiPPo protocol, which prioritises variants identified in sequencing data. Finally, it discusses the application of the GenePy dimensionality reduction algorithm to identify missed biallelic diagnoses. DeNovoLOEUF was first applied in established disease genes to ~14,000 trios recruited to the 100,000 Genomes Project (100KGP). In total, 98% of all variants identified were proven diagnostic, including 39 new diagnoses missed by 100KGP. DeNovoLOEUF was then applied to genes not yet associated with disease in the same 100KGP cohort. A total of 18 putative disease genes were identified, of which 12 (67%) have since been functionally validated. For the remaining 6 genes, case series are underway, and two with supportive functional evidence are presented in this thesis: DDX17 (11 patients with de novo monoallelic variants and neurodevelopmental phenotypes, named Seaby-Ennis Syndrome) and HDLBP (7 patients with de novo monoallelic variants and neurodevelopmental phenotypes).
The HiPPo protocol was demonstrated to be an effective and efficient alternative method for interpreting genomic data, capable of outperforming the strategies used by the NHS Genomic Medicine Service (GMS). The GMS uses gene panels to analyse sequence data, whereas HiPPo is a panel-agnostic method that prioritises variants using in silico metrics. HiPPo had a superior diagnostic rate per variant assessed compared with the GMS (20% vs 3%, respectively). HiPPo also identified all pathogenic variants reported by the GMS, plus one additional pathogenic variant that had been missed. Data presented in this thesis demonstrate how novel methods applied to genomic sequencing data can efficiently enhance diagnosis rates for patients with rare diseases and identify new disease-gene relationships. In turn, these can improve patient outcomes by advancing mechanistic understanding of disease, identifying novel therapeutic targets, and tailoring treatments to specific diseases and individuals. To fully realise the potential of these methods, additional research is needed. Future work will use artificial intelligence to refine methods and models for improved clinical outcomes.
