Characterising the dynamics of repeat expansion in Huntington’s disease using single-molecule long-read DNA sequencing

Abstract

Huntington’s disease (HD) is a fatal neurodegenerative disease caused by the expansion of the CAG repeat in the huntingtin gene (HTT). The length of the CAG repeat is inversely correlated with the age at disease onset. However, onset varies considerably between individuals with the same repeat length, and other genetic variants have been identified as modifiers of age at onset of HD. These include SNPs in the vicinity of FAN1, a nuclease involved in DNA repair, and changes to the sequence in and around the CAG repeat itself. Expansion of the HTT CAG tract from the inherited length is seen in both germline and somatic cells in HD. Striatal projection neurons exhibit the most somatic expansion and are also the cell type most susceptible to degeneration. Repeat expansion is recapitulated in a neuronal cell model derived from an individual with juvenile HD and 109 CAGs. However, traditional methods of quantifying the repeat have limited accuracy at this size and provide no information about the sequence of the repeat. Reads from short-read next-generation sequencing (NGS) platforms do not span repeats of this length and thus cannot provide the repeat size. Long-read NGS platforms can generate highly accurate reads of more than 20 kilobases, which is long enough to span the repeats found in these models. In the first part of this thesis, I assess the utility of long-read PacBio sequencing in measuring the size and instability of the HTT CAG repeat in samples with various repeat lengths. In the second part, I assess its utility in measuring the size, instability, and sequence of the HTT CAG repeat in a neuronal cell model of HD, and I investigate the effect of FAN1 genotype and cell maturity on repeat length, instability, and sequence variation.
