Analysis of the Formation, Binding and Processing of Alternative Nucleic Acid Structures in Disease-Associated Repeat Sequences

Abstract

Expansion of gene-specific tandem repeat sequences is the causative mutation in a growing list of more than 40 neurological, neuromuscular and neurodegenerative diseases. The cause of the initial tandem repeat expansion mutations, the molecular mechanisms driving ongoing repeat tract instability and the causes of pathogenesis are not well understood for many of these diseases. The formation of alternative secondary nucleic acid structures in the DNA and RNA of expanded tandem repeat sequences is thought to drive repeat instability and pathogenesis by impairing normal DNA and RNA metabolic processes. My thesis characterizes three such alternative secondary structures that could form in the DNA or RNA of disease-associated tandem repeat sequences and assesses their processing by, and binding to, candidate structure-specific proteins. In DNA, I assessed how the specific configuration of slipped-out 3-way junction structures formed in disease-associated (CAG)*(CTG) tandem repeats can influence binding and cleavage by structure-specific DNA repair proteins in vitro. Whereas previous studies treated slipped-DNA structures as static entities, we show that slipped-DNA junctions exist in multiple interconverting conformations and that the specific conformation of the junction can influence how it is processed. These findings support the idea that junction conformation is an important modifier of disease-associated (CAG)*(CTG) instability. To further extend our understanding of nucleic acid structure in tandem repeat disease, I analyzed the structural and sequence determinants governing RNA:DNA hybrid formation and processing at various trinucleotide repeats. Using a well-established in vitro transcription assay, I demonstrated that stable R-loop formation occurs in all disease-associated trinucleotide repeats.
I also identified novel double-R-loop structures formed when repeats are simultaneously transcribed in both directions, as occurs at many disease-associated trinucleotide repeat-containing genes. I went on to establish a novel R-loop processing assay using a cell-free extract system and demonstrated that R-loops, and particularly double-R-loops, can increase the instability of (CAG)*(CTG) repeats. Finally, I identified extremely stable RNA G-quadruplex structures in the amyotrophic lateral sclerosis and frontotemporal dementia (ALS-FTD)-associated (GGGGCC)n repeat in vitro using RNA oligonucleotide models, and assessed their ability to bind structure-specific RNA-binding proteins. This led to the identification of the essential splicing factor ASF/SF2 as a candidate protein interactor that may have relevance to ALS-FTD. Taken together, my findings demonstrate unusual nucleic acid structure formation by disease-associated tandem repeats in the DNA, in the RNA and, during transcription, when the DNA interacts with the RNA. These secondary structures can be differentially bound and/or processed by structure-specific proteins. Understanding the secondary structures formed by disease-associated repeat sequences and identifying the proteins that interact with them expands the range of potential therapeutics that can be developed to modulate pathogenesis, and also expands our understanding of the normal biological roles of repeat sequences in the genome.

Ph.D.
