Genome instability, defined as an increased tendency toward genome alteration, underlies many human diseases and conditions. It is a hallmark of human cancer and plays a role in aging and in the development and function of the nervous system. Genome instability can manifest in several ways, including gaps and breaks at Common Fragile Sites (CFSs) and Copy Number Variants (CNVs). CFSs are sites on human metaphase chromosomes prone to forming gaps or breaks following replication stress. CNVs are submicroscopic genomic alterations that change the copy number of the affected region, also often following replication stress. The genome regions most prone to replication stress-induced CNVs, called “hotspots,” coincide with CFSs.
In spite of their implications for human health, mechanisms leading to instability at CFSs and CNV hotspots are unclear. CFSs/CNV hotspots are AT-rich and late replicating, but those properties are not sufficient for the sites’ instability. DNA sequence at CFSs/CNV hotspots is shared among all cells, but instability is cell line-specific. We also found that while about 20% of the genome replicates late, hotspots only comprise 0.4% of the genome. Hence, instability at hotspots is determined by properties that vary between different cell lines and genomic regions.
Transcription is one such property. We found that CFSs/CNV hotspots are enriched in large (>500 kb), transcribed genes and that, given a cell line’s transcription profile, we can predict where CFSs/CNV hotspots will be in that cell line. I further showed that abrogating expression of a large hotspot gene reduces the number of aphidicolin (APH)-induced CNVs. These results established transcription of large genes as a determining factor for instability at hotspots.
We propose that a conflict between transcription of large genes and DNA replication drives hotspot instability. I tested a model in which R-loops (RNA/DNA hybrids) physically interfere with the replication fork, causing the fork to stall and initiate genomic alteration. R-loop manipulation by altering expression of RNase H1 had no significant effect on the frequency of APH-induced instability at hotspots, implying that R-loops do not play a central role in driving APH-induced CNVs. This contrasts with a prior study showing that R-loop manipulation changes CFS instability. However, R-loop accumulation changes the location of the breakpoints of these CNVs and changes the frequency of spontaneous CNVs, suggesting that R-loops may still play a role in both APH-induced and spontaneous CNV formation.
In sum, the studies in this dissertation reveal that transcription of unusually large genes plays a pivotal role in instability at CFSs/CNV hotspots during replication stress, but not via an R-loop-associated mechanism. Nonetheless, R-loops threaten genome stability and affect CNV formation outside of hotspots. Future studies are necessary to explore other transcription-replication conflict models at CFSs/CNV hotspots and to further characterize R-loop-induced CNVs.
PhD, Human Genetics
University of Michigan, Horace H. Rackham School of Graduate Studies
https://deepblue.lib.umich.edu/bitstream/2027.42/147729/1/sohae_1.pd