In the past several years, numerous human genetic studies have systematically evaluated the contribution of genetic polymorphisms to various complex diseases and have enabled the development of multiple treatment strategies, particularly pharmaceutical therapies. Genotype imputation has been a key step in such studies: it increases the power of gene mapping analyses, facilitates harmonization of results across studies, and accelerates fine-mapping efforts. Imputation requires access to a reference panel of densely sequenced genomes and is computationally intensive, even with modern high-performance computing. Furthermore, reference panels are often subject to data privacy restrictions that prevent users from accessing the data directly. The goal of this dissertation is to design novel strategies to address these challenges for the next generation of imputation methods.
In the first project, I describe our efforts to create a reference panel of ~32,000 individuals with ~40M variants by combining genetic information obtained across 20 whole genome sequencing studies (Haplotype Reference Consortium). In the second project, I describe a novel idea called ‘state space reduction’ that reduces the computational requirements of genotype imputation by orders of magnitude without any loss of accuracy (minimac3). I also present a web-based platform for imputation that greatly improves user experience and productivity. In the third project, I extend the idea of state space reduction by implementing a more complex version of the strategy that produces additional cost savings (minimac4). In the fourth project, I introduce the idea of meta-imputation: a novel approach that integrates imputed data from multiple reference panels at overlapping sites without interfering with the imputation algorithm (MetaMinimac).
In summary, the purpose of this dissertation research is to develop statistical methods and computational tools that will benefit other researchers in the next generation of human gene mapping studies. These imputation tools will detect rare variants with higher accuracy, consequently increasing the power of association studies.
PhD, Biostatistics, University of Michigan, Horace H. Rackham School of Graduate Studies
https://deepblue.lib.umich.edu/bitstream/2027.42/138466/1/sayantan_1.pd