Inferring Biological Population Membership: An Exploration of the Continuum of Genetic Relationships

Abstract

To simplify sample collection in genetic studies, researchers use proxies for membership in biological populations. Individuals are therefore often assigned on the basis of operational designations: family for close relationships, population membership for more distant relationships, and ancestrally related populations for the most distant. While genetic variation correlates with the operational designation placed on individuals, the designation does not necessarily define the genetic relationship. Genetic relationships exist on a continuum, and mapping the relationships among sample members from the proxy to the genetics is not straightforward. My dissertation examines genetic relationships at two scales: between individuals in a population and between ancestrally related populations. I first develop a method to examine population membership using just two individuals. The homogeneity method is a statistical test of the null hypothesis that two individuals are unrelated members of the same randomly mating population. The test requires that the pair of individuals be genotyped for a battery of genetic markers, but it requires no prior information about the pair or about the populations to which they might belong. Potential applications of this test include 1) identifying population stratification in biomedical samples, 2) solving forensic cases from molecular evidence, 3) managing endangered species, and 4) examining human population history. To examine relationships between populations, I investigate the effect of ancestral population relationships on methods designed to assess population structure. I develop a novel method to simulate genotypes at multiple SNPs for different populations. This method simulates realistic allele frequencies and captures the shared ancestry of populations, so that the user can efficiently choose SNPs under a flexible ascertainment scheme.
I then simulate individuals from populations related by two phylogenetic trees, one more divergent and one less divergent. I analyze the simulated data with GHM (generalized hierarchical modeling) and STRUCTURE (Bayesian model-based clustering) to compare each method's inferred ancestry with the true underlying ancestry. In summary, my dissertation research provides novel quantitative tools and analyses that aid in understanding the genetics of biological populations.

PhD, Human Genetics, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/78959/1/nmscott_1.pd
