A minimal descriptor of an ancestral recombinations graph
<p>Abstract</p> <p>Background</p> <p>An Ancestral Recombinations Graph (ARG) is a phylogenetic structure that encodes both duplication events, such as mutations, and genetic exchange events, such as recombinations; it thereby captures the (genetic) dynamics of a population evolving over generations.</p> <p>Results</p> <p>In this paper, we identify a structure-preserving and samples-preserving core of an ARG <it>G</it> and call it the minimal descriptor ARG of <it>G</it>. Its structure-preserving characteristic ensures that all the branch lengths of the marginal trees of the minimal descriptor ARG are identical to those of <it>G</it>, and the samples-preserving property asserts that the patterns of genetic variation in the samples of the minimal descriptor ARG are exactly the same as those of <it>G</it>. We also prove that even an unbounded <it>G</it> has a finite minimal descriptor that continues to preserve certain (graph-theoretic) properties of <it>G</it>. For an appropriate class of ARGs, both our estimate (Eqn 8) and empirical observation indicate that the expected reduction in the number of vertices is exponential.</p> <p>Conclusions</p> <p>Based on the definition of this lossless and bounded structure, we derive local properties of the vertices of a minimal descriptor ARG, which lend themselves very naturally to the design of efficient sampling algorithms. We further show that a class of minimal descriptors, that of binary ARGs, models the standard coalescent exactly (Thm 6).</p>
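To make the notion of a marginal tree concrete, here is a minimal Python sketch. It is an illustration only and not the paper's construction: the edge-list encoding, the node names, and the helper `marginal_tree` are all hypothetical, and the code does not compute a minimal descriptor. It shows only how a single recombination node gives different trees on either side of a breakpoint.

```python
# Hypothetical toy ARG (assumption, not taken from the paper).
# Each edge is (child, parent, left, right): the child inherits the
# half-open genomic interval [left, right) from the parent.
EDGES = [
    ("a", "r", 0.0, 1.0),     # sample a descends from recombination node r
    ("r", "x", 0.0, 0.5),     # left of the breakpoint at 0.5, r inherits from x
    ("r", "y", 0.5, 1.0),     # right of the breakpoint, r inherits from y
    ("b", "x", 0.0, 1.0),     # sample b
    ("c", "y", 0.0, 1.0),     # sample c
    ("x", "root", 0.0, 1.0),
    ("y", "root", 0.0, 1.0),
]


def marginal_tree(edges, position):
    """Return {child: parent} for the tree embedded at one genomic position."""
    parent = {}
    for child, par, left, right in edges:
        if left <= position < right:
            if child in parent:
                # A node may have only one parent at any single position.
                raise ValueError(f"node {child!r} has two parents at {position}")
            parent[child] = par
    return parent


if __name__ == "__main__":
    # Sample a groups with b on the left of the breakpoint and with c on the right.
    print(marginal_tree(EDGES, 0.25))
    print(marginal_tree(EDGES, 0.75))
```

In this encoding, a structure-preserving reduction in the abstract's sense would have to leave every such marginal tree, including its branch lengths, unchanged. Branch lengths are omitted here for brevity.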
Population genetics of identity by descent
Recent improvements in high-throughput genotyping and sequencing technologies
have afforded the collection of massive, genome-wide datasets of DNA
information from hundreds of thousands of individuals. These datasets, in turn,
provide unprecedented opportunities to reconstruct the history of human
populations and detect genotype-phenotype associations. Recently developed
computational methods can identify long-range chromosomal segments that are
identical across samples, and have been transmitted from common ancestors that
lived tens to hundreds of generations in the past. These segments reveal
genealogical relationships that are typically unknown to the carrying
individuals. In this work, we demonstrate that such identical-by-descent (IBD)
segments are informative about a number of relevant population genetics
features: they enable the inference of details about past population size
fluctuations and migration events, and they carry the genomic signature of natural
selection. We derive a mathematical model, based on coalescent theory, that
allows for a quantitative description of IBD sharing across purportedly
unrelated individuals, and develop inference procedures for the reconstruction
of recent demographic events, where classical methodologies are statistically
underpowered. We analyze IBD sharing in several contemporary human populations,
including representative communities of the Jewish Diaspora, Kenyan Maasai
samples, and individuals from several Dutch provinces, in all cases retrieving
evidence of fine-scale demographic events from recent history. Finally, we
expand the presented model to describe distributions for those sites in IBD
shared segments that harbor mutation events, showing how these may be used for
the inference of mutation rates in humans and other species.
Comment: Ph.D. thesis
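As a rough illustration of how coalescent theory can yield a quantitative description of IBD sharing, the following Python sketch works through a simplified setting. Everything in it is an assumption made for illustration and is not taken from the thesis: a constant diploid effective size `n_e`, a pairwise coalescence time that is exponential with mean `2 * n_e` generations, and, given a coalescence time `t`, an IBD segment around a site whose length in Morgans is the sum of two independent exponentials with rate `2 * t`. Under those assumptions the probability that a site lies in a segment of length at least `u` has the closed form (1 + 8 N u) / (1 + 4 N u)^2, which the code checks against a numerical integral.

```python
import math


def prob_ibd_closed_form(n_e, u):
    """P(site lies in an IBD segment of length >= u Morgans), constant size n_e.

    Assumed model: T ~ Exp(mean 2*n_e); given T = t, the segment length is
    Gamma(shape=2, rate=2*t), so P(L >= u | t) = (1 + 2*t*u) * exp(-2*t*u).
    """
    x = 4.0 * n_e * u
    return (1.0 + 2.0 * x) / (1.0 + x) ** 2


def prob_ibd_numeric(n_e, u, steps=20000):
    """Same quantity by trapezoidal integration over the coalescence time."""
    a = 1.0 / (2.0 * n_e)      # coalescence rate per generation
    b = 2.0 * u                # recombination rate factor per generation
    upper = 40.0 / (a + b)     # integrand has decayed by about e^-40 here
    h = upper / steps

    def integrand(t):
        return a * math.exp(-a * t) * (1.0 + b * t) * math.exp(-b * t)

    total = 0.5 * (integrand(0.0) + integrand(upper))
    for i in range(1, steps):
        total += integrand(i * h)
    return total * h


if __name__ == "__main__":
    n_e, u = 1000, 0.01        # hypothetical values: 1,000 diploids, 1 cM cutoff
    print(prob_ibd_closed_form(n_e, u), prob_ibd_numeric(n_e, u))
```

The closed form decreases as `n_e` grows, which is the intuition behind using long shared segments to learn about recent population size. Inference under varying size or migration would require a more general model than this sketch.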