3 research outputs found
Decoding coalescent hidden Markov models in linear time
In many areas of computational biology, hidden Markov models (HMMs) have been
used to model local genomic features. In particular, coalescent HMMs have been
used to infer ancient population sizes, migration rates, divergence times, and
other parameters such as mutation and recombination rates. As more loci,
sequences, and hidden states are added to the model, however, the runtime of
coalescent HMMs can quickly become prohibitive. Here we present a new algorithm
for reducing the runtime of coalescent HMMs from quadratic in the number of
hidden time states to linear, without making any additional approximations. Our
algorithm can be incorporated into various coalescent HMMs, including the
popular method PSMC for inferring variable effective population sizes. Here we
implement this algorithm to speed up our demographic inference method diCal,
which is equivalent to PSMC when applied to a sample of two haplotypes. We
demonstrate that the linear-time method can reconstruct a population size
change history more accurately than the quadratic-time method, given similar
computation resources. We also apply the method to data from the 1000 Genomes
project, inferring a high-resolution history of size changes in the European
population.Comment: 18 pages, 5 figures. To appear in the Proceedings of the 18th Annual
International Conference on Research in Computational Molecular Biology
(RECOMB 2014). The final publication is available at link.springer.co
Inference of Population History using Coalescent HMMs: Review and Outlook
Studying how diverse human populations are related is of historical and
anthropological interest, in addition to providing a realistic null model for
testing for signatures of natural selection or disease associations.
Furthermore, understanding the demographic histories of other species is
playing an increasingly important role in conservation genetics. A number of
statistical methods have been developed to infer population demographic
histories using whole-genome sequence data, with recent advances focusing on
allowing for more flexible modeling choices, scaling to larger data sets, and
increasing statistical power. Here we review coalescent hidden Markov models, a
powerful class of population genetic inference methods that can effectively
utilize linkage disequilibrium information. We highlight recent advances, give
advice for practitioners, point out potential pitfalls, and present possible
future research directions.Comment: 12 pages, 2 figure