Computational inference of haplotypes from genotypes has attracted a great deal of attention in the computational biology community recently, partially driven by the international HapMap project. In this paper, we study the question of how to efficiently infer haplotypes from genotypes of individuals related by a pedigree, assuming that the hereditary process was free of mutations (i.e. the Mendelian law of inheritance) and recombinants. The problem has recently been formulated as a system of linear equations over the finite field of F(2) and solved in O(m 3 n 3) time by using standard Gaussian elimination, where m is the number of loci (or markers) in a genotype and n the number of individuals in the pedigree. We give a much faster algorithm with running time O(mn 2 + n 3 log 2 nlog log n). The key ingredients of our construction are (i) a new system of linear equations based on some spanning tree of the pedigree graph and (ii) an efficient method for eliminating redundant equations in a system of O(mn) linear equations over O(n) variables. Although such a fast elimination method is not known for general systems of linear equations, we take advantage of the underlying pedigree graph structure and recent progress on low-stretch spanning trees.
To submit an update or takedown request for this paper, please submit an Update/Correction/Removal Request.