Optimal Haplotype Assembly from High-Throughput Mate-Pair Reads
Humans have pairs of homologous chromosomes. The two chromosomes in each
pair are almost identical. For the most part, differences between
homologous chromosomes occur at certain documented positions called single
nucleotide polymorphisms (SNPs). A haplotype of an individual is the pair of
sequences of SNPs on the two homologous chromosomes. In this paper, we study
the problem of inferring haplotypes of individuals from mate-pair reads of
their genome. We give a simple formula for the coverage needed for haplotype
assembly, under a generative model. The analysis leverages connections between
this problem and the decoding of convolutional codes.
Comment: 10 pages, 4 figures, Submitted to ISIT 201