Efficient whole genome haplotyping and single molecule phasing with barcode-linked reads

Abstract

The future of human genomics is one that seeks to resolve the entirety of genetic variation through sequencing. The prospect of utilizing genomics for medical purposes require cost-efficient and accurate base calling, long-range haplotyping capability, and reliable calling of structural variants. Short-read sequencing has lead the development towards such a future but has struggled to meet the latter two of these needs. To address this limitation, we developed a technology that preserves the molecular origin of short sequencing reads, with an insignificant increase to sequencing costs. We demonstrate a library preparation method which enables whole genome haplotyping, long-range phasing of single DNA molecules, and de novo genome assembly through barcode-linked reads (BLR). Millions of random barcodes are used to reconstruct megabase-scale phase blocks and call structural variants. We also highlight the versatility of our technology by generating libraries from different organisms using picograms to nanograms of input material.QC 20180919</p

    Similar works