1 research output found

    Smooth q-Gram, and Its Applications to Detection of Overlaps among Long, Error-Prone Sequencing Reads

    We propose smooth q-gram, the first variant of q-gram that captures q-gram pairs within a small edit distance. We apply smooth q-gram to the problem of detecting overlapping pairs of error-prone reads produced by single molecule real time sequencing (SMRT), which is the first and most critical step of the de novo fragment assembly of SMRT reads. We have implemented and tested our algorithm on a set of real-world benchmarks. Our empirical results demonstrate that our algorithm is significantly more accurate than existing q-gram based algorithms.
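
To make the motivation concrete, the following is a minimal sketch of why exact q-gram matching is fragile on error-prone reads, and of what matching q-gram pairs "within a small edit distance" means. It is not the smooth q-gram construction from the paper: the brute-force comparison, the function names (`qgrams`, `shared_exact`, `shared_within`), and the toy reads are all assumptions made for illustration only.

```python
from collections import Counter


def qgrams(s, q):
    """Return all length-q substrings of s, in order of occurrence."""
    return [s[i:i + q] for i in range(len(s) - q + 1)]


def shared_exact(a, b, q):
    """Count q-grams that a and b share exactly (multiset intersection)."""
    ca, cb = Counter(qgrams(a, q)), Counter(qgrams(b, q))
    return sum((ca & cb).values())


def edit_distance(x, y):
    """Classic dynamic-programming Levenshtein distance between x and y."""
    prev = list(range(len(y) + 1))
    for i, cx in enumerate(x, 1):
        cur = [i]
        for j, cy in enumerate(y, 1):
            cur.append(min(prev[j] + 1,                # delete cx
                           cur[j - 1] + 1,             # insert cy
                           prev[j - 1] + (cx != cy)))  # substitute or match
        prev = cur
    return prev[-1]


def shared_within(a, b, q, k):
    """Count q-grams of a that lie within edit distance k of some q-gram of b.

    Brute force (hypothetical baseline): compares every q-gram of a against
    every distinct q-gram of b, so it is only suitable for tiny inputs.
    """
    grams_b = set(qgrams(b, q))
    return sum(any(edit_distance(g, h) <= k for h in grams_b)
               for g in qgrams(a, q))


a = "ACGTTGCAAG"
b = "ACGTAGCAAG"  # same read with a single substitution (T -> A)
print(shared_exact(a, b, 4))      # 3
print(shared_within(a, b, 4, 1))  # 7
```

In this toy case a single substitution removes four of the seven exact 4-gram matches, while tolerating one edit recovers all seven. The quadratic all-pairs comparison used here does not scale to real read sets, which is the kind of cost an approach such as the one proposed would need to avoid.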