Smooth q-Gram, and Its Applications to Detection of Overlaps among Long, Error-Prone Sequencing Reads
We propose smooth q-gram, the first variant of q-gram that captures
q-gram pairs within a small edit distance. We apply smooth q-gram to the
problem of detecting overlapping pairs of error-prone reads produced by single
molecule real time sequencing (SMRT), which is the first and most critical step
of the de novo fragment assembly of SMRT reads. We have implemented and tested
our algorithm on a set of real-world benchmarks. Our empirical results
demonstrate the significant superiority of our algorithm over the existing
q-gram-based algorithms in accuracy.
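
For context, the exact q-gram matching that the abstract compares against can be sketched as follows. This is a minimal illustration of the generic baseline idea (index every length-q substring, then report read pairs that share enough of them); it is not the authors' smooth q-gram algorithm, whose details are not given here. The function names, the toy reads, and the parameters `q` and `min_shared` are assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def qgrams(read, q):
    """Return the set of all length-q substrings (q-grams) of a read."""
    return {read[i:i + q] for i in range(len(read) - q + 1)}


def candidate_overlaps(reads, q, min_shared):
    """Report read pairs that share at least `min_shared` distinct exact q-grams.

    Hypothetical baseline for illustration only: it counts exact matches,
    so it does not tolerate edits inside a q-gram.
    """
    # Inverted index: q-gram -> ids of the reads that contain it.
    index = defaultdict(set)
    for rid, read in enumerate(reads):
        for gram in qgrams(read, q):
            index[gram].add(rid)

    # Count, for every pair of reads, how many distinct q-grams they share.
    shared = defaultdict(int)
    for ids in index.values():
        for a, b in combinations(sorted(ids), 2):
            shared[(a, b)] += 1

    return {pair: n for pair, n in shared.items() if n >= min_shared}


# Toy reads (made up): the suffix of read 0 overlaps the prefix of read 1.
reads = ["ACGTACGTTG", "CGTTGCAAGT", "TTTTTTTTTT"]
print(candidate_overlaps(reads, q=4, min_shared=2))  # {(0, 1): 2}
```

Because only exact q-grams are counted, a single sequencing error inside a q-gram removes that match entirely. That sensitivity to errors is the limitation a variant matching q-gram pairs within a small edit distance is meant to address.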