Adaptive seeds tame genomic sequence comparison

Kielbasa, S.; Wan, R.; Sato, K.; Horton, P.; Frith, M.

research article

Publication date: 2011

Abstract

The main way of analyzing biological sequences is by comparing and aligning them to each other. It remains difficult, however, to compare modern multi-billion-base DNA data sets. The difficulty is caused by the nonuniform (oligo)nucleotide composition of these sequences, rather than their size per se. To solve this problem, we modified the standard seed-and-extend approach (e.g., BLAST) to use adaptive seeds. Adaptive seeds are matches that are chosen based on their rareness, instead of using fixed-length matches. This method guarantees that the number of matches, and thus the running time, increases linearly, instead of quadratically, with sequence length. LAST, our open-source implementation of adaptive seeds, enables fast and sensitive comparison of large sequences with arbitrarily nonuniform composition.
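
To make the idea of choosing matches by rareness concrete, the following is a minimal Python sketch. It is not LAST's implementation: it counts occurrences by naive string scanning, and the names `count_occurrences`, `adaptive_seeds`, and the parameter `max_occurrences` are hypothetical, introduced only for illustration. Starting at each query position, the match is lengthened until it occurs at most `max_occurrences` times in the reference, so frequent short words are extended into longer, rarer seeds.

```python
def count_occurrences(reference, pattern):
    """Count (possibly overlapping) occurrences of pattern in reference."""
    count = 0
    start = reference.find(pattern)
    while start != -1:
        count += 1
        start = reference.find(pattern, start + 1)
    return count


def adaptive_seeds(reference, query, max_occurrences=2):
    """Return (query_start, length, occurrences) for each adaptive seed.

    Assumption for this sketch: a seed is the shortest match starting at
    a query position that occurs at most max_occurrences times in the
    reference. Positions with no sufficiently rare match yield no seed.
    """
    seeds = []
    for start in range(len(query)):
        for end in range(start + 1, len(query) + 1):
            n = count_occurrences(reference, query[start:end])
            if n == 0:
                break  # the match cannot be extended any further
            if n <= max_occurrences:
                seeds.append((start, end - start, n))
                break  # rare enough: stop lengthening this seed
    return seeds


if __name__ == "__main__":
    reference = "ACGTACGTTT"
    query = "ACGTT"
    for seed in adaptive_seeds(reference, query, max_occurrences=1):
        print(seed)
```

In this toy example, the repeated word `ACGT` forces seeds to grow until they are unique in the reference, whereas a fixed-length seed of, say, one or two bases would match at several positions. Because each query position contributes at most one seed with a bounded number of reference occurrences, the total number of matches grows in proportion to the query length, which is the property the abstract describes.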

This paper was published in MPG.PuRe.