
Exact Asymptotic Results for a Model of Sequence Alignment

Abstract

Determining analytically the statistics of the longest common subsequence (LCS) of a pair of random sequences drawn from an alphabet of c letters is a challenging problem in computational evolutionary biology. We present exact asymptotic results for the distribution of the LCS length in a simpler, yet nontrivial, variant of the original model called the Bernoulli matching (BM) model, which reduces to the original model in the large-c limit. We show that in the BM model, for all c, the distribution of the asymptotic length of the LCS, suitably scaled, is identical to the Tracy-Widom distribution of the largest eigenvalue of a random matrix drawn from the Gaussian unitary ensemble. In particular, in the large-c limit, this provides an exact expression for the asymptotic length distribution in the original LCS problem.

Comment: 4 pages Revtex, 2 .eps figures included
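The two models compared in the abstract can be illustrated with a minimal sketch. It assumes the usual formulation in which the LCS length obeys a dynamic-programming recursion over a 0/1 match matrix, and in which the BM model replaces the true letter-by-letter match indicators with independent Bernoulli variables of success probability 1/c. The abstract does not spell out either definition, so both are assumptions here, and the function names are hypothetical and chosen only for illustration.

```python
import random


def lcs_length(eta):
    """LCS-type length from a 0/1 match matrix eta (list of rows).

    Uses the recursion
        L[i][j] = max(L[i-1][j], L[i][j-1], L[i-1][j-1] + eta[i-1][j-1])
    with zero boundary conditions.
    """
    n = len(eta)
    m = len(eta[0]) if n else 0
    L = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            L[i][j] = max(
                L[i - 1][j],
                L[i][j - 1],
                L[i - 1][j - 1] + eta[i - 1][j - 1],
            )
    return L[n][m]


def match_matrix_from_sequences(a, b):
    """Original model: eta[i][j] = 1 exactly when a[i] == b[j]."""
    return [[1 if x == y else 0 for y in b] for x in a]


def bernoulli_match_matrix(n, m, c, rng):
    """BM model (assumed form): independent entries, each 1 with probability 1/c."""
    p = 1.0 / c
    return [[1 if rng.random() < p else 0 for _ in range(m)] for _ in range(n)]


def random_sequence(n, c, rng):
    """Uniform random sequence of length n over the alphabet {0, ..., c-1}."""
    return [rng.randrange(c) for _ in range(n)]
```

Calling `lcs_length(match_matrix_from_sequences(a, b))` on two outputs of `random_sequence` samples the original model, while `lcs_length(bernoulli_match_matrix(n, n, c, rng))` samples the BM variant. Repeating either call over many seeds gives an empirical length distribution that could be compared, after centering and scaling, with the Tracy-Widom form mentioned in the abstract.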

    Last time updated on 02/01/2020