11 research outputs found

    String Matching and 1d Lattice Gases

    We calculate the probability distributions for the number of occurrences $n$ of a given $l$-letter word in a random string of $k$ letters. Analytical expressions for the distribution are known for the asymptotic regimes (i) $k \gg r^l \gg 1$ (Gaussian) and (ii) $k, l \to \infty$ such that $k/r^l$ is finite (Compound Poisson). However, it is known that these distributions do not work well in the intermediate regime $k \gtrsim r^l \gtrsim 1$. We show that the problem of calculating the string matching probability can be cast as determining the configurational partition function of a 1d lattice gas with interacting particles, so that the matching probability becomes the grand-partition sum of the lattice gas, with the number of particles corresponding to the number of matches. We perform a virial expansion of the effective equation of state and obtain the probability distribution. Our result reproduces the behavior of the distribution in all regimes. We are also able to show analytically how the limiting distributions arise. Our analysis builds on the fact that the effective interactions between the particles consist of a relatively strong core of size $l$, the word length, followed by a weak, exponentially decaying tail. We find that the asymptotic regimes correspond to the case where the tail of the interactions can be neglected, while in the intermediate regime it needs to be kept in the analysis. Our results are readily generalized to the case where the random strings are generated by more complicated stochastic processes, such as a non-uniform letter probability distribution or Markov chains. We show that in these cases the tails of the effective interactions can be made even more dominant, thus rendering the asymptotic approximations less accurate in such a regime.
    Comment: 44 pages and 8 figures. Major revision of previous version. The lattice gas analogy has been worked out in full, including virial expansion and equation of state. This constitutes the main part of the paper now. Connections with existing work are made and references should be up to date now. To be submitted for publication
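The quantity studied in this abstract, the number of (possibly overlapping) occurrences of a fixed word in a random string, can be explored numerically. The following is a minimal Monte Carlo sketch and not the paper's lattice-gas method. It assumes uniform, independent letters and takes $r$ to be the alphabet size; the function names `count_overlapping` and `empirical_distribution` are hypothetical, introduced here for illustration only.

```python
import random
from collections import Counter


def count_overlapping(text: str, word: str) -> int:
    """Count occurrences of `word` in `text`, allowing overlaps."""
    l = len(word)
    return sum(1 for i in range(len(text) - l + 1) if text[i:i + l] == word)


def empirical_distribution(word, k, alphabet, trials, seed=0):
    """Estimate P(n) for the match count n over random strings of length k.

    Assumption: letters are drawn independently and uniformly from `alphabet`.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        text = "".join(rng.choice(alphabet) for _ in range(k))
        counts[count_overlapping(text, word)] += 1
    return {n: c / trials for n, c in sorted(counts.items())}


if __name__ == "__main__":
    # Intermediate regime: k is comparable to r**l (here k = 20, r**l = 4).
    dist = empirical_distribution("ab", k=20, alphabet="ab", trials=2000)
    for n, p in dist.items():
        print(n, round(p, 4))
```

Under these assumptions the mean match count is $(k - l + 1)/r^l$ by linearity of expectation, which gives a quick sanity check on the simulated distribution.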

    ON TOUCHARD POLYNOMIALS

    J. Touchard, in his work on the cycles of permutations, generalized the Bell polynomials in order to study some problems of enumeration of permutations whose cycles possess certain properties. In the present paper (considering Touchard's generalization) we introduce and study a class of related polynomials. An exponential generating function, recurrence relations and connections with other well-known polynomials are obtained. In special cases, relations with Stirling numbers of the first and second kind, as well as with other recently studied numbers, are derived. Finally, a combinatorial interpretation is discussed.

    A limit theorem on the number of overlapping appearances of a pattern in a sequence of independent trials

    A sequence of independent experiments is performed, each one producing a letter from a given alphabet. We study the number of overlapping appearances of a given pattern of letters and we prove that, under quite general conditions, the number of overlapping appearances of long patterns is approximately distributed according to a Pólya-Aeppli distribution. © 1988 Springer-Verlag
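For reference, the Pólya-Aeppli distribution named above is the law of a Poisson number of clusters whose sizes are independent geometric variables on {1, 2, ...}. The sketch below evaluates its probability mass function under that standard definition. The parameter names `lam` (Poisson rate of clusters) and `theta` (geometric continuation probability) are assumptions of this example; the abstract does not fix a parametrization.

```python
import math


def polya_aeppli_pmf(n: int, lam: float, theta: float) -> float:
    """P(N = n) for a Poisson(lam) number of clusters with geometric sizes.

    Each cluster has size s >= 1 with probability (1 - theta) * theta**(s - 1).
    """
    if n < 0:
        return 0.0
    if n == 0:
        return math.exp(-lam)
    total = 0.0
    for j in range(1, n + 1):
        # j clusters occur with Poisson weight lam**j / j!; their sizes sum
        # to n with negative-binomial weight C(n-1, j-1) (1-theta)^j theta^(n-j).
        total += (
            lam ** j / math.factorial(j)
            * math.comb(n - 1, j - 1)
            * (1 - theta) ** j
            * theta ** (n - j)
        )
    return math.exp(-lam) * total


if __name__ == "__main__":
    for n in range(6):
        print(n, polya_aeppli_pmf(n, lam=2.0, theta=0.3))
```

With `theta = 0` every cluster has size one and the function reduces to the ordinary Poisson probability; in general the mean is `lam / (1 - theta)`.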

    Relayed Communication via Parallel Redundancy

    A special communication system with moving relay stations was introduced by Chiang & Chiang. The relay stations all move at the same speed from an origin toward a destination. If k consecutive stations fail before the first station reaches the destination, all stations ahead of the k failed stations, as well as the k failed stations themselves, are lost. The objective is to have one station reach the destination without k or more consecutive failed stations between the destination and the origin. In this paper a different communication system is described, in which relay stations move in groups of k stations. If all k stations of a group fail, then all groups of k stations ahead of the k failed stations, as well as the k failed stations themselves, are lost. The objective is to have one station reach the destination with every group having at least one station in good condition. We show that this second communication system requires fewer stations on average. We also examine the probability distribution of the number of stations required in the first communication system. ©1989 IEE