11 research outputs found

    On the maximal sum of exponents of runs in a string

    A run is an inclusion-maximal occurrence in a string (as a subinterval) of a repetition v with a period p such that 2p ≤ |v|. The exponent of a run is defined as |v|/p and is ≥ 2. We show new bounds on the maximal sum of exponents of runs in a string of length n. Our upper bound of 4.1n is better than the best previously known proven bound of 5.6n by Crochemore & Ilie (2008). The lower bound of 2.035n, obtained using a family of binary words, contradicts the conjecture of Kolpakov & Kucherov (1999) that the maximal sum of exponents of runs in a string of length n is smaller than 2n. Comment: 7 pages, 1 figure
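
    The definition of a run and its exponent can be made concrete with a small brute-force sketch. It is not the method of the paper: the function names (`smallest_period`, `runs`, `sum_of_exponents`) are hypothetical, positions are 0-based and inclusive, and the running time is polynomial rather than linear.

```python
from fractions import Fraction


def smallest_period(w):
    """Smallest p >= 1 such that w[k] == w[k + p] for every valid k."""
    n = len(w)
    for p in range(1, n + 1):
        if all(w[k] == w[k + p] for k in range(n - p)):
            return p
    return n  # only reached for the empty string


def runs(s):
    """Return all runs of s as (start, end, period) triples, end inclusive.

    Brute force: for each candidate period p, find maximal stretches of
    positions k with s[k] == s[k + p]; each stretch spans an interval that
    cannot be extended left or right while keeping period p.
    """
    n = len(s)
    found = []
    for p in range(1, n // 2 + 1):
        k = 0
        while k + p < n:
            if s[k] != s[k + p]:
                k += 1
                continue
            a = k
            while k + p < n and s[k] == s[k + p]:
                k += 1
            start, end = a, k - 1 + p
            length = end - start + 1
            # Keep the interval only if it is at least two periods long
            # and p really is its smallest period.
            if length >= 2 * p and smallest_period(s[start:end + 1]) == p:
                found.append((start, end, p))
    return found


def sum_of_exponents(s):
    """Sum of |v|/p over all runs, as an exact fraction."""
    return sum(Fraction(end - start + 1, p) for start, end, p in runs(s))
```

    For example, `runs("aabaabaa")` reports three runs of period 1 (each `aa`) and one run of period 3 covering the whole string, so the sum of exponents is 2 + 2 + 2 + 8/3.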

    On the maximal number of cubic subwords in a string

    We investigate the problem of the maximum number of cubic subwords (of the form www) in a given word. We also consider square subwords (of the form ww). The problem of the maximum number of squares in a word is not well understood. Several new results related to this problem are produced in the paper. We consider two simple problems related to the maximum number of subwords which are squares or which are highly repetitive; then we provide a nontrivial estimation for the number of cubes. We show that the maximum number of squares xx such that x is not a primitive word (nonprimitive squares) in a word of length n is exactly ⌊n/2⌋ − 1, and the maximum number of subwords of the form x^k, for k ≥ 3, is exactly n − 2. In particular, the maximum number of cubes in a word is not greater than n − 2 either. Using very technical properties of occurrences of cubes, we improve this bound significantly. We show that the maximum number of cubes in a word of length n is between (1/2)n and (4/5)n. (In particular, we improve the lower bound from the conference version of the paper.) Comment: 14 pages
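
    As a rough illustration of the quantity being bounded, the sketch below enumerates the distinct cubic subwords of a word by brute force. Reading "number of cubes" as the number of distinct subwords of the form www is an assumption here, and the function name `distinct_cubes` is hypothetical.

```python
def distinct_cubes(s):
    """Return the set of distinct subwords of s of the form www, w nonempty."""
    n = len(s)
    cubes = set()
    for i in range(n):
        # A cube starting at i with root length m needs 3 * m characters.
        for m in range(1, (n - i) // 3 + 1):
            w = s[i:i + m]
            if s[i + m:i + 2 * m] == w and s[i + 2 * m:i + 3 * m] == w:
                cubes.add(s[i:i + 3 * m])
    return cubes
```

    For instance, `distinct_cubes("abababab")` contains `ababab` and `bababa` only.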

    Generation of Rational Numbers By . . .

    The problem of whether sets of rational numbers of the form 0 < m/(p_1^{n_1} ⋯ p_k^{n_k}) < 1, where p_1, …, p_k are prime numbers, n_i ≥ 0 for all i = 1, …, k, and k ≥ 2, are finitely generated by probabilistic contact π-networks is considered. In particular, concrete finite subsets generating these sets are described and upper bounds for the complexity of generation of numbers from these sets by the subsets are obtained. In this paper we study the problem of generation of rational numbers by probabilistic contact π-networks. A probabilistic contact π-network is a π-network [1] such that each edge α of the network (called a contact) is associated with a Boolean random variable ρ(α) called the conductivity of the contact α. We assume that the conductivities of the contacts of a probabilistic contact π-network are independent [2]. A contact α is called closed if ρ(α) = 1. The value P{ρ(α) = 1} is called the probability of conductivity of α. ..

    Efficient algorithms for two extensions of LPF table: the power of suffix arrays

    Suffix arrays provide a powerful data structure to solve several questions related to the structure of all the factors of a string. We show how they can be used to compute efficiently two new tables storing different types of previous factors (past segments) of a string. The concept of a longest previous factor is inherent to Ziv-Lempel factorization of strings in text compression, as well as in statistics of repetitions and symmetries. The longest previous reverse factor for a given position i is the longest factor starting at i, such that its reverse copy occurs before, while the longest previous non-overlapping factor is the longest factor v starting at i which has an exact copy occurring before. The previous copies of the factors are required to occur in the prefix ending at position i - 1. We design algorithms computing the table of longest previous reverse factors (LPrF table) and the table of longest previous non-overlapping factors (LPnF table). The latter table is useful to compute repetitions while the former is a useful tool for extracting symmetries. These tables are computed, using two previously computed read-only arrays (SUF and LCP) composing the suffix array, in linear time on any integer alphabet. The tables have not been explicitly considered before, but they have several applications and they are natural extensions of the LPF table which has been studied thoroughly before. Our results improve on the previous ones in several ways. The running time of the computation no longer depends on the size of the alphabet, which drops a log factor. Moreover the newly introduced tables store additional information on the structure of the string, helpful to improve, for example, gapped palindrome detection and text compression using reverse factors.
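
    The two tables can be pinned down with a naive reference sketch that follows the definitions literally. This is not the linear-time suffix-array algorithm of the paper: it assumes 0-based positions, uses hypothetical function names (`lpnf_table`, `lprf_table`), and relies on plain substring search, so it is only suitable for checking small examples.

```python
def lpnf_table(s):
    """LPnF[i]: length of the longest factor starting at i that also
    occurs entirely inside the prefix s[:i] (so the copies do not overlap)."""
    n = len(s)
    table = [0] * n
    for i in range(n):
        prefix = s[:i]
        length = 0
        # If a factor of length L + 1 occurs in the prefix, so does its
        # prefix of length L, so growing one character at a time is safe.
        while i + length < n and s[i:i + length + 1] in prefix:
            length += 1
        table[i] = length
    return table


def lprf_table(s):
    """LPrF[i]: length of the longest factor starting at i whose reverse
    occurs entirely inside the prefix s[:i]."""
    n = len(s)
    table = [0] * n
    for i in range(n):
        prefix = s[:i]
        length = 0
        while i + length < n and s[i:i + length + 1][::-1] in prefix:
            length += 1
        table[i] = length
    return table
```

    On the string `abba`, the reverse table picks up the palindromic structure (`ba` at position 2 is the reverse of the prefix `ab`), while the non-overlapping table only finds single-letter copies.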

    Extracting Powers and Periods in a String from its Runs Structure

    A breakthrough in the field of text algorithms was the discovery of the fact that the maximal number of runs in a string of length n is O(n) and that they can all be computed in O(n) time. We study some applications of this result. New simpler O(n) time algorithms are presented for a few classical string problems: computing all distinct kth string powers for a given k, in particular squares for k = 2, and finding all local periods in a given string of length n. Additionally, we present an efficient algorithm for testing primitivity of factors of a string and computing their primitive roots. Applications of runs, despite their importance, are underrepresented in existing literature (approximately one page in the paper of Kolpakov & Kucherov, 1999). In this paper we attempt to fill in this gap. We use Lyndon words and introduce the Lyndon structure of runs as a useful tool when computing powers. In problems related to periods we use some versions of the Manhattan skyline problem.
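
    For readers unfamiliar with primitivity, the sketch below tests a whole string using the well-known doubling trick: w is primitive exactly when its first occurrence in ww after position 0 is at position |w|. This is a swapped-in textbook technique, not the runs-based algorithm for arbitrary factors described in the abstract, and the function names are hypothetical.

```python
def primitive_root(w):
    """Return the shortest u such that w == u * k for some integer k >= 1."""
    if not w:
        raise ValueError("primitive root is undefined for the empty string")
    # The first occurrence of w in w + w after position 0 starts at the
    # length of the primitive root (and at len(w) if w is primitive).
    p = (w + w).find(w, 1)
    return w[:p]


def is_primitive(w):
    """True if w is not a proper power of a shorter string."""
    return primitive_root(w) == w
```

    For example, `primitive_root("abab")` is `ab`, whereas `aba` is its own primitive root.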