96 research outputs found
The covers of a circular Fibonacci string
Fibonacci strings turn out to constitute worst cases for a number of computer algorithms which find generic patterns in strings. Examples of such patterns are repetitions, Abelian squares, and "covers". In particular, we characterize in this paper the covers of a circular Fibonacci string $C(F_k)$ and show that they are $\Theta(|F_k|^2)$ in number. We show also that, by making use of an appropriate encoding, these covers can be reported in $\Theta(|F_k|)$ time. By contrast, the fastest known algorithm for computing the covers of an arbitrary circular string of length $n$ requires time $O(n \log n)$.
Faster algorithms for computing maximal multirepeats in multiple sequences
A repeat in a string is a substring that occurs more than once. A repeat is extendible if every occurrence of the repeat has an identical letter either on the left or on the right; otherwise, it is maximal. A multirepeat is a repeat that occurs at least $m_{\min}$ times ($m_{\min} \ge 2$) in each of at least $q \ge 1$ strings in a given set of strings. In this paper, we describe a family of efficient algorithms based on suffix arrays to compute maximal multirepeats under various constraints. Our algorithms are faster, more flexible and much more space-efficient than algorithms recently proposed for this problem. The results extend recent work by two of the authors computing all maximal repeats in a single string.
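The definitions of repeat, extendible and maximal above can be illustrated with a minimal brute-force sketch for a single string. It is not the suffix-array method the abstract describes, and it does not handle the multi-string multirepeat case; the function name `maximal_repeats` and the treatment of string boundaries (an occurrence at the boundary is assumed to have no letter on that side) are assumptions made for illustration.

```python
from collections import defaultdict


def maximal_repeats(s):
    """Return {substring: start positions} for every maximal repeat of s.

    Brute force over all substrings; intended only for short inputs.
    """
    n = len(s)
    occurrences = defaultdict(list)
    for i in range(n):
        for j in range(i + 1, n + 1):
            occurrences[s[i:j]].append(i)

    result = {}
    for sub, starts in occurrences.items():
        if len(starts) < 2:
            continue  # not a repeat
        # Letters adjacent to each occurrence; None marks a string boundary.
        left = {s[i - 1] if i > 0 else None for i in starts}
        right = {s[i + len(sub)] if i + len(sub) < n else None for i in starts}
        # Extendible: every occurrence has the same letter on one side.
        left_extendible = len(left) == 1 and None not in left
        right_extendible = len(right) == 1 and None not in right
        if not (left_extendible or right_extendible):
            result[sub] = starts
    return result


print(maximal_repeats("abcabc"))  # {'abc': [0, 3]}
```

In `abcabc`, the repeat `ab` is always followed by `c`, so it is extendible, while `abc` is preceded and followed by different contexts and is therefore maximal.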
A characterization of the squares in a Fibonacci string
A (finite) Fibonacci string $F_n$ is defined as follows: $F_0 = b$, $F_1 = a$; for every integer $n \ge 2$, $F_n = F_{n-1}F_{n-2}$. For $n \ge 1$, the length of $F_n$ is denoted by $f_n = |F_n|$. The infinite Fibonacci string $F$ is the string which contains every $F_n$, $n \ge 1$, as a prefix. Apart from their general theoretical importance, Fibonacci strings are often cited as worst-case examples for algorithms which compute all the repetitions or all the "Abelian squares" in a given string. In this paper we provide a characterization of all the squares in $F$, hence in every prefix $F_n$; this characterization naturally gives rise to a $\Theta(f_n)$ algorithm which specifies all the squares of $F_n$ in an appropriate encoding. This encoding is made possible by the fact that the squares of $F_n$ occur consecutively, in "runs", the number of which is $\Theta(f_n)$. By contrast, the known general algorithms for the computation of the repetitions in an arbitrary string require $\Theta(f_n \log f_n)$ time (and produce $\Theta(f_n \log f_n)$ outputs) when applied to a Fibonacci string $F_n$.
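The recursive definition above can be made concrete with a short sketch that builds $F_n$ and lists its squares by brute force. This is only an illustration of the definitions, not the paper's characterization or its encoded algorithm; the function names `fibonacci_string` and `squares` are hypothetical.

```python
def fibonacci_string(n):
    """Return F_n, where F_0 = 'b', F_1 = 'a' and F_n = F_{n-1} + F_{n-2}."""
    if n == 0:
        return "b"
    prev, curr = "b", "a"  # F_0, F_1
    for _ in range(n - 1):
        prev, curr = curr, curr + prev
    return curr


def squares(s):
    """Return all (start, period) pairs such that s[start:start + 2*period]
    has the form uu. Brute force; intended only for short inputs."""
    found = []
    for p in range(1, len(s) // 2 + 1):
        for i in range(len(s) - 2 * p + 1):
            if s[i:i + p] == s[i + p:i + 2 * p]:
                found.append((i, p))
    return found


f5 = fibonacci_string(5)
print(f5)           # abaababa
print(squares(f5))  # [(2, 1), (3, 2), (4, 2), (0, 3)]
```

In $F_5$ the two squares of period 2 start at adjacent positions 3 and 4, a small instance of squares occurring consecutively.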
A linear algorithm for computing all the squares of a Fibonacci string
A (finite) Fibonacci string $F_n$ is defined as follows: $F_0 = b$, $F_1 = a$; for every integer $n \ge 2$, $F_n = F_{n-1}F_{n-2}$. For $n \ge 1$, the length of $F_n$ is denoted by $f_n = |F_n|$, while it is convenient to define […]. The infinite Fibonacci string $F$ is the string which contains every $F_n$, $n \ge 1$, as a prefix. Apart from their general theoretical importance, Fibonacci strings are often cited as worst-case examples for algorithms which compute all the repetitions or all the "Abelian squares" in a given string. In this paper we provide a characterization of all the squares in $F$, hence in every prefix $F_n$; this characterization naturally gives rise to a $\Theta(f_n)$ algorithm which specifies all the squares of $F_n$ in an appropriate encoding. This encoding is made possible by the fact that the squares of $F_n$ occur consecutively, in "runs", the number of which is $\Theta(f_n)$. By contrast, the known general algorithms for the computation of the repetitions in an arbitrary string require $\Theta(f_n \log f_n)$ time (and produce $\Theta(f_n \log f_n)$ outputs) when applied to a Fibonacci string $F_n$.
Song classifications for dancing
A fundamental problem in music is to classify songs according to their rhythm. A rhythm is represented by a sequence of Quick (Q) and Slow (S) symbols, which correspond to the (relative) duration of notes, such that S=QQ. In this paper we present a linear algorithm for locating the maximum-length substring of a music text t that can be covered by a given rhythm r. An efficient algorithm for this problem can then be used to find which rhythm, from a given set of such rhythms, covers the largest part of the music sequence in question, and thus best describes that sequence.
Computing the minimum k-Cover of a string
We study the minimum k-cover problem. For a given string x of length n and an integer k, the minimum k-cover is the minimum set of k-substrings that covers x. We show that the on-line algorithm proposed by Iliopoulos and Smyth [IS92] is not correct. We prove that the problem is in fact NP-hard. Furthermore, we propose two greedy algorithms that are implemented and tested on different kinds of data.
Conserved IKAROS-regulated genes associated with B-progenitor acute lymphoblastic leukemia outcome
Genetic alterations disrupting the transcription factor IKZF1 (encoding IKAROS) are associated with poor outcome in B lineage acute lymphoblastic leukemia (B-ALL) and occur in >70% of the high-risk BCR-ABL1+ (Ph+) and Ph-like disease subtypes. To examine IKAROS function in this context, we have developed novel mouse models allowing reversible RNAi-based control of Ikaros expression in established B-ALL in vivo. Notably, leukemias driven by combined BCR-ABL1 expression and Ikaros suppression rapidly regress when endogenous Ikaros is restored, causing sustained disease remission or ablation. Comparison of transcriptional profiles accompanying dynamic Ikaros perturbation in murine B-ALL in vivo with two independent human B-ALL cohorts identified nine evolutionarily conserved IKAROS-repressed genes. Notably, high expression of six of these genes is associated with inferior event-free survival in both patient cohorts. Among them are EMP1, which was recently implicated in B-ALL proliferation and prednisolone resistance, and the novel target CTNND1, encoding P120-catenin. We demonstrate that elevated Ctnnd1 expression contributes to maintenance of murine B-ALL cells with compromised Ikaros function. These results suggest that IKZF1 alterations in B-ALL lead to induction of multiple genes associated with proliferation and treatment resistance, identifying potential new therapeutic targets for high-risk disease.
A fast average case algorithm for Lyndon decomposition
A simple algorithm, called LD, is described for computing the Lyndon decomposition of a word of length $n$. Although LD requires time $O(n \log n)$ in the worst case, it is shown to require only $\Theta(n)$ worst-case time for words which are "1-decomposable", and $\Theta(n)$ average-case time for words whose length is small with respect to alphabet size. The main interest in LD resides in its application to the problem of computing the canonical form of a circular word. For this problem, LD is shown to execute significantly faster than other known algorithms on important classes of words. Further, experiment suggests that, when applied to arbitrary words, LD on average outperforms the other known canonization algorithms in terms of two measures: number of tests on letters and execution time.
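For readers unfamiliar with the two problems named above, the following sketch shows what a Lyndon decomposition and a canonical form of a circular word look like. It uses Duval's standard linear-time algorithm, not the LD algorithm of the abstract, whose details are not given here; the function names `lyndon_factorization` and `least_rotation` are hypothetical, and the canonical form is assumed to mean the lexicographically least rotation.

```python
def lyndon_factorization(w):
    """Duval's algorithm: split w into a non-increasing sequence of Lyndon words."""
    factors = []
    n = len(w)
    i = 0
    while i < n:
        j, k = i + 1, i
        while j < n and w[k] <= w[j]:
            # A strictly larger letter restarts the comparison at i;
            # an equal letter extends the current periodic prefix.
            k = i if w[k] < w[j] else k + 1
            j += 1
        # Emit the copies of the Lyndon word of length j - k found so far.
        while i <= k:
            factors.append(w[i:i + j - k])
            i += j - k
    return factors


def least_rotation(w):
    """Lexicographically least rotation of w, found by running Duval on w + w."""
    n = len(w)
    s = w + w
    i = start = 0
    while i < n:
        start = i  # start of the last factor beginning inside the first copy
        j, k = i + 1, i
        while j < 2 * n and s[k] <= s[j]:
            k = i if s[k] < s[j] else k + 1
            j += 1
        while i <= k:
            i += j - k
    return s[start:start + n]


print(lyndon_factorization("banana"))  # ['b', 'an', 'an', 'a']
print(least_rotation("banana"))        # abanan
```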