Novel Results on the Number of Runs of the Burrows-Wheeler-Transform
The Burrows-Wheeler-Transform (BWT), a reversible string transformation, is
one of the fundamental components of many current data structures in string
processing. It is central in data compression, as well as in efficient query
algorithms for sequence data, such as webpages, genomic and other biological
sequences, or indeed any textual data. The BWT lends itself well to compression
because its number of equal-letter-runs (usually referred to as $r$) is often
considerably lower than that of the original string; in particular, it is well
suited for strings with many repeated factors. In fact, much attention has been
paid to the $r$ parameter as a measure of repetitiveness, especially to evaluate
the performance in terms of both space and time of compressed indexing data
structures.
In this paper, we investigate $\rho(v)$, the ratio of $r$ and of the number
of runs of the BWT of the reverse of $v$. Kempa and Kociumaka [FOCS 2020] gave
the first non-trivial upper bound as $\rho(v) = O(\log^2(n))$, for any string $v$
of length $n$. However, nothing is known about the tightness of this upper
bound. We present infinite families of binary strings for which
$\rho(v) = \Theta(\log n)$ holds, thus giving the first non-trivial lower bound on
$\rho(n)$, the maximum over all strings of length $n$.
Our results suggest that $r$ is not an ideal measure of the repetitiveness of
the string, since the number of repeated factors is invariant between the
string and its reverse. We believe that there is a more intricate relationship
between the number of runs of the BWT and the string's combinatorial
properties. Comment: 14 pages, 2 figures
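As a rough illustration of the ratio discussed in this abstract, the Python sketch below counts the equal-letter runs of the BWT of a string and of its reverse. It assumes the rotation-based BWT without an end-of-string marker and builds it by naively sorting all rotations, which is only practical for short inputs. The function names (`bwt`, `count_runs`, `run_ratio`) are hypothetical and are not taken from the paper.

```python
def bwt(s: str) -> str:
    """Rotation-based BWT (assumption: no end-of-string marker).

    Sorts all cyclic rotations of s and returns the last column.
    """
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def count_runs(s: str) -> int:
    """Number of maximal equal-letter runs in s."""
    return sum(1 for i, c in enumerate(s) if i == 0 or c != s[i - 1])


def run_ratio(s: str) -> float:
    """Runs of BWT(s) divided by runs of BWT(reverse of s).

    Assumes s is non-empty, so the denominator is at least 1.
    """
    return count_runs(bwt(s)) / count_runs(bwt(s[::-1]))


if __name__ == "__main__":
    for word in ["banana", "abracadabra", "aabaabab"]:
        print(word, count_runs(bwt(word)), count_runs(bwt(word[::-1])))
```

When the reverse of a string is one of its own rotations, as with "banana", both BWTs coincide under this convention and the ratio is 1.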
Cyclic Complexity of Words
We introduce and study a complexity function on words $c_x(n)$, called
\emph{cyclic complexity}, which counts the number of conjugacy classes of
factors of length $n$ of an infinite word $x$. We extend the well-known
Morse-Hedlund theorem to the setting of cyclic complexity by showing that a
word is ultimately periodic if and only if it has bounded cyclic complexity.
Unlike most complexity functions, cyclic complexity distinguishes between
Sturmian words of different slopes. We prove that if $x$ is a Sturmian word and
$y$ is a word having the same cyclic complexity as $x$, then up to renaming
letters, $x$ and $y$ have the same set of factors. In particular, $y$ is also
Sturmian of slope equal to that of $x$. Since $c_x(n) = 1$ for some $n \geq 1$
implies $x$ is periodic, it is natural to consider the quantity
$\liminf_{n\to\infty} c_x(n)$. We show that if $x$ is a Sturmian word,
then $\liminf_{n\to\infty} c_x(n) = 2$. We prove however that this is
not a characterization of Sturmian words by exhibiting a restricted class of
Toeplitz words, including the period-doubling word, which also verify this same
condition on the limit infimum. In contrast we show that, for the Thue-Morse
word $t$, $\liminf_{n\to\infty} c_t(n) = +\infty$. Comment: To appear in Journal of Combinatorial Theory, Series
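The definition of cyclic complexity can be tried out on a finite prefix of an infinite word. The Python sketch below is only an approximation under stated assumptions: it collects the factors of length n that occur in a given prefix, so the count matches the infinite word only when the prefix is long enough to contain every factor of that length. The helper names are hypothetical, and the Fibonacci word is used purely as a familiar Sturmian example.

```python
def canonical_rotation(w: str) -> str:
    """Lexicographically smallest rotation, used as the conjugacy-class label."""
    return min(w[i:] + w[:i] for i in range(len(w)))


def cyclic_complexity(prefix: str, n: int) -> int:
    """Number of conjugacy classes among the length-n factors of prefix."""
    factors = {prefix[i:i + n] for i in range(len(prefix) - n + 1)}
    return len({canonical_rotation(f) for f in factors})


def fibonacci_word(length: int) -> str:
    """Prefix of the Fibonacci word 0100101001001..., built as s_k = s_{k-1} s_{k-2}."""
    a, b = "0", "01"
    while len(b) < length:
        a, b = b, b + a
    return b[:length]


if __name__ == "__main__":
    x = fibonacci_word(200)
    for n in range(1, 11):
        print(n, cyclic_complexity(x, n))
```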
Around the Fibonacci Numeration System
Let 1, 2, 3, 5, 8, … denote the Fibonacci sequence beginning with 1 and 2, in which each subsequent number is the sum of the two previous ones. Every positive integer n can be expressed as a sum of distinct Fibonacci numbers in one or more ways. Setting R(n) to be the number of ways n can be written as a sum of distinct Fibonacci numbers, we exhibit certain regularity properties of R(n), one of which is connected to the Euler φ-function. In addition, using a theorem of Fine and Wilf, we give a formula for R(n) in terms of binomial coefficients modulo two.
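For small n, the quantity R(n) defined above can be computed directly. The Python sketch below counts subsets of the Fibonacci numbers 1, 2, 3, 5, 8, … that sum to n, using a standard subset-sum dynamic programme. It is a brute-force illustration of the definition, not the binomial-coefficient formula from the paper, and the function names are hypothetical.

```python
def fibonacci_terms(limit: int) -> list[int]:
    """Fibonacci numbers 1, 2, 3, 5, 8, ... that do not exceed limit."""
    terms = []
    a, b = 1, 2
    while a <= limit:
        terms.append(a)
        a, b = b, a + b
    return terms


def count_representations(n: int) -> int:
    """R(n): number of ways to write n as a sum of distinct Fibonacci numbers."""
    ways = [0] * (n + 1)
    ways[0] = 1  # the empty sum
    for f in fibonacci_terms(n):
        # Iterate totals downwards so each Fibonacci number is used at most once.
        for total in range(n, f - 1, -1):
            ways[total] += ways[total - f]
    return ways[n]


if __name__ == "__main__":
    print([count_representations(n) for n in range(1, 13)])
```

For example, 8 has the three representations 8, 5 + 3 and 5 + 2 + 1, so R(8) = 3.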
An Introductory Course on Constraint Logic Programming
The purpose of this document is to serve as the printed material for the seminar "An Introductory Course on Constraint Logic Programming". The intended audience of this seminar is industrial programmers with a degree in Computer Science but little previous experience with constraint programming. The seminar itself has been field tested, prior to the writing of this document, with a group of the application programmers of Esprit project P23182, "VOCAL", aimed at developing an application in scheduling of field maintenance tasks in the context of an electric utility company. The contents of this paper essentially follow the flow of the seminar slides. However, there are some differences. These differences stem from our perception, from the experience of teaching the seminar, that the technical aspects are the ones which need more attention and clearer explanations in the written version. Thus, this document includes more examples than those in the slides, more exercises (and the solutions to them), as well as four additional programming projects, with which we hope the reader will obtain a clearer view of the process of development and tuning of programs using CLP. On the other hand, several parts of the seminar have been taken out: those related to the account of fields and applications in which C(L)P is useful, and the enumerations of C(L)P tools available. We feel that the slides are clear enough, and that for more information on available tools, the interested reader will find more up-to-date information by browsing the Web or asking the vendors directly. More details in this direction would actually boil down to summarizing a user manual, which is not the aim of this document.
Fundamentals of Java Programming
This book was born from the desire of having an introductory Java programming textbook whose
contents can be covered in one semester. The book was written with two types of audience in mind:
those who intend to major in computer science and those who want to get a glimpse of computer
programming. The book does not cover graphical user interfaces or the materials that are taught in a
data structure course. The book very quickly surveys the Java Collections Framework and generics
in the penultimate chapter. The book also covers the concepts of online and recursive algorithms in
the last chapter. The instructors who choose to use this textbook are free to skip these chapters if
there is not sufficient time. Except for the code examples that receive parameters from the command
line, the code examples can be compiled and run in a command-line environment as well as in IDEs.
To execute those code examples in an IDE, the user must supply the command-line arguments (args) before
execution. The code examples appearing in the book have very few comments, since the actions of
the code are explained in the prose. The code examples with extensive comments are available from the
publisher. There are PDF lecture slides accompanying the book. They are prepared using the Beamer
environment of LaTeX. The source code of the lecture slides may be available through the publisher.
Theoretical and Practical Aspects Related to the Avoidability of Patterns in Words
This thesis concerns repetitive structures in words. More precisely, it contributes to the study of the appearance and absence of such repetitions in words. In the first and major part of this thesis, we study avoidability of unary patterns with permutations. The second part of this thesis deals with modeling and solving several avoidability problems as constraint satisfaction problems, using the framework of MiniZinc. Solving avoidability problems like the one mentioned in the previous paragraph required the construction, via a computer program, of a very long word that does not contain any word that matches a given pattern. This gave us the idea of using SAT solvers. Representing the problems for SAT solvers seemed to be a standardised, and usually very optimised, approach to formulating and solving well-known avoidability problems such as avoidability of formulas with reversal and avoidability of patterns in the abelian sense. The final part is concerned with a variation on a classical avoidance problem from combinatorics on words. Considering the concatenation of i different factors of the word w, pexp_i(w) is the supremum of powers that can be constructed by concatenation of such factors, and RT_i(k) is then the infimum of pexp_i(w). Again, by checking infinite ternary words that satisfy some properties, we calculate the value RT_i(3) for even and odd values of i.
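The idea of constructing, by computer, a long word that avoids a given pattern can be illustrated on the simplest classical case, squares (factors of the form xx). The Python sketch below uses plain backtracking rather than the MiniZinc or SAT encodings described in the thesis, and it handles only the pattern xx, not patterns with permutations or reversal. All names are hypothetical.

```python
def ends_with_square(w: str) -> bool:
    """True if some suffix of w has the form xx with x non-empty."""
    n = len(w)
    for half in range(1, n // 2 + 1):
        if w[n - half:] == w[n - 2 * half:n - half]:
            return True
    return False


def square_free_word(length: int, alphabet: str = "abc"):
    """Backtracking search for a square-free word of the given length.

    Returns the lexicographically first such word, or None if none exists.
    Only newly created suffixes need checking, since prefixes are already
    square-free by construction.
    """
    def extend(w: str):
        if len(w) == length:
            return w
        for c in alphabet:
            candidate = w + c
            if not ends_with_square(candidate):
                result = extend(candidate)
                if result is not None:
                    return result
        return None

    return extend("")


if __name__ == "__main__":
    print(square_free_word(30))        # a ternary square-free word
    print(square_free_word(4, "ab"))   # no binary square-free word of length 4
```

Over a binary alphabet the search fails from length 4 onwards, which is why the ternary alphabet is the natural setting for this kind of construction.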