Combinatorics of unique maximal factorization families (UMFFs)
Suppose a set W of strings contains exactly one rotation (cyclic shift) of every primitive string on some alphabet Σ. Then W is a circ-UMFF if and only if every word in Σ⁺ has a unique maximal factorization over W. The classic circ-UMFF is the set of Lyndon words based on lexicographic ordering (1958). Duval (1983) designed a linear sequential Lyndon factorization algorithm; a corresponding PRAM parallel algorithm was described by J. Daykin, Iliopoulos and Smyth (1994). Daykin and Daykin defined new circ-UMFFs based on various methods for totally ordering sets of strings (2003), and further described the structure of all circ-UMFFs (2008). Here we prove new combinatorial results for circ-UMFFs, and in particular for the case of Lyndon words. We introduce Acrobat and Flight Deck circ-UMFFs, and describe some of our results in terms of dictionaries. Applications of circ-UMFFs pertain to structured methods for concatenating and factoring strings over ordered alphabets, and those of Lyndon words are wide ranging and multidisciplinary.
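As a hedged illustration of the Lyndon factorization mentioned in the abstract above, the following is a minimal Python sketch of the standard textbook formulation of Duval's linear-time algorithm under ordinary lexicographic order. The function name `lyndon_factorization` is an illustrative choice and is not taken from the cited papers.

```python
def lyndon_factorization(s):
    """Split s into a non-increasing sequence of Lyndon words.

    Textbook formulation of Duval's algorithm; uses the default
    ordering of Python characters as the alphabet order.
    """
    factors = []
    n = len(s)
    i = 0  # start of the part of s not yet factored
    while i < n:
        j, k = i + 1, i  # j scans ahead, k trails inside the current period
        while j < n and s[k] <= s[j]:
            if s[k] < s[j]:
                k = i      # the scanned prefix is still a single Lyndon word
            else:
                k += 1     # the scanned prefix keeps repeating its period
            j += 1
        # The scanned prefix is a power of a Lyndon word of length j - k,
        # possibly followed by a proper prefix of it; emit the full copies.
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


print(lyndon_factorization("banana"))  # ['b', 'an', 'an', 'a']
```

Concatenating the returned factors reproduces the input, and each factor is lexicographically no smaller than the one after it.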
String Comparison in V-Order: New Lexicographic Properties & On-line Applications
V-order is a global order on strings related to Unique Maximal
Factorization Families (UMFFs), which are themselves generalizations of Lyndon
words. V-order has recently been proposed as an alternative to
lexicographical order in the computation of suffix arrays and in the
suffix-sorting induced by the Burrows-Wheeler transform. Efficient V-ordering
of strings thus becomes a matter of considerable interest. In this paper we
present new and surprising results on V-order in strings, then go on to
explore the algorithmic consequences.
Enhanced string factoring from alphabet orderings
In this note we consider the concept of alphabet ordering in the context of
string factoring. We propose a greedy-type algorithm which produces Lyndon
factorizations with small numbers of factors along with a modification for
large numbers of factors. For the technique we introduce the Exponent Parikh
vector. Applications and research directions derived from circ-UMFFs are
discussed.
Comment: 9 pages
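To make the role of the alphabet ordering concrete, here is a small Python sketch showing that the same string can have different numbers of Lyndon factors under different orderings. It is not the greedy algorithm or the Exponent Parikh vector proposed in the note; it only runs a textbook Duval-style factorization after re-ranking the letters. The name `lyndon_factors_under` and its `alphabet_order` parameter are hypothetical.

```python
def lyndon_factors_under(s, alphabet_order):
    """Lyndon factorization of s where alphabet_order lists the letters
    from smallest to largest (hypothetical helper for illustration)."""
    rank = {c: r for r, c in enumerate(alphabet_order)}
    t = [rank[c] for c in s]  # compare letters by their assumed rank
    factors, n, i = [], len(t), 0
    while i < n:
        j, k = i + 1, i
        while j < n and t[k] <= t[j]:
            k = i if t[k] < t[j] else k + 1
            j += 1
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


print(lyndon_factors_under("banana", "abn"))  # ['b', 'an', 'an', 'a']
print(lyndon_factors_under("banana", "bna"))  # ['banana']
```

Under the usual order a < b < n the string splits into four factors, whereas making b the smallest letter turns the whole string into a single Lyndon word.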
The Euler anomaly and scale factors in Liouville/Toda CFTs
The role played by the Euler anomaly in the dictionary relating sphere
partition functions of four dimensional theories of class S and two
dimensional nonrational CFTs is clarified. On the two dimensional side, this
involves a careful treatment of scale factors in Liouville/Toda correlators.
Using ideas from tinkertoy constructions for Gaiotto duality, a framework is
proposed for evaluating these scale factors. The representation theory of Weyl
groups plays a critical role in this framework.
Comment: 55 pages, 16 figures; v2: fixed referencing & typos; v3: argument
about scale factors in Liouville/Toda now phrased in terms of stripped
correlators, leading to a sharper conjecture (earlier version had some
inaccurate statements). Presentation improved, typos fixed, refs added. I
thank the anonymous referee for comments. Version accepted for publication in
JHE
Operator Precedence Languages: Their Automata-Theoretic and Logic Characterization
Operator precedence languages were introduced half a century ago by Robert Floyd to support deterministic and efficient parsing of context-free languages. Recently, we renewed our interest in this class of languages thanks to a few distinguishing properties that make them attractive for exploiting various modern technologies. Precisely, their local parsability enables parallel and incremental parsing, whereas their closure properties make them amenable to automatic verification techniques, including model checking. In this paper we provide a fairly complete theory of this class of languages: we introduce a class of automata with the same recognizing power as the generative power of their grammars; we provide a characterization of their sentences in terms of monadic second-order logic as has been done in previous literature for more restricted language classes such as regular, parenthesis, and input-driven ones; we investigate preserved and lost properties when extending the language sentences from finite length to infinite length (ω-languages). As a result, we obtain a class of languages that enjoys many of the nice properties of regular languages (closure and decidability properties, logic characterization) but is considerably larger than other families (typically parenthesis and input-driven ones) with the same properties, covering “almost” all deterministic languages.