705 research outputs found
Computing Covers under Substring Consistent Equivalence Relations
Covers are a kind of quasiperiodicity in strings. A string C is a cover of
another string T if any position of T is inside some occurrence of C in
T. The shortest and longest cover arrays of T have the lengths of the
shortest and longest covers of each prefix of T, respectively. The literature
has proposed linear-time algorithms computing longest and shortest cover arrays
taking border arrays as input. An equivalence relation ≈ over strings
is called a substring consistent equivalence relation (SCER) iff X ≈ Y
implies (1) |X| = |Y| and (2) X[i:j] ≈ Y[i:j] for all 1 ≤ i ≤ j ≤ |X|. In this paper, we generalize the notion of covers for SCERs and prove
that existing algorithms to compute the shortest cover array and the longest
cover array of a string under the identity relation will work for any SCER
when given the accordingly generalized border arrays. Comment: 16 pages
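As a rough illustration of the border-array-based approach mentioned in the abstract, the sketch below computes a border array and then a shortest cover array under the identity relation only. It follows the classical online scheme usually attributed to Breslauer; it is not taken from the paper, and the function names (`border_array`, `shortest_cover_array`, `is_cover`) are assumptions made for this example. `is_cover` is a brute-force check of the definition, included only for comparison.

```python
def border_array(s):
    """b[i] = length of the longest proper border of the prefix s[:i]."""
    n = len(s)
    b = [0] * (n + 1)
    k = 0
    for i in range(2, n + 1):
        while k > 0 and s[i - 1] != s[k]:
            k = b[k]
        if s[i - 1] == s[k]:
            k += 1
        b[i] = k
    return b


def shortest_cover_array(s):
    """cover[i] = length of the shortest cover of the prefix s[:i]."""
    n = len(s)
    b = border_array(s)
    cover = [0] * (n + 1)
    # reach[c] = longest prefix length currently known to be covered by s[:c]
    reach = [0] * (n + 1)
    for i in range(1, n + 1):
        c = cover[b[i]]  # shortest cover of the longest border of s[:i]
        if b[i] > 0 and reach[c] >= i - c:
            cover[i] = c
            reach[c] = i
        else:
            cover[i] = i  # the prefix is only covered by itself
            reach[i] = i
    return cover


def is_cover(c, t):
    """Brute-force check of the definition: every position of t lies in an occurrence of c."""
    if not c or len(c) > len(t):
        return False
    covered_upto = 0
    for i in range(len(t) - len(c) + 1):
        if t.startswith(c, i):
            if i > covered_upto:
                return False  # gap between consecutive occurrences
            covered_upto = i + len(c)
    return covered_upto == len(t)


print(shortest_cover_array("abababa"))  # [0, 1, 2, 3, 2, 3, 2, 3]
```

According to the abstract, the same kind of procedure remains valid for any SCER once the border array is replaced by its generalized counterpart; that substitution is not shown here.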
Computing NP-Hard Repetitiveness Measures via MAX-SAT
Repetitiveness measures reveal profound characteristics of datasets, and give rise to compressed data structures and algorithms working in compressed space. Alas, the computation of some of these measures is NP-hard, and straightforward computation is infeasible for datasets of even small sizes. Three such measures are the smallest size of a string attractor, the smallest size of a bidirectional macro scheme, and the smallest size of a straight-line program. While a vast variety of implementations for heuristically computing approximations exist, exact computation of these measures has received little to no attention. In this paper, we present MAX-SAT formulations that provide the first non-trivial implementations for exact computation of smallest string attractors, smallest bidirectional macro schemes, and smallest straight-line programs. Computational experiments show that our implementations work for texts of length up to a few hundred for straight-line programs and bidirectional macro schemes, and even for texts of length over a million for string attractors.
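To make one of these measures concrete, the following sketch finds a smallest string attractor by plain exhaustive search. It is not the MAX-SAT formulation of the paper; it only spells out the usual definition (a set of text positions such that every substring has at least one occurrence crossing a chosen position) and shows why naive exact computation scales badly. The names `is_attractor` and `smallest_attractor` are assumptions for this example.

```python
from itertools import combinations


def is_attractor(text, positions):
    """True if every substring of text has an occurrence crossing some chosen position."""
    n = len(text)
    chosen = set(positions)
    for length in range(1, n + 1):
        # Substrings of this length with at least one occurrence that crosses a position.
        hit = set()
        for i in range(n - length + 1):
            if any(i <= p < i + length for p in chosen):
                hit.add(text[i:i + length])
        for i in range(n - length + 1):
            if text[i:i + length] not in hit:
                return False
    return True


def smallest_attractor(text):
    """Exhaustive search over position sets of increasing size (exponential time)."""
    n = len(text)
    for k in range(n + 1):
        for candidate in combinations(range(n), k):
            if is_attractor(text, candidate):
                return list(candidate)


print(smallest_attractor("abab"))
```

The search tries all position subsets by increasing size, so it is only usable on very short texts.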
A Formal Framework for Linguistic Annotation
'Linguistic annotation' covers any descriptive or analytic notations applied
to raw language data. The basic data may be in the form of time functions --
audio, video and/or physiological recordings -- or it may be textual. The added
notations may include transcriptions of all sorts (from phonetic features to
discourse structures), part-of-speech and sense tagging, syntactic analysis,
'named entity' identification, co-reference annotation, and so on. While there
are several ongoing efforts to provide formats and tools for such annotations
and to publish annotated linguistic databases, the lack of widely accepted
standards is becoming a critical problem. Proposed standards, to the extent
they exist, have focussed on file formats. This paper focuses instead on the
logical structure of linguistic annotations. We survey a wide variety of
existing annotation formats and demonstrate a common conceptual core, the
annotation graph. This provides a formal framework for constructing,
maintaining and searching linguistic annotations, while remaining consistent
with many alternative data structures and file formats. Comment: 49 pages
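The abstract does not spell out the annotation graph itself. As a minimal sketch, assuming the commonly described form (a directed graph whose nodes may carry time offsets into the signal and whose arcs carry a type and a label), it could be modelled as below. All class, field, and method names here are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    ident: str
    time: Optional[float] = None  # offset into the recording, if known


@dataclass(frozen=True)
class Arc:
    start: Node
    end: Node
    kind: str   # annotation layer, e.g. "word" or "speaker"
    label: str  # the annotation content


@dataclass
class AnnotationGraph:
    arcs: List[Arc] = field(default_factory=list)

    def add(self, start: Node, end: Node, kind: str, label: str) -> Arc:
        arc = Arc(start, end, kind, label)
        self.arcs.append(arc)
        return arc

    def of_kind(self, kind: str) -> List[Arc]:
        """All arcs belonging to one annotation layer."""
        return [a for a in self.arcs if a.kind == kind]

    def covering(self, t: float) -> List[Arc]:
        """Arcs whose two endpoints are both timed and whose span contains t."""
        return [
            a for a in self.arcs
            if a.start.time is not None
            and a.end.time is not None
            and a.start.time <= t < a.end.time
        ]


def demo_graph() -> AnnotationGraph:
    # n1 has no known time offset; only its ordering between n0 and n2 is recorded.
    n0, n1, n2 = Node("n0", 0.0), Node("n1"), Node("n2", 0.9)
    g = AnnotationGraph()
    g.add(n0, n1, "word", "hello")
    g.add(n1, n2, "word", "world")
    g.add(n0, n2, "speaker", "A")
    return g
```

Keeping several annotation layers as arcs over shared nodes is what lets one structure stand in for many file formats; the search methods above are only two simple examples of queries over it.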
Duel-and-sweep algorithms for string pattern matching problems under substring consistent equivalence relations
Tohoku University, 篠原歩 (Ayumi Shinohara), 課
Combinatorial generation via permutation languages. II. Lattice congruences
This paper deals with lattice congruences of the weak order on the symmetric
group, and initiates the investigation of the cover graphs of the corresponding
lattice quotients. These graphs also arise as the skeleta of the so-called
quotientopes, a family of polytopes recently introduced by Pilaud and Santos
[Bull. Lond. Math. Soc., 51:406-420, 2019], which generalize permutahedra,
associahedra, hypercubes and several other polytopes. We prove that all of
these graphs have a Hamilton path, which can be computed by a simple greedy
algorithm. This is an application of our framework for exhaustively generating
various classes of combinatorial objects by encoding them as permutations. We
also characterize which of these graphs are vertex-transitive or regular via
their arc diagrams, give corresponding precise and asymptotic counting results,
and we determine their minimum and maximum degrees. Moreover, we investigate
the relation between lattice congruences of the weak order and pattern-avoiding
permutations.
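The paper's greedy algorithm for general lattice quotients is not reproduced here. As a sketch of the simplest special case only, assuming the trivial congruence (where the cover graph is that of the permutahedron and edges are adjacent transpositions), the following greedy rule swaps the largest possible value with a neighbour so that an unvisited permutation results. The function name `greedy_adjacent_swaps` and the left-before-right tie order are assumptions for this example.

```python
def greedy_adjacent_swaps(n):
    """Greedy walk over permutations of 1..n using adjacent transpositions.

    At each step, take the largest value that can be swapped with a
    neighbour to reach a permutation not visited before.
    """
    perm = list(range(1, n + 1))
    seen = {tuple(perm)}
    order = [tuple(perm)]
    while True:
        for value in range(n, 0, -1):
            i = perm.index(value)
            moved = False
            for j in (i - 1, i + 1):  # try the left neighbour, then the right one
                if 0 <= j < n:
                    candidate = perm[:]
                    candidate[i], candidate[j] = candidate[j], candidate[i]
                    if tuple(candidate) not in seen:
                        perm = candidate
                        moved = True
                        break
            if moved:
                break
        else:
            # No value can move to a new permutation: the walk ends.
            return order
        seen.add(tuple(perm))
        order.append(tuple(perm))


for p in greedy_adjacent_swaps(3):
    print(p)
```

For n = 3 the walk visits 123, 132, 312, 321, 231, 213, which is a Hamilton path on the permutahedron of order 3. The visited set makes this sketch use exponential memory; it illustrates the greedy rule, not an efficient generator.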
New Applications of the Nearest-Neighbor Chain Algorithm
The nearest-neighbor chain algorithm was proposed in the eighties as a way to speed up certain hierarchical clustering algorithms. In the first part of the dissertation, we show that its application is not limited to clustering. We apply it to a variety of geometric and combinatorial problems. In each case, we show that the nearest-neighbor chain algorithm finds the same solution as a preexistent greedy algorithm, but often with an improved runtime. We obtain speedups over greedy algorithms for Euclidean TSP, Steiner TSP in planar graphs, straight skeletons, a geometric coverage problem, and three stable matching models. In the second part, we study the stable-matching Voronoi diagram, a type of plane partition which combines properties of stable matchings and Voronoi diagrams. We propose political redistricting as an application. We also show that it is impossible to compute this diagram in an algebraic model of computation, and give three algorithmic approaches to overcome this obstacle. One of them is based on the nearest-neighbor chain algorithm, linking the two parts together.
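For orientation, here is a minimal sketch of the nearest-neighbor chain idea in its original clustering setting, not in any of the new applications from the dissertation. It assumes complete linkage (a linkage for which the chain method is known to give the same merges as the greedy closest-pair rule) and a small dense distance matrix; the function name `nn_chain` and the cluster numbering are assumptions for this example.

```python
def nn_chain(dist):
    """Agglomerative clustering with complete linkage via nearest-neighbor chains.

    dist is a symmetric matrix (list of lists). Returns a list of merges
    (a, b, height); merged clusters get fresh ids n, n + 1, ...
    """
    n = len(dist)
    # d[a][b] = current linkage distance between active clusters a and b
    d = {i: {j: dist[i][j] for j in range(n) if j != i} for i in range(n)}
    active = set(range(n))
    chain = []
    merges = []
    next_id = n
    while len(active) > 1:
        if not chain:
            chain.append(min(active))
        # Follow nearest neighbors until the top two chain elements are mutual nearest neighbors.
        while True:
            top = chain[-1]
            prev = chain[-2] if len(chain) > 1 else None
            # On ties, prefer the previous chain element so the chain cannot cycle.
            best = min(d[top], key=lambda c: (d[top][c], c != prev))
            if best == prev:
                break
            chain.append(best)
        b = chain.pop()
        a = chain.pop()
        height = d[a][b]
        new = next_id
        next_id += 1
        d[new] = {}
        for c in active - {a, b}:
            # Complete linkage: distance to the merged cluster is the larger of the two.
            v = max(d[a][c], d[b][c])
            d[new][c] = v
            d[c][new] = v
            del d[c][a]
            del d[c][b]
        del d[a]
        del d[b]
        active -= {a, b}
        active.add(new)
        merges.append((a, b, height))
    return merges


points = [0, 10, 14, 16, 17]
matrix = [[abs(p - q) for q in points] for p in points]
print(nn_chain(matrix))
```

The rest of the chain is kept after each merge instead of restarting from scratch, which is where the speedup over repeatedly searching for the globally closest pair comes from.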