29,028 research outputs found
siEDM: an efficient string index and search algorithm for edit distance with moves
Although several self-indexes for highly repetitive text collections exist,
developing an index and search algorithm with editing operations remains a
challenge. Edit distance with moves (EDM) is a string-to-string distance
measure that includes substring moves in addition to ordinal editing operations
to turn one string into another. Although the problem of computing EDM is
intractable, it has a wide range of potential applications, especially in
approximate string retrieval. Despite the importance of computing EDM, there
has been no efficient method for indexing and searching large text collections
based on the EDM measure. We propose the first algorithm, named string index
for edit distance with moves (siEDM), for indexing and searching strings with
EDM. The siEDM algorithm builds an index structure by leveraging the idea
behind the edit sensitive parsing (ESP), an efficient algorithm enabling
approximately computing EDM with guarantees of upper and lower bounds for the
exact EDM. siEDM efficiently prunes the space for searching query strings by
the proposed method, which enables fast query searches with the same guarantee
as ESP. We experimentally tested the ability of siEDM to index and search
strings on benchmark datasets, and we showed siEDM's efficiency.Comment: 23 page
Efficient enumeration of solutions produced by closure operations
In this paper we address the problem of generating all elements obtained by
the saturation of an initial set by some operations. More precisely, we prove
that we can generate the closure of a boolean relation (a set of boolean
vectors) by polymorphisms with a polynomial delay. Therefore we can compute
with polynomial delay the closure of a family of sets by any set of "set
operations": union, intersection, symmetric difference, subsets, supersets
). To do so, we study the problem: for a set
of operations , decide whether an element belongs to the closure
by of a family of elements. In the boolean case, we prove that
is in P for any set of boolean operations
. When the input vectors are over a domain larger than two
elements, we prove that the generic enumeration method fails, since
is NP-hard for some . We also study the
problem of generating minimal or maximal elements of closures and prove that
some of them are related to well known enumeration problems such as the
enumeration of the circuits of a matroid or the enumeration of maximal
independent sets of a hypergraph. This article improves on previous works of
the same authors.Comment: 30 pages, 1 figure. Long version of the article arXiv:1509.05623 of
the same name which appeared in STACS 2016. Final version for DMTCS journa
Three Puzzles on Mathematics, Computation, and Games
In this lecture I will talk about three mathematical puzzles involving
mathematics and computation that have preoccupied me over the years. The first
puzzle is to understand the amazing success of the simplex algorithm for linear
programming. The second puzzle is about errors made when votes are counted
during elections. The third puzzle is: are quantum computers possible?Comment: ICM 2018 plenary lecture, Rio de Janeiro, 36 pages, 7 Figure
- …