29,028 research outputs found

    siEDM: an efficient string index and search algorithm for edit distance with moves

    Full text link
    Although several self-indexes for highly repetitive text collections exist, developing an index and search algorithm with editing operations remains a challenge. Edit distance with moves (EDM) is a string-to-string distance measure that includes substring moves in addition to ordinal editing operations to turn one string into another. Although the problem of computing EDM is intractable, it has a wide range of potential applications, especially in approximate string retrieval. Despite the importance of computing EDM, there has been no efficient method for indexing and searching large text collections based on the EDM measure. We propose the first algorithm, named string index for edit distance with moves (siEDM), for indexing and searching strings with EDM. The siEDM algorithm builds an index structure by leveraging the idea behind the edit sensitive parsing (ESP), an efficient algorithm enabling approximately computing EDM with guarantees of upper and lower bounds for the exact EDM. siEDM efficiently prunes the space for searching query strings by the proposed method, which enables fast query searches with the same guarantee as ESP. We experimentally tested the ability of siEDM to index and search strings on benchmark datasets, and we showed siEDM's efficiency.Comment: 23 page

    Efficient enumeration of solutions produced by closure operations

    Full text link
    In this paper we address the problem of generating all elements obtained by the saturation of an initial set by some operations. More precisely, we prove that we can generate the closure of a boolean relation (a set of boolean vectors) by polymorphisms with a polynomial delay. Therefore we can compute with polynomial delay the closure of a family of sets by any set of "set operations": union, intersection, symmetric difference, subsets, supersets …\dots). To do so, we study the MembershipFMembership_{\mathcal{F}} problem: for a set of operations F\mathcal{F}, decide whether an element belongs to the closure by F\mathcal{F} of a family of elements. In the boolean case, we prove that MembershipFMembership_{\mathcal{F}} is in P for any set of boolean operations F\mathcal{F}. When the input vectors are over a domain larger than two elements, we prove that the generic enumeration method fails, since MembershipFMembership_{\mathcal{F}} is NP-hard for some F\mathcal{F}. We also study the problem of generating minimal or maximal elements of closures and prove that some of them are related to well known enumeration problems such as the enumeration of the circuits of a matroid or the enumeration of maximal independent sets of a hypergraph. This article improves on previous works of the same authors.Comment: 30 pages, 1 figure. Long version of the article arXiv:1509.05623 of the same name which appeared in STACS 2016. Final version for DMTCS journa

    Three Puzzles on Mathematics, Computation, and Games

    Full text link
    In this lecture I will talk about three mathematical puzzles involving mathematics and computation that have preoccupied me over the years. The first puzzle is to understand the amazing success of the simplex algorithm for linear programming. The second puzzle is about errors made when votes are counted during elections. The third puzzle is: are quantum computers possible?Comment: ICM 2018 plenary lecture, Rio de Janeiro, 36 pages, 7 Figure
    • …
    corecore