477 research outputs found

    RLE Edit Distance in Near Optimal Time

    Get PDF
    We show that the edit distance between two run-length encoded strings of compressed lengths m and n respectively, can be computed in O(mn log(mn)) time. This improves the previous record by a factor of O(n/log(mn)). The running time of our algorithm is within subpolynomial factors of being optimal, subject to the standard SETH-hardness assumption. This effectively closes a line of algorithmic research first started in 1993

    Approximating Dynamic Time Warping Distance Between Run-Length Encoded Strings

    Get PDF

    Avoiding Ambiguity and Assessing Uniqueness in Minisatellite Alignment

    Get PDF
    Several algorithms have been suggested for minisatellite alignment. Their time complexity is high -- close to O(n^3) -- due to the necessary reconstruction of duplication histories. We investigate the uniqueness of optimal alignments computed under the common single-copy duplication model. To this extent, it is necessary to avoid ambiguity in the algorithm employed. We re-code the ARLEM algorithm in the form of a grammar, and apply a disambiguation technique which uses a mapping to a canonical representation of minisatellite alignments. Having arrived at a non-ambiguous algorithm this way, we demonstrate that the underlying model -- independent of the algorithm -- gives rise to an exorbitant number of different, co-optimal alignments when applied to real-world data. We conclude that alignment-free methods should be considered for minisatellite comparison

    Almost Linear Time Computation of Maximal Repetitions in Run Length Encoded Strings

    Get PDF
    We consider the problem of computing all maximal repetitions contained in a string that is given in run-length encoding. Given a run-length encoding of a string, we show that the maximum number of maximal repetitions contained in the string is at most m+k-1, where m is the size of the run-length encoding, and k is the number of run-length factors whose exponent is at least 2. We also show an algorithm for computing all maximal repetitions in O(m alpha(m)) time and O(m) space, where alpha denotes the inverse Ackermann function

    Compressed bitmap indexes: beyond unions and intersections

    Get PDF
    Compressed bitmap indexes are used to speed up simple aggregate queries in databases. Indeed, set operations like intersections, unions and complements can be represented as logical operations (AND,OR,NOT) that are ideally suited for bitmaps. However, it is less obvious how to apply bitmaps to more advanced queries. For example, we might seek products in a store that meet some, but maybe not all, criteria. Such threshold queries generalize intersections and unions; they are often used in information-retrieval and data-mining applications. We introduce new algorithms that are sometimes three orders of magnitude faster than a naive approach. Our work shows that bitmap indexes are more broadly applicable than is commonly believed

    Discovering Interesting Behaviours in Complex Systems

    Get PDF
    • …
    corecore