
    On the Dynamic Time Warping of Cyclic Sequences for Shape Retrieval

    In recent years, shape retrieval methods based on Dynamic Time Warping, applied to sequences in which each contour point is represented by a multidimensional element, have had a significant presence. In this approach each point of the closed contour carries information relative to all the other points; this global information is highly discriminative. The current state of the art in shape retrieval analyzes these distances to learn better ones. These methods are robust to noise and invariant to transformations, but they obtain invariance to the starting point through a brute-force cyclic alignment, which is computationally expensive. In this work, we present Cyclic Dynamic Time Warping, which obtains the cyclic alignment in $O(n^2 \log n)$ time, where n is the size of both sequences. Experimental results show that our proposal is a better alternative than the brute-force cyclic alignment and other heuristics for obtaining this invariance.
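
To make the cost of the baseline concrete, the following is a minimal sketch of a brute-force cyclic alignment, not the Cyclic Dynamic Time Warping algorithm proposed in the paper. It assumes a simplified definition in which only the second sequence is rotated; the function names and the default point distance are illustrative assumptions. For two sequences of size n it runs one $O(n^2)$ DTW per shift, so $O(n^3)$ overall.

```python
from math import inf


def dtw(a, b, dist):
    """Classic DTW cost between sequences a and b in O(len(a) * len(b)) time."""
    m = len(b)
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for x in a:
        cur = [inf] * (m + 1)
        for j in range(1, m + 1):
            # Extend the cheapest of the three admissible predecessor cells.
            cur[j] = dist(x, b[j - 1]) + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[m]


def cyclic_dtw_brute_force(a, b, dist=lambda x, y: abs(x - y)):
    """Minimum DTW cost over every cyclic shift of b (assumed simplified baseline)."""
    return min(dtw(a, b[s:] + b[:s], dist) for s in range(len(b)))
```

For multidimensional contour descriptors, `dist` would be replaced by a suitable distance between points, for example a Euclidean distance between tuples.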

    EM Training of Hidden Markov Models for Shape Recognition Using Cyclic Strings

    Shape descriptions and the corresponding matching techniques must be robust to noise and invariant to transformations for their use in recognition tasks. Most transformations are relatively easy to handle when contours are represented by strings. However, starting point invariance is difficult to achieve. One interesting possibility is the use of cyclic strings, which are strings with no start or end point. Here we present the use of Hidden Markov Models for modelling cyclic strings and their training using Expectation Maximization. Experimental results show that our proposal outperforms other methods in the literature.

    A Heuristic Based on the Intrinsic Dimensionality for Reducing the Number of Cyclic DTW Comparisons in Shape Classification and Retrieval Using AESA

    Cyclic Dynamic Time Warping (CDTW) is a good dissimilarity measure for high-dimensional, contour-based shape descriptors, but it is computationally expensive. For this reason, recognition tasks benefit from a method that reduces the number of comparisons and avoids an exhaustive search. The Approximate and Eliminate Search Algorithm (AESA) is a relevant indexing method because it drastically reduces the number of comparisons; however, it requires a metric distance, which CDTW is not. In this paper, we introduce a heuristic based on the intrinsic dimensionality that allows CDTW and AESA to be used together in classification and retrieval tasks over these shape descriptors. Experimental results show that, for descriptors of high dimensionality, our proposal is optimal in practice and significantly outperforms an exhaustive search, which is the only alternative for them and CDTW in these tasks.
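
The reason AESA needs a metric can be seen in a small sketch of its approximate-and-eliminate loop. This is a generic textbook-style version for a true metric, not the intrinsic-dimensionality heuristic of the paper; `aesa_nearest`, its parameters, and the precomputed `pairwise` matrix are hypothetical names. The elimination step relies on the triangle inequality, which is exactly what a non-metric dissimilarity such as CDTW does not guarantee.

```python
def aesa_nearest(query, database, dist, pairwise):
    """Nearest neighbour of `query` using triangle-inequality pruning.

    pairwise[i][j] must hold dist(database[i], database[j]), computed offline.
    Returns (index, distance, number of distance computations performed).
    """
    lower = [0.0] * len(database)  # lower bounds on dist(query, database[p])
    alive = set(range(len(database)))
    best_idx, best_d, computed = None, float("inf"), 0
    while alive:
        # Approximate: pick the live candidate with the smallest lower bound.
        s = min(alive, key=lambda p: (lower[p], p))
        alive.remove(s)
        d = dist(query, database[s])
        computed += 1
        if d < best_d:
            best_idx, best_d = s, d
        # Eliminate: |d(q, s) - d(s, p)| <= d(q, p) holds only for a metric.
        for p in list(alive):
            lower[p] = max(lower[p], abs(d - pairwise[s][p]))
            if lower[p] >= best_d:
                alive.remove(p)
    return best_idx, best_d, computed
```

With one-dimensional points and the absolute difference as the distance, a query close to one database element is typically resolved with far fewer distance computations than an exhaustive scan.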

    On hidden Markov models and cyclic strings for shape recognition

    Shape descriptions and the corresponding matching techniques must be robust to noise and invariant to transformations for their use in recognition tasks. Most transformations are relatively easy to handle when contours are represented by strings. However, starting point invariance is difficult to achieve. One interesting possibility is the use of cyclic strings, which are strings that have no start or end point. We propose new methodologies to use Hidden Markov Models to classify contours represented by cyclic strings. Experimental results show that our proposals outperform other methods in the literature.

    Speeding up the cyclic edit distance using LAESA with early abandon

    The cyclic edit distance between two strings is the minimum edit distance between one of these strings and every possible cyclic shift of the other. This can be useful, for example, in image analysis, where strings describe the contours of shapes, or in computational biology for classifying circularly permuted proteins or circular DNA/RNA molecules. The cyclic edit distance can be computed in $O(mn \log m)$ time; however, in real recognition tasks this is a high computational cost because of the size of the databases, so a method that reduces the number of comparisons and avoids an exhaustive search is desirable. In this work, we present a new algorithm based on a modification of LAESA (Linear Approximating and Eliminating Search Algorithm) that applies pruning during the computation of distances. It is an efficient procedure for classification and retrieval of cyclic strings. Experimental results show that our proposal considerably outperforms LAESA. Work partially supported by the Spanish Government (TIN2010-18958) and the Generalitat Valenciana (PROMETEOII/2014/062).
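
The definition above translates directly into a naive procedure, sketched here under stated assumptions. It tries every cyclic shift, so it costs $O(m \cdot mn)$ rather than the $O(mn \log m)$ bound quoted in the abstract, and its early-abandon test is a generic row-minimum cut-off, not the LAESA modification the paper proposes. All function names are illustrative.

```python
def edit_distance(a, b, bound=None):
    """Levenshtein distance between a and b.

    If `bound` is given, returns None as soon as the result must exceed it:
    every alignment passes through each row, so the final distance is at
    least the minimum of any row of the dynamic-programming table.
    """
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        if bound is not None and min(cur) > bound:
            return None  # early abandon
        prev = cur
    return prev[-1]


def cyclic_edit_distance(a, b):
    """Minimum edit distance between a and every cyclic shift of b."""
    best = None
    for s in range(max(1, len(b))):
        d = edit_distance(a, b[s:] + b[:s], bound=best)
        if d is not None and (best is None or d < best):
            best = d
    return best
```

For example, `"abcd"` and `"cdab"` are at edit distance 4 as plain strings but at cyclic edit distance 0, since one is a rotation of the other.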

    Approximate Circular Pattern Matching

    We investigate the complexity of approximate circular pattern matching (CPM, in short) under the Hamming and edit distance. Under each of these two basic metrics, we are given a length-n text T, a length-m pattern P, and a positive integer threshold k, and we are to report all starting positions (called occurrences) of fragments of T that are at distance at most k from some cyclic rotation of P. In the decision version of the problem, we are to check if there is any such occurrence. All previous results for approximate CPM were either average-case upper bounds or heuristics, with the exception of the work of Charalampopoulos et al. [CKP+, JCSS'21], who considered only the Hamming distance. For the reporting version of the approximate CPM problem, under the Hamming distance we improve upon the main algorithm of [CKP+, JCSS'21] from $O(n + (n/m)\,k^4)$ to $O(n + (n/m)\,k^3 \log\log k)$ time; for the edit distance, we give an $O(nk^2)$-time algorithm. Notably, for the decision versions and wide parameter ranges, we give algorithms whose complexities are almost identical to the state of the art for standard (i.e., non-circular) approximate pattern matching. For the decision version of the approximate CPM problem under the Hamming distance, we obtain an $O(n + (n/m)\,k^2 \log k / \log\log k)$-time algorithm, which works in $O(n)$ time whenever $k = O(\sqrt{m \log\log m / \log m})$. In comparison, the fastest algorithm for the standard counterpart of the problem, by Chan et al. [CGKKP, STOC'20], runs in $O(n)$ time only for $k = O(\sqrt{m})$. We achieve this result via a reduction to a geometric problem by building on ideas from [CKP+, JCSS'21] and Charalampopoulos et al. [CKW, FOCS'20]. For the decision version of the approximate CPM problem under the edit distance, the $O(nk \log^3 k)$ runtime of our algorithm nearly matches the $O(nk)$ runtime of the Landau-Vishkin algorithm [LV, J. Algorithms'89] for approximate pattern matching under edit distance; the latter algorithm remains the fastest known for $k = \Omega(m^{2/5})$.
    As a stepping stone, we propose an $O(nk \log^3 k)$-time algorithm for solving the Longest Prefix $k'$-Approximate Match problem, proposed by Landau et al. [LMS, SICOMP'98], for all $k' \in \{1, \dots, k\}$. Our algorithm is based on Tiskin's theory of seaweeds [Tiskin, Math. Comput. Sci.'08], with recent advancements (see Charalampopoulos et al. [CKW, FOCS'22]), and on exploiting the seaweeds' relation to Monge matrices. In contrast, we obtain a conditional lower bound that suggests a polynomial separation between approximate CPM under the Hamming distance over the binary alphabet and its non-circular counterpart. We also show that a strongly subquadratic-time algorithm for the decision version of approximate CPM under edit distance would refute the Strong Exponential Time Hypothesis.
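
The problem statement for the Hamming-distance case can be illustrated with a naive reporter. This is only a reference sketch of the definition, running in roughly $O(n m^2)$ time, and is unrelated to the algorithms of the paper; the function names are assumptions made for illustration.

```python
def hamming(u, v):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(u, v))


def approx_cpm_hamming(text, pattern, k):
    """Report every start i such that text[i:i+m] is within Hamming
    distance k of some cyclic rotation of pattern (m = len(pattern))."""
    m = len(pattern)
    rotations = {pattern[r:] + pattern[:r] for r in range(m)}
    return [
        i
        for i in range(len(text) - m + 1)
        if any(hamming(text[i:i + m], rot) <= k for rot in rotations)
    ]


def approx_cpm_hamming_decision(text, pattern, k):
    """Decision version: is there any occurrence at all?"""
    return bool(approx_cpm_hamming(text, pattern, k))
```

For instance, with text `"abcabd"` and pattern `"bca"`, the windows starting at positions 0, 1 and 2 are exact rotations of the pattern, while the window `"abd"` at position 3 is one mismatch away from the rotation `"abc"`.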

    Linear-Time Computation of Cyclic Roots and Cyclic Covers of a String

    Cyclic versions of covers and roots of a string are considered in this paper. A prefix V of a string S is a cyclic root of S if S is a concatenation of cyclic rotations of V. A prefix V of S is a cyclic cover of S if the occurrences of the cyclic rotations of V cover all positions of S. We present $O(n)$-time algorithms computing all cyclic roots (using number-theoretic tools) and all cyclic covers (using tools related to seeds) of a length-n string over an integer alphabet. Our results improve upon $O(n \log\log n)$ and $O(n \log n)$ time complexities of recent algorithms of Grossi et al. (WALCOM 2023) for the respective problems and provide novel approaches to the problems. As a by-product, we obtain an optimal data structure for Internal Circular Pattern Matching queries that generalize Internal Pattern Matching and Cyclic Equivalence queries of Kociumaka et al. (SODA 2015).
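
The two definitions can be checked directly with a naive sketch, given here only to make them concrete. It is far from the linear-time algorithms of the paper (it is polynomial but slow), and the function names are assumptions. Each function returns the lengths of the prefixes V that qualify.

```python
def is_rotation(u, v):
    """True if u is a cyclic rotation of v."""
    return len(u) == len(v) and u in v + v


def cyclic_roots(s):
    """Lengths l such that s is a concatenation of rotations of s[:l]."""
    n = len(s)
    return [
        l
        for l in range(1, n + 1)
        if n % l == 0
        and all(is_rotation(s[i:i + l], s[:l]) for i in range(0, n, l))
    ]


def cyclic_covers(s):
    """Lengths l such that occurrences of rotations of s[:l] cover s."""
    n = len(s)
    covers = []
    for l in range(1, n + 1):
        covered = [False] * n
        for i in range(n - l + 1):
            if is_rotation(s[i:i + l], s[:l]):
                covered[i:i + l] = [True] * l  # mark the whole occurrence
        if all(covered):
            covers.append(l)
    return covers
```

For the string `"abba"`, the prefix `"ab"` is a cyclic root because the string splits into `"ab"` and its rotation `"ba"`, whereas `"abb"` is a cyclic cover but not a cyclic root, since 3 does not divide 4.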