15 research outputs found

    Duel and sweep algorithm for order-preserving pattern matching

    Full text link
    Given a text TT and a pattern PP over alphabet Σ\Sigma, the classic exact matching problem searches for all occurrences of pattern PP in text TT. Unlike exact matching problem, order-preserving pattern matching (OPPM) considers the relative order of elements, rather than their real values. In this paper, we propose an efficient algorithm for OPPM problem using the "duel-and-sweep" paradigm. Our algorithm runs in O(n+mlogm)O(n + m\log m) time in general and O(n+m)O(n + m) time under an assumption that the characters in a string can be sorted in linear time with respect to the string size. We also perform experiments and show that our algorithm is faster that KMP-based algorithm. Last, we introduce the two-dimensional order preserved pattern matching and give a duel and sweep algorithm that runs in O(n2)O(n^2) time for duel stage and O(n2m)O(n^2 m) time for sweeping time with O(m3)O(m^3) preprocessing time.Comment: 13 pages, 5 figure

    A Compact Index for Order-Preserving Pattern Matching

    Full text link
    Order-preserving pattern matching was introduced recently but it has already attracted much attention. Given a reference sequence and a pattern, we want to locate all substrings of the reference sequence whose elements have the same relative order as the pattern elements. For this problem we consider the offline version in which we build an index for the reference sequence so that subsequent searches can be completed very efficiently. We propose a space-efficient index that works well in practice despite its lack of good worst-case time bounds. Our solution is based on the new approach of decomposing the indexed sequence into an order component, containing ordering information, and a delta component, containing information on the absolute values. Experiments show that this approach is viable, faster than the available alternatives, and it is the first one offering simultaneously small space usage and fast retrieval.Comment: 16 pages. A preliminary version appeared in the Proc. IEEE Data Compression Conference, DCC 2017, Snowbird, UT, USA, 201

    Boxed Permutation Pattern Matching

    Get PDF
    Given permutations T and P of length n and m, respectively, the Permutation Pattern Matching problem asks to find all m-length subsequences of T that are order-isomorphic to P. This problem has a wide range of applications but is known to be NP-hard. In this paper, we study the special case, where the goal is to only find the boxed subsequences of T that are order-isomorphic to P. This problem was introduced by Bruner and Lackner who showed that it can be solved in O(n^3) time. Cho et al. [CPM 2015] gave an O(n^2m) time algorithm and improved it to O(n^2 log m). In this paper we present a solution that uses only O(n^2) time. In general, there are instances where the output size is Omega(n^2) and hence our bound is optimal. To achieve our results, we introduce several new ideas including a novel reduction to 2D offline dominance counting. Our algorithm is surprisingly simple and straightforward to implement

    A compact index for order-preserving pattern matching

    Get PDF
    Order-preserving pattern matching has been introduced recently, but it has already attracted much attention. Given a reference sequence and a pattern, we want to locate all substrings of the reference sequence whose elements have the same relative order as the pattern elements. For this problem, we consider the offline version in which we build an index for the reference sequence so that subsequent searches can be completed very efficiently. We propose a space-efficient index that works well in practice despite its lack of good worst-case time bounds. Our solution is based on the new approach of decomposing the indexed sequence into an order component, containing ordering information, and a \u3b4 component, containing information on the absolute values. Experiments show that this approach is viable, is faster than the available alternatives, and is the first one offering simultaneously small space usage and fast retrieval

    An Encoding for Order-Preserving Matching

    Get PDF
    Encoding data structures store enough information to answer the queries they are meant to support but not enough to recover their underlying datasets. In this paper we give the first encoding data structure for the challenging problem of order-preserving pattern matching. This problem was introduced only a few years ago but has already attracted significant attention because of its applications in data analysis. Two strings are said to be an order-preserving match if the relative order of their characters is the same: e.g., (4, 1, 3, 2) and (10, 3, 7, 5) are an order-preserving match. We show how, given a string S[1..n] over an arbitrary alphabet of size sigma and a constant c >=1, we can build an O(n log log n)-bit encoding such that later, given a pattern P[1..m] with m >= log^c n, we can return the number of order-preserving occurrences of P in S in O(m) time. Within the same time bound we can also return the starting position of some order-preserving match for P in S (if such a match exists). We prove that our space bound is within a constant factor of optimal if log(sigma) = Omega(log log n); our query time is optimal if log(sigma) = Omega(log n). Our space bound contrasts with the Omega(n log n) bits needed in the worst case to store S itself, an index for order-preserving pattern matching with no restrictions on the pattern length, or an index for standard pattern matching even with restrictions on the pattern length. Moreover, we can build our encoding knowing only how each character compares to O(log^c n) neighbouring characters

    New Algorithms for δγ-Order Preserving Matching

    Get PDF
    Context: Order-preserving matching regards the relative order of strings. However, its application areas require more flexibility in the matching paradigm. We advance in this direction in this paper that extends our previous work [27]. Method: We define γ -order preserving matching as an approximate variant of order-preserving matching. We devise two solutions for it based on segment and Fenwick trees: segtreeBA and bitBA. Results: We experimentally show the efficiency of our algorithms compared to the ones presented in [26] (naiveA and updateBA). Also, we present applications of our approach in music retrieval and stock market analysis. Conclusions: Even though the worst-case time complexity of the proposed algorithms (namely, O(nm log m)) is higher than the Ѳ(nm)-time complexity of updateBA, their Ω (n log n) lower bound makes them more efficient in practice. On the other hand, we show that our approach is useful to identify similarity in music melodies and stock price trends through real application examples
    corecore