Designing optimal- and fast-on-average pattern matching algorithms
Given a pattern w and a text t, the speed of a pattern matching algorithm
over t with regard to w is the ratio of the length of t to the number of
text accesses performed to search for w in t. We first propose a general
method for computing the limit of the expected speed of pattern matching
algorithms, with regard to w, over iid texts. Next, we show how to determine
the greatest speed that can be achieved among a large class of algorithms,
together with an algorithm achieving this speed. Since the complexity of this
determination makes it impossible to deal with patterns of length greater than
4, we propose a polynomial heuristic. Finally, our approaches are compared with
9 pre-existing pattern matching algorithms from both a theoretical and a
practical point of view, i.e. both in terms of limit expected speed on iid
texts and in terms of observed average speed on real data. In all cases, the
pre-existing algorithms are outperformed.
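The speed defined in this abstract (text length divided by the number of text accesses) can be measured empirically. The sketch below is illustrative only and is not taken from the paper: it counts text accesses for a naive left-to-right search and for a Horspool-style search, two standard algorithms chosen here as assumptions, and the function names are hypothetical.

```python
def naive_accesses(pattern, text):
    """Naive search; returns (match positions, number of text accesses)."""
    m, n = len(pattern), len(text)
    accesses, hits = 0, []
    for s in range(n - m + 1):
        j = 0
        while j < m:
            accesses += 1  # one read of text[s + j]
            if text[s + j] != pattern[j]:
                break
            j += 1
        if j == m:
            hits.append(s)
    return hits, accesses


def horspool_accesses(pattern, text):
    """Horspool-style search; returns (match positions, text accesses)."""
    m, n = len(pattern), len(text)
    # Shift for each character of pattern[:-1]; the rightmost occurrence wins.
    shift = {c: m - 1 - j for j, c in enumerate(pattern[:-1])}
    accesses, hits = 0, []
    s = 0
    while s + m <= n:
        j = m - 1
        while j >= 0:
            accesses += 1  # one read of text[s + j]
            if text[s + j] != pattern[j]:
                break
            j -= 1
        if j < 0:
            hits.append(s)
        # text[s + m - 1] was already read above, so it is not counted again.
        s += shift.get(text[s + m - 1], m)
    return hits, accesses


def speed(text, accesses):
    """Speed in the abstract's sense: text length per text access."""
    return len(text) / accesses


if __name__ == "__main__":
    text = "xxxxxxxxx"
    for search in (naive_accesses, horspool_accesses):
        hits, accesses = search("abc", text)
        print(search.__name__, hits, accesses, speed(text, accesses))
```

A speed above 1 means that the algorithm skipped some text positions entirely, which the naive search never does.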
Pattern Matching Algorithms
The aim of this bachelor thesis is to implement a library for approximate pattern matching in text. The library will let the user search for a specified pattern with up to a specified number of errors, using deterministic and nondeterministic finite automata. The user will be able to choose between Hamming distance and Levenshtein distance. The first part covers the theory of using finite automata for approximate pattern matching. The second part describes the implementation of the libraries. The third part presents experiments with the implemented libraries. The conclusion summarizes the advantages and disadvantages of this approach to text searching. Imported 23/08/2017; department: 460 - Katedra informatiky (Department of Computer Science).
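To make the automaton-based approach concrete, here is a minimal sketch of simulating a nondeterministic finite automaton for matching under Hamming distance. It is an assumption-laden illustration, not the thesis's library: the function name and the state encoding (pattern characters consumed, mismatches so far) are hypothetical, and Levenshtein distance would need extra transitions for insertions and deletions that are omitted here.

```python
def hamming_nfa_search(pattern, text, k):
    """Report (start, mismatches) for every window of text that matches
    pattern with at most k mismatches, by simulating an NFA whose states
    are pairs (pattern characters consumed, mismatches so far)."""
    m = len(pattern)
    active = {(0, 0)}  # the initial state stays active at every position
    hits = []
    for i, c in enumerate(text):
        nxt = {(0, 0)}
        for j, e in active:
            if j == m:
                continue  # final states have no outgoing transitions
            if pattern[j] == c:
                nxt.add((j + 1, e))      # matching transition
            elif e < k:
                nxt.add((j + 1, e + 1))  # mismatch transition, one more error
        finals = [e for j, e in nxt if j == m]
        if finals:
            hits.append((i - m + 1, min(finals)))
        active = nxt
    return hits


if __name__ == "__main__":
    print(hamming_nfa_search("abc", "xabdxabc", 1))
```

Tracking the set of active states directly avoids building the equivalent deterministic automaton, at the cost of updating up to (m + 1)(k + 1) states per text character.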
Very low bit-rate video coding focusing on moving regions using three-tier arbitrary-shaped pattern selection algorithm
Very low bit-rate video coding using patterns to represent moving regions in macroblocks exhibits good potential for improved coding efficiency. Recently, an Arbitrary Shaped Pattern Selection (ASPS) algorithm and its extended version (EASPS) were presented; these use a dynamically extracted set of patterns of two different sizes, based on actual video content. Like other pattern matching algorithms, they fail to capture a large number of active-region macroblocks (RMB), especially when the moving region of an object in a video sequence is relatively large. As the size of the moving object may vary, superior coding performance is achievable by using dynamically extracted patterns of a larger size. This paper proposes a three-tier Arbitrary Shaped Pattern Selection (ASPS-3) algorithm that uses three different pattern sizes for very low bit-rate coding. Experimental results show that ASPS-3 exhibits better performance than other pattern matching algorithms, including the low bit-rate video coding standard H.263.
Analysis of two-dimensional approximate pattern matching algorithms
We present a new and more rigorous analysis of the two algorithms for two-dimensional approximate pattern matching due to Kärkkäinen and Ukkonen. We also present modifications of these algorithms that use less space while keeping the same expected time.
A taxonomy of sublinear multiple keyword pattern matching algorithms
This article presents a taxonomy of sublinear keyword pattern matching algorithms related to the Boyer-Moore algorithm [3] and the Commentz-Walter algorithm [5, 6]. The taxonomy includes, amongst others, the multiple keyword generalization of the single keyword Boyer-Moore algorithm and an algorithm by Fan and Su [9, 10]. The corresponding precomputation algorithms are presented as well. The taxonomy is based on the idea of ordering algorithms according to their essential problem and algorithm details, and deriving all algorithms from a common starting point by successively adding these details in a correctness-preserving way. This way of presentation not only provides a complete correctness argument for each algorithm, but also makes very clear what algorithms have in common (the details of their nearest common ancestor) and where they differ (the details added after their nearest common ancestor). Introducing the notion of safe shift distances proves to be essential in the derivation and classification of the algorithms. Moreover, the article provides a common derivation for and a uniform presentation of the precomputation algorithms, not yet found in the literature.
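The idea of a safe shift distance can be illustrated with a simplified Horspool-style search over a set of keywords. This sketch is an assumption made for illustration and is not one of the algorithms derived in the article: the function name is hypothetical, keywords are assumed non-empty, and the shift depends only on the last character of a window whose length equals that of the shortest keyword.

```python
def multi_keyword_horspool(keywords, text):
    """Return (start, keyword) for every occurrence of any keyword in text.

    The window has the length of the shortest keyword. After checking a
    window, it moves by a shift that cannot skip over an occurrence.
    """
    lmin = min(len(kw) for kw in keywords)
    # For each character, the smallest distance from one of its positions
    # within the first lmin - 1 characters of any keyword to index lmin - 1.
    shift = {}
    for kw in keywords:
        for j in range(lmin - 1):
            shift[kw[j]] = min(shift.get(kw[j], lmin), lmin - 1 - j)
    hits = []
    s = 0
    while s + lmin <= len(text):
        for kw in keywords:
            if text.startswith(kw, s):
                hits.append((s, kw))
        # Characters absent from the table allow the maximal shift lmin.
        s += shift.get(text[s + lmin - 1], lmin)
    return hits


if __name__ == "__main__":
    print(multi_keyword_horspool(["he", "she", "his", "hers"], "ushers"))
```

The shift is safe because a keyword starting d positions to the right of the window (with d less than the window length) must contain the window's last character at index lmin - 1 - d, so the table entry for that character is at most d.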