
9 research outputs found

A Bloom filter based semi-index on $q$-grams

Author: Grabowski Szymon
Raniszewski Marcin
Susik Robert
Publication venue
Publication date: 10/07/2015
Field of study

We present a simple $q$-gram based semi-index, which typically allows searching for a pattern in only a small fraction of the text blocks. Several space-time tradeoffs are presented. Experiments on Pizza & Chili datasets show that our solution is up to three orders of magnitude faster than the Claude et al. \cite{CNPSTjda10} semi-index at a comparable space usage.
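The block-filtering idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' construction: the class names, block size, filter size, hash choice and the `max_pattern_len` bound are all assumptions made for illustration. Each text block gets a Bloom filter of its q-grams, and a block is scanned only if every q-gram of the pattern passes that block's filter.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter backed by a Python int (illustrative only)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, item):
        # One salted BLAKE2b digest per hash function.
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(
                item.encode("utf-8"), digest_size=8, salt=bytes([i])
            ).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def add(self, item):
        for p in self._positions(item):
            self.bits |= 1 << p

    def __contains__(self, item):
        return all((self.bits >> p) & 1 for p in self._positions(item))


def qgrams(s, q):
    return (s[i:i + q] for i in range(len(s) - q + 1))


class QGramSemiIndex:
    """Hypothetical semi-index: one Bloom filter of q-grams per text block."""

    def __init__(self, text, q=3, block_size=64, max_pattern_len=16,
                 bits_per_block=1024, num_hashes=3):
        self.text, self.q = text, q
        self.block_size, self.max_pattern_len = block_size, max_pattern_len
        self.filters = []
        for start in range(0, len(text), block_size):
            # Extend each block so occurrences crossing the boundary are covered.
            window = text[start:start + block_size + max_pattern_len - 1]
            bf = BloomFilter(bits_per_block, num_hashes)
            for g in qgrams(window, q):
                bf.add(g)
            self.filters.append(bf)

    def find(self, pattern):
        m = len(pattern)
        if not self.q <= m <= self.max_pattern_len:
            raise ValueError("pattern length outside the supported range")
        grams = list(qgrams(pattern, self.q))
        hits = []
        for b, bf in enumerate(self.filters):
            if all(g in bf for g in grams):  # block may contain the pattern
                start = b * self.block_size
                window = self.text[start:start + self.block_size + m - 1]
                pos = window.find(pattern)
                while pos != -1:  # verify by scanning the candidate block
                    hits.append(start + pos)
                    pos = window.find(pattern, pos + 1)
        return hits
```

Because a Bloom filter has no false negatives, a skipped block cannot contain an occurrence; false positives only cost an unnecessary scan.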

arXiv.org e-Print Archive

APPLICATION OF A MULTIPLE PATTERN MATCHING ALGORITHM BASED ON THE Q-GRAM TECHNIQUE TO APPROXIMATE PATTERN MATCHING

Author: Susik Robert
Publication venue: 'Index Copernicus'
Publication date: 01/01/2017
Field of study

We consider the application of a multiple pattern matching algorithm (Multi AOSO on q-Grams, MAG) to approximate pattern matching. We propose an on-line approach that translates the approximate pattern matching problem into a multiple pattern matching one (known as partitioning into exact search). The presented solution allows relatively fast searching for multiple patterns in a text with at most k differences (Levenshtein distance) or mismatches (Hamming distance). The paper compares the solution based on the MAG algorithm with [4]. Experiments on DNA, English, Proteins and XML texts with up to k errors show that the proposed algorithm achieves relatively good results in practical use.
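Partitioning into exact search, as mentioned in the abstract above, can be sketched as follows for the mismatch (Hamming) case. The pattern is cut into k + 1 pieces; any occurrence with at most k mismatches must contain at least one piece unchanged, so exact hits of the pieces give candidate positions that are then verified. This sketch assumes plain `str.find` for the exact search step in place of the MAG algorithm, and the function names are hypothetical.

```python
def hamming_at_most(a, b, k):
    """Return True if equal-length strings a and b differ in at most k positions."""
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > k:
                return False
    return True


def k_mismatch_search(text, pattern, k):
    """Start positions where pattern matches text with at most k mismatches."""
    m = len(pattern)
    if k < 0 or m == 0 or k + 1 > m:
        raise ValueError("need 0 <= k < len(pattern)")
    # Cut the pattern into k + 1 non-empty pieces of near-equal length.
    bounds = [i * m // (k + 1) for i in range(k + 2)]
    candidates = set()
    for i in range(k + 1):
        offset = bounds[i]
        piece = pattern[offset:bounds[i + 1]]
        pos = text.find(piece)  # stand-in for a multi-pattern matcher
        while pos != -1:
            start = pos - offset
            if 0 <= start <= len(text) - m:
                candidates.add(start)
            pos = text.find(piece, pos + 1)
    # Verify every candidate alignment against the full pattern.
    return sorted(s for s in candidates
                  if hamming_at_most(text[s:s + m], pattern, k))
```

Searching the k + 1 pieces one by one is the simplest choice; a multiple pattern matching algorithm would look for all pieces in a single pass over the text.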

Biblioteka Nauki - repozytorium artykułów

Crossref

Lublin University of Technology Journals

Circular pattern matching with k mismatches

Author: A Amir
C Barton
C Hazay
CS Iliopoulos
GM Landau
Karl Bringmann
Kimmo Fredriksson
KR Abrahamson
LAK Ayad
M Crochemore
M Ružić
MAR Azim
ML Fredman
P Bille
P Gawrychowski
R Grossi
Raphael Clifford
T Hirvola
T Kociumaka
V Palazón-González
WI Chang
Z Galil
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 10/07/2019
Field of study

The k-mismatch problem consists in computing the Hamming distance between a pattern P of length m and every length-m substring of a text T of length n, if this distance is no more than k. In many real-world applications, any cyclic shift of P is a relevant pattern, and thus one is interested in computing the minimal distance of every length-m substring of T and any cyclic shift of P. This is the circular pattern m
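The problem statement above can be made concrete with a brute-force reference sketch. It follows the definition only and is not the algorithm of the paper; the function names are hypothetical and the running time is far from the bounds such work aims for.

```python
def hamming(a, b):
    """Hamming distance between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def circular_k_mismatch(text, pattern, k):
    """Map each start position i to the minimal Hamming distance between
    text[i:i+m] and any cyclic shift of pattern, keeping only distances <= k."""
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    # All cyclic shifts of the pattern (a set drops repeated shifts).
    rotations = {pattern[i:] + pattern[:i] for i in range(m)}
    result = {}
    for i in range(len(text) - m + 1):
        window = text[i:i + m]
        best = min(hamming(window, r) for r in rotations)
        if best <= k:
            result[i] = best
    return result
```

A reference implementation of this kind is mainly useful for checking the output of a faster algorithm on small inputs.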

arXiv.org e-Print Archive

Crossref

VU Research Portal

CWI's Institutional Repository

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Circular pattern matching with k mismatches

Author: Charalampopoulos P. (Panagiotis)
Kociumaka T. (Tomasz)
Pissis S. (Solon)
Radoszewski J. (Jakub)
Rytter W. (Wojciech)
Straszyński J. (Juliusz)
Waleń T. (Tomasz)
Zuba W. (Wiktor)
Publication venue: 'Elsevier BV'
Publication date: 01/02/2021
Field of study

We consider the circular pattern matching with k mismatches (k-CPM) problem in which one is to compute the minimal Hamming distance of every length-m substring of T and any cyclic rotation of P, if this distance is no more than k. It is a variation of the well-studied k-mismatch problem. A multitude of papers has been devoted

VU Research Portal

CWI's Institutional Repository

Practical algorithms for biological sequence analysis: methods and applications

Author: Retha Ahmad
Publication venue
Publication date: 01/06/2019
Field of study

King's Research Portal

Average-optimal single and multiple approximate string matching

Author: Gonzalo Navarro
Kimmo Fredriksson
Publication venue
Publication date: 06/02/2008
Field of study

Abstract. We present a new algorithm for multiple approximate string matching. It is based on reading backwards enough ℓ-grams from text windows so as to prove that no occurrence can contain the part of the window read, and then shifting the window. Three variants of the algorithm are presented, which give different tradeoffs between how much they work in the window and how much they shift it. We show analytically that two of our algorithms are optimal on average. Compared to the first average-optimal multipattern approximate string matching algorithm [Fredriksson and Navarro, CPM 2003], the new algorithms are much faster and are optimal up to difference ratios of 1/2, contrary to the maximum of 1/3 that could be reached in previous work. This is also a contribution to the area of single-pattern approximate string matching, as the only average-optimal algorithm [Chang and Marr, CPM 1994] also reached a difference ratio of 1/3. We show experimentally that our algorithms are very competitive, displacing the long-standing best algorithm

CiteSeerX

Average-optimal single and multiple approximate string matching

Author: Baeza-Yates R.
Baeza-Yates R. A.
Chang W.
Fredriksson K.
Gonzalo Navarro
Grossi R.
Horspool R.
Hyyrö H.
Kimmo Fredriksson
Kumar S.
Lopresti D.
Muth R.
Navarro G.
Paul W.
Sellers P.
Sutinen E.
Ukkonen E.
Yao A. C.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date
Field of study

Crossref