Search CORE

8 research outputs found

On Maximal Unbordered Factors

Author: A Ehrenfeucht
D Moore
F Franěk
J-P Duval
L Ilie
P Gawrychowski
P Nielsen
R Assous
S Holub
T Kociumaka
Publication venue
Publication date: 28/04/2015
Field of study

Given a string $S$ of length $n$, its maximal unbordered factor is the longest factor which does not have a border. In this work we investigate the relationship between $n$ and the length of the maximal unbordered factor of $S$. We prove that for the alphabet of size $\sigma \ge 5$ the expected length of the maximal unbordered factor of a string of length $n$ is at least $0.99 n$ (for sufficiently large values of $n$). As an application of this result, we propose a new algorithm for computing the maximal unbordered factor of a string. Comment: Accepted to the 26th Annual Symposium on Combinatorial Pattern Matching (CPM 2015)
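To make the definition in this abstract concrete, here is a minimal brute-force sketch in Python. It is not the algorithm proposed in the paper; the function names `has_border` and `maximal_unbordered_factor` are hypothetical, and the sketch assumes that a border is a non-empty proper prefix of a string that is also its suffix (so every single-character string is unbordered).

```python
def has_border(w: str) -> bool:
    """Return True if some non-empty proper prefix of w is also a suffix of w."""
    return any(w[:k] == w[-k:] for k in range(1, len(w)))


def maximal_unbordered_factor(s: str) -> str:
    """Return the leftmost longest unbordered factor of s.

    Brute force for illustration only: tries every factor, longest first.
    """
    n = len(s)
    for length in range(n, 0, -1):
        for start in range(n - length + 1):
            factor = s[start:start + length]
            if not has_border(factor):
                return factor
    return ""  # only reached for the empty string


if __name__ == "__main__":
    for text in ["abab", "aaaa", "abc"]:
        print(text, "->", maximal_unbordered_factor(text))
```

The sketch checks all factors directly, so it is only practical for short strings; it is meant to illustrate the definition, not to be efficient.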

arXiv.org e-Print Archive

Crossref

HAL Descartes

Hal-Diderot

HAL-Ecole des Ponts ParisTech

Explore Bristol Research

HAL - UPEC / UPEM

A Frame Work for Parallel String Matching- A Computational Approach with Omega Model

Author: K Butchi Raju
Publication venue: Global Journals Inc. (US)
Publication date: 15/07/2013
Field of study

Nowadays, the parallel string matching problem attracts many researchers because of its importance in information retrieval systems. While it is very easily stated and many of the simple algorithms perform very well in practice, numerous works have been published on the subject and research is still very active. In this paper we propose an omega parallel computing model for parallel string matching. Experimental results show that, on a multi-processor system, the omega model implementation of the proposed parallel string matching algorithm can reduce string matching time by more than 40%.

Global Journal of Computer Science and Technology (GJCST)

MissMax: Alignment-free sequence comparison with mismatches through filtering and heuristics

Author: Cinzia Pizzi
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016
Field of study

BACKGROUND: Measuring sequence similarity is central for many problems in bioinformatics. In several contexts alignment-free techniques based on exact occurrences of substrings are faster, but also less accurate, than alignment-based approaches. Recently, several studies attempted to bridge the accuracy gap with the introduction of approximate matches in the definition of composition-based similarity measures. RESULTS: In this work we present MissMax, an exact algorithm for the computation of the longest common substring with mismatches between each suffix of a sequence x and a sequence y. This collection of statistics is useful for the computation of two similarity measures: the longest and the average common substring with k mismatches. As a further contribution we provide a “relaxed” version of MissMax that does not guarantee the exact solution, but is faster in practice and still very precise.
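The statistic described in this abstract can be illustrated with a naive Python sketch. This is not MissMax and makes no attempt at its filtering or heuristics; it assumes the statistic for suffix i of x is the length of the longest match, starting at position i of x and at any position of y, that tolerates at most k mismatches. The function name `lcs_k_mismatch_per_suffix` is hypothetical.

```python
from typing import List


def lcs_k_mismatch_per_suffix(x: str, y: str, k: int) -> List[int]:
    """For each start i in x, return the longest length L such that x[i:i+L]
    and some y[j:j+L] differ in at most k positions (brute force)."""
    result = []
    for i in range(len(x)):
        best = 0
        for j in range(len(y)):
            mismatches = 0
            length = 0
            # Extend the match until either string ends or mismatch k+1 occurs.
            while i + length < len(x) and j + length < len(y):
                if x[i + length] != y[j + length]:
                    mismatches += 1
                    if mismatches > k:
                        break
                length += 1
            best = max(best, length)
        result.append(best)
    return result


if __name__ == "__main__":
    stats = lcs_k_mismatch_per_suffix("abcd", "abxd", 1)
    # The two measures named in the abstract: longest and average.
    print(stats, max(stats), sum(stats) / len(stats))
```

The two similarity measures mentioned above then correspond to the maximum and the mean of the returned list.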

Crossref

Springer - Publisher Connector

PubMed Central

Archivio istituzionale della ricerca - Università di Padova

The Longest Common Extension Problem Revisited and Applications to Approximate String Searching ∗

Author: Gonzalo Navarro
Liviu Tinta
Lucian Ilie
Publication venue
Publication date: 01/01/2009
Field of study

The Longest Common Extension (LCE) problem considers a string s and computes, for each pair (i,j), the longest substring of s that starts at both i and j. It appears as a subproblem in many fundamental string problems and can be solved by linear-time preprocessing of the string that allows (worst-case) constant-time computation for each pair. The two known approaches use powerful algorithms: either constant-time computation of the Lowest Common Ancestor in trees or constant-time computation of Range Minimum Queries in arrays. We show here that, from a practical point of view, such complicated approaches are not needed. We give two very simple algorithms for this problem that require no preprocessing. The first is 5 times faster than the best previous algorithms on average, whereas the second is faster on virtually all inputs. As an application, we modify the Landau-Vishkin algorithm for approximate matching to use our simplest LCE algorithm. The obtained algorithm is 13 to 20 times faster than the original. We compare it with the more widely used Ukkonen’s cutoff algorithm and show that it behaves better for a significant range of error thresholds.
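As an illustration of an LCE query that needs no preprocessing, the following Python sketch answers a query by direct character comparison. Whether this matches either of the paper's two algorithms is an assumption; the function name `lce` and the 0-based indexing are choices made for this sketch.

```python
def lce(s: str, i: int, j: int) -> int:
    """Length of the longest substring of s that starts at both i and j
    (0-based), found by comparing characters one by one."""
    n = len(s)
    length = 0
    while i + length < n and j + length < n and s[i + length] == s[j + length]:
        length += 1
    return length


if __name__ == "__main__":
    text = "abcabcab"
    print(lce(text, 0, 3))
    print(lce(text, 0, 1))
```

Each query costs time proportional to the length of the answer, so the worst case is linear in the string length, in contrast to the constant-time queries of the preprocessing-based approaches the abstract mentions.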

CiteSeerX

Elsevier - Publisher Connector

The longest common extension problem revisited and applications to approximate string searching

Author: Bender
Berkman
de Bruijn
de C. Miranda
Fischer
Gonzalo Navarro
Gusfield
Harel
Ilie
Kasai
Kim
Ko
Kärkkäinen
Landau
Liviu Tinta
Lucian Ilie
Main
Manber
Manzini
Myers
Navarro
Nong
Schieber
Ukkonen
Publication venue: Elsevier BV
Publication date
Field of study

Crossref