
5 research outputs found

Covering Problems for Partial Words and for Indeterminate Strings

Author: A Apostolico
A Kalai
CS Iliopoulos
D Breslauer
D Lokshtanov
D Moore
J Holub
KR Abrahamson
MF Bari
MJ Fischer
P Antoniou
R Impagliazzo
T Kociumaka
WF Smyth
Y Li
Publication date: 01/01/2014

We consider the problem of computing a shortest solid cover of an indeterminate string. An indeterminate string may contain non-solid symbols, each of which specifies a subset of the alphabet that could be present at the corresponding position. We also consider covering partial words, which are a special case of indeterminate strings where each non-solid symbol is a don't care symbol. We prove that the indeterminate string covering problem and the partial word covering problem are NP-complete for a binary alphabet and show that both problems are fixed-parameter tractable with respect to $k$, the number of non-solid symbols. For the indeterminate string covering problem we obtain a $2^{O(k \log k)} + n k^{O(1)}$-time algorithm. For the partial word covering problem we obtain a $2^{O(\sqrt{k}\log k)} + n k^{O(1)}$-time algorithm. We prove that, unless the Exponential Time Hypothesis is false, no $2^{o(\sqrt{k})} n^{O(1)}$-time solution exists for either problem, which shows that our algorithm for this case is close to optimal. We also present an algorithm for both problems which is feasible in practice.

Comment: full version (simplified and corrected); preliminary version appeared at ISAAC 2014; 14 pages, 4 figures
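
To make the covering problem concrete, here is a minimal brute-force sketch in Python. It is not the algorithm from the paper (it is exponential in the worst case) and rests on two assumptions: an indeterminate string is modelled as a list of sets of characters, and a solid string is taken to be a cover when every position lies inside some occurrence, with each occurrence matched independently. The function names are hypothetical.

```python
from itertools import product

def occurs_at(cover, text, i):
    """True if the solid string `cover` matches `text` starting at index i."""
    return all(ch in text[i + j] for j, ch in enumerate(cover))

def is_solid_cover(cover, text):
    """True if every position of `text` lies inside some occurrence of `cover`."""
    n, m = len(text), len(cover)
    if m == 0 or m > n:
        return False
    covered_up_to = 0  # first position not yet known to be covered
    for i in range(n - m + 1):
        if i > covered_up_to:
            return False  # position `covered_up_to` can no longer be covered
        if occurs_at(cover, text, i):
            covered_up_to = i + m
    return covered_up_to == n

def shortest_solid_cover(text):
    """Brute force: a cover must match a prefix, so try all solid prefixes by length."""
    for m in range(1, len(text) + 1):
        for cand in product(*(sorted(s) for s in text[:m])):
            if is_solid_cover(cand, text):
                return "".join(cand)
    return None  # only reached for an empty text

# "ab?bab", where the non-solid symbol ? stands for {a, b}
example = [{"a"}, {"b"}, {"a", "b"}, {"b"}, {"a"}, {"b"}]
print(shortest_solid_cover(example))
```

In this sketch a partial word is simply the special case where every non-solid set equals the whole alphabet.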

arXiv.org e-Print Archive

Crossref

King's Research Portal

Covering problems for partial words and for indeterminate strings

Author: Crochemore Maxime
Iliopoulos Costas S.
Kociumaka Tomasz
Radoszewski Jakub
Rytter Wojciech
Waleń Tomasz
Publication venue: Elsevier BV
Publication date: 25/10/2017

King's Research Portal

Linear Algorithm for Conservative Degenerate Pattern Matching

Author: Crochemore Maxime
Iliopoulos Costas S.
Kundu Ritu
Mohamed Manal
Vayani Fatima
Publication date: 15/06/2015

A degenerate symbol x* over an alphabet A is a non-empty subset of A, and a sequence of such symbols is a degenerate string. A degenerate string is said to be conservative if its number of non-solid symbols is upper-bounded by a fixed positive constant k. We consider here the matching problem of conservative degenerate strings and present the first linear-time algorithm that can find, for given degenerate strings P* and T* of total length n containing k non-solid symbols in total, the occurrences of P* in T* in O(nk) time.
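
As a point of reference for the matching problem, the following Python sketch is a naive quadratic baseline, not the O(nk) algorithm of the paper. It assumes degenerate strings are given as lists of non-empty sets and that two degenerate symbols match when their sets intersect. The function name is hypothetical.

```python
def degenerate_occurrences(pattern, text):
    """Return all start indices where degenerate `pattern` matches degenerate `text`.

    Two symbols are assumed to match when their sets share at least one
    letter. Runs in O(n * m) set intersections (naive baseline).
    """
    m, n = len(pattern), len(text)
    return [
        i
        for i in range(n - m + 1)
        if all(pattern[j] & text[i + j] for j in range(m))
    ]

# P* = a [cg], T* = a g [at] c t
pattern = [{"a"}, {"c", "g"}]
text = [{"a"}, {"g"}, {"a", "t"}, {"c"}, {"t"}]
print(degenerate_occurrences(pattern, text))
```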

arXiv.org e-Print Archive

Crossref

King's Research Portal

Rank and Select on Degenerate Strings

Author: Bille Philip
Gørtz Inge Li
Stordalen Tord
Publication date: 04/12/2023

A 'degenerate string' is a sequence of subsets of some alphabet; it represents any string obtainable by selecting one character from each set from left to right. Recently, Alanko et al. generalized the rank-select problem to degenerate strings, where given a character $c$ and position $i$ the goal is to find either the $i$th set containing $c$ or the number of occurrences of $c$ in the first $i$ sets [SEA 2023]. The problem has applications to pangenomics; in another work by Alanko et al. they use it as the basis for a compact representation of 'de Bruijn Graphs' that supports fast membership queries. In this paper we revisit the rank-select problem on degenerate strings, introducing a new, natural parameter and reanalyzing existing reductions to rank-select on regular strings. Plugging in standard data structures, the time bounds for queries are improved exponentially while essentially matching, or improving, the space bounds. Furthermore, we provide a lower bound on space that shows that the reductions lead to succinct data structures in a wide range of cases. Finally, we provide implementations; our most compact structure matches the space of the most compact structure of Alanko et al. while answering queries twice as fast. We also provide an implementation using modern vector processing features; it uses less than one percent more space than the most compact structure of Alanko et al. while supporting queries four to seven times faster, and has competitive query time with all the remaining structures.
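
The two queries described above can be illustrated with a small, non-succinct Python sketch. It is not one of the data structures from the paper or from Alanko et al.; it simply stores, for each character, the sorted list of set indices containing it. The class and method names are hypothetical, and the indexing conventions (rank counts over the first `i` sets, select is 1-based and returns a 0-based index) are assumptions.

```python
from bisect import bisect_left
from collections import defaultdict

class NaiveSubsetRankSelect:
    """Plain-list illustration of rank/select on a degenerate string."""

    def __init__(self, sets):
        # positions[c] = increasing list of indices of the sets that contain c
        self.positions = defaultdict(list)
        for idx, s in enumerate(sets):
            for c in s:
                self.positions[c].append(idx)

    def rank(self, i, c):
        """Number of sets among the first i (indices 0..i-1) that contain c."""
        return bisect_left(self.positions.get(c, []), i)

    def select(self, i, c):
        """0-based index of the i-th (1-based) set containing c, or None."""
        occ = self.positions.get(c, [])
        return occ[i - 1] if 1 <= i <= len(occ) else None

# Empty sets are allowed in this sketch.
ds = NaiveSubsetRankSelect([{"a", "c"}, set(), {"c"}, {"a", "g"}])
print(ds.rank(3, "a"), ds.select(2, "c"))
```

This layout uses space proportional to the total size of all sets and answers rank in logarithmic time; the succinct structures discussed in the abstract aim for far less space.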

arXiv.org e-Print Archive