7 research outputs found

Longest property-preserved common factor

Author: D Belazzougui
D Gusfield
H Bannai
J-P Duval
L Chi
M Dumitran
M Farach
M Federico
M Lothaire
P Peterlongo
S Inenaga
SR Chowdhury
SV Thankachan
SW Bae
T Kociumaka
T Starikovskaya
WI Chang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

In this paper we introduce a new family of string processing problems. We are given two or more strings and we are asked to compute a factor common to all strings that preserves a specific property and has maximal length. Here we consider two fundamental string properties: square-free factors and periodic factors under two different settings, one per property. In the first setting, we are given a string x and we are asked to construct a data structure over x answering the following type of on-line queries: given string y, find a longest square-free factor common to x and y. In the second setting, we are given k strings and an integer 1 < k’ ≤ k and we are asked to find a longest periodic factor common to at least k’ strings. We present linear-time solutions for both settings. We anticipate that our paradigm can be extended to other string properties

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Trieste

Crossref

INRIA a CCSD electronic archive server

Archivio della Ricerca - Università di Pisa

King's Research Portal

Weighted Shortest Common Supersequence problem revisited

Author: A Amir
C Barton
CC Aggarwal
D Lokshtanov
D Maier
E Horowitz
K Räihä
M Cygan
N Bansal
P Charalampopoulos
R Impagliazzo
T Jiang
T Kociumaka
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 25/09/2019

A weighted string, also known as a position weight matrix, is a sequence of probability distributions over some alphabet. We revisit the Weighted Shortest Common Supersequence (WSCS) problem, introduced by Amir et al. [SPIRE 2011], that is, the SCS problem on weighted strings. In the WSCS problem, we are given two weighted strings (Formula presented) and (Formula presented) and a threshold (Formula presented) on probability, and we are asked to compute the shortest (standard) string S such that both (Formula presented) and (Formula presented) match subsequences of S (not necessarily the same

arXiv.org e-Print Archive

Crossref

CWI's Institutional Repository

Longest Property-Preserved Common Factor

Author: Ayad Lorraine
Bernardini Giulia
Grossi Roberto
Iliopoulos Costas
Pisanti Nadia
Pissis Solon
Rosone Giovanna
Publication venue: HAL CCSD
Publication date: 01/01/2018

In this paper we introduce a new family of string processing problems. We are given two or more strings and we are asked to compute a factor common to all strings that preserves a specific property and has maximal length. Here we consider three fundamental string properties: square-free factors, periodic factors, and palindromic factors under three different settings, one per property. In the first setting, we are given a string x and we are asked to construct a data structure over x answering the following type of on-line queries: given string y, find a longest square-free factor common to x and y. In the second setting, we are given k strings and an integer 1 < k’ ≤ k and we are asked to find a longest periodic factor common to at least k’ strings. In the third setting, we are given two strings and we are asked to find a longest palindromic factor common to the two strings. We present linear-time solutions for all settings. We anticipate that our paradigm can be extended to other string properties or settings
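To make the problem statement concrete, here is a minimal brute-force sketch of the two-string case. It is not the linear-time method the abstract refers to; it enumerates all factors and is only meant to pin down the definitions. The function names (`factors`, `is_palindrome`, `is_square_free`, `longest_property_common_factor`) and the tie-breaking rule (lexicographically largest among the longest) are assumptions made for this illustration.

```python
def factors(s):
    """All non-empty factors (contiguous substrings) of s."""
    return {s[i:j] for i in range(len(s)) for j in range(i + 1, len(s) + 1)}


def is_palindrome(w):
    """True if w reads the same forwards and backwards."""
    return w == w[::-1]


def is_square_free(w):
    """True if w contains no factor of the form uu with u non-empty."""
    n = len(w)
    for i in range(n):
        for half in range(1, (n - i) // 2 + 1):
            if w[i:i + half] == w[i + half:i + 2 * half]:
                return False
    return True


def longest_property_common_factor(x, y, has_property):
    """Longest factor common to x and y that satisfies has_property.

    Brute force over all common factors; ties are broken by picking the
    lexicographically largest candidate (an arbitrary, hypothetical choice).
    Returns the empty string if no common factor has the property.
    """
    common = factors(x) & factors(y)
    candidates = [w for w in common if has_property(w)]
    return max(candidates, key=lambda w: (len(w), w), default="")


if __name__ == "__main__":
    x, y = "abacaba", "xcabacy"
    print(longest_property_common_factor(x, y, is_palindrome))
    print(longest_property_common_factor(x, y, is_square_free))
```

The property is passed in as a predicate, which mirrors the abstract's framing of a family of problems that differ only in the property being preserved.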

INRIA a CCSD electronic archive server

Indexing weighted sequences: Neat and efficient

Author: Barton C. (Carl)
Kociumaka T. (Tomasz)
Liu C. (Chang)
Pissis S. (Solon)
Radoszewski J. (Jakub)
Publication venue: 'Elsevier BV'
Publication date: 04/09/2019

In a weighted sequence, for every position of the sequence and every letter of the alphabet a probability of occurrence of this letter at this position is specified. Weighted sequences are commonly used to represent imprecise or uncertain data, for example in molecular biology, where they are known under the name of Position Weight Matrices. Given a probability threshold 1/z, we say that a string P of length m occurs in a weighted sequence X at position i if the product of probabilities of the letters of P at positions i, ..., i+m−1 in X is at least 1/z. In this article, we consider an indexing variant of the problem, in which we are to pre-process a weighted sequence to answer multiple pattern matching queries. We present an O(nz)-time construction of an O(nz)-sized index for a weighted sequence of length n that answers pattern matching queries in the optimal O(m+Occ) time, where Occ is the number of occurrences reported. The cornerstone of our data structure is a novel construction of a family of ⌊z⌋ strings that carries the information about all the strings that occur in the weighted sequence with a sufficient probability. We thus improve the most efficient previously known index by Amir et al. (Theor. Comput. Sci., 2008) with size and construction time O(nz^2 log z), preserving optimal query time. On the way we develop a new, more straightforward index for the so-called property matching problem. We provide an open-source implementation of our data structure and present experimental results using both synthetic and real data. Our construction allows us also to obtain a significant improvement over the complexities of the approximate variant of the weighted index presented by Biswas et al. at EDBT 2016 and an improvement of the space complexity of their general index. We also present applications of our index
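The occurrence definition in the abstract above can be illustrated with a naive scan. This is a sketch of the definition only, not the O(nz) index the abstract describes. It assumes a weighted sequence is represented as a list of dictionaries mapping letters to probabilities, and it reports 0-based positions. The function name `occurrences` and the sample data are hypothetical.

```python
from math import prod


def occurrences(X, P, z):
    """Positions (0-based) where pattern P occurs in weighted sequence X.

    X is a list of dicts mapping each letter to its probability at that
    position (letters that are absent have probability 0). P occurs at
    position i if the product of the probabilities of P[0], ..., P[m-1]
    at positions i, ..., i+m-1 is at least 1/z.
    Naive O(n*m) scan, for illustration only.
    """
    n, m = len(X), len(P)
    hits = []
    for i in range(n - m + 1):
        p = prod(X[i + j].get(P[j], 0.0) for j in range(m))
        if p * z >= 1:  # equivalent to p >= 1/z, avoids a division
            hits.append(i)
    return hits


if __name__ == "__main__":
    # Hypothetical weighted sequence of length 4 over the alphabet {a, b}.
    X = [
        {"a": 0.5, "b": 0.5},
        {"a": 1.0},
        {"a": 0.25, "b": 0.75},
        {"b": 1.0},
    ]
    print(occurrences(X, "ab", 2))
    print(occurrences(X, "ab", 4))
```

Raising z lowers the threshold 1/z, so the set of reported positions can only grow as z increases.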

CWI's Institutional Repository

Property Suffix Array with Applications in Indexing Weighted Sequences

Author: Charalampopoulos Panagiotis
Iliopoulos Costas S.
Liu Chang
Pissis Solon P.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/11/2020

King's Research Portal