A note on the shortest common superstring of NGS reads
The Shortest Superstring Problem (SSP) consists, for a set of strings S =
{s_1,...,s_n}, in finding a minimum-length string that contains all s_i, 1 <= i
<= n, as substrings. This problem has been proved to be NP-complete and
APX-hard. Guaranteed approximation algorithms have been proposed, the current
best ratio being 2+11/23, which was achieved after a long and difficult quest.
However, SSP is widely used in practice on next-generation sequencing (NGS)
data, which plays an increasingly important role in sequencing. In this note,
we show that the SSP approximation ratio can be improved on NGS reads by
assuming specific characteristics of NGS data that are experimentally verified
on a very large sampling set.
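To make the problem statement concrete, here is a minimal sketch of the classic greedy merge heuristic for superstrings: repeatedly merge the two strings with the largest suffix/prefix overlap. This is not the method of the note and does not reach the approximation ratios quoted above; the function names `overlap` and `greedy_superstring` and the sample reads are assumptions made for illustration only.

```python
from itertools import permutations


def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_superstring(strings):
    """Greedy heuristic (illustrative only): merge the best-overlapping pair
    until one string remains. The result contains every input string as a
    substring, but it is not guaranteed to be the shortest such string."""
    # Drop exact duplicates, then strings already contained in another one.
    uniq = list(dict.fromkeys(strings))
    pool = [s for s in uniq if not any(s != t and s in t for t in uniq)]
    while len(pool) > 1:
        best = (-1, 0, 1)  # (overlap length, left index, right index)
        for i, j in permutations(range(len(pool)), 2):
            k = overlap(pool[i], pool[j])
            if k > best[0]:
                best = (k, i, j)
        k, i, j = best
        merged = pool[i] + pool[j][k:]
        pool = [s for idx, s in enumerate(pool) if idx not in (i, j)] + [merged]
    return pool[0] if pool else ""


# Hypothetical short "reads", chosen only to exercise the function.
reads = ["ACGT", "CGTA", "GTAC"]
result = greedy_superstring(reads)
print(result, len(result))
```

On these three reads the heuristic returns a superstring of length 6, against a total input length of 12.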
On improving the approximation ratio of the r-shortest common superstring problem
The Shortest Common Superstring problem (SCS) consists, for a set of strings
S = {s_1,...,s_n}, in finding a minimum-length string that contains all s_i,
1 <= i <= n, as substrings. While a 2+11/30 approximation ratio algorithm has
recently been published, the general objective is now to break the conceptual
lower bound barrier of 2. This paper is a step forward in this direction. Here
we focus on a particular case of the SCS problem, the r-SCS problem, which
requires all input strings to be of the same length, r. Golovnev et al.
proved an approximation ratio which is better than the general one for r <= 6.
Here we extend their approach and improve their approximation ratio, which is
now better than the general one for r <= 7, and less than or equal to 2 up to
r = 6.
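As an illustration of the r-SCS setting, the sketch below computes an exact shortest common superstring for a tiny set of equal-length strings by trying every ordering and merging neighbours with maximal overlap. It relies on the fact that distinct strings of the same length r cannot be substrings of one another, so an optimal superstring corresponds to some ordering. This is a brute-force baseline with factorial running time, not the approximation algorithm of the paper; the name `exact_r_scs` and the inputs are assumptions for illustration.

```python
from itertools import permutations


def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def exact_r_scs(strings):
    """Exact r-SCS by exhaustive search (illustrative; only for tiny inputs).

    Assumes all inputs share one length r; raises ValueError otherwise."""
    distinct = list(dict.fromkeys(strings))
    if not distinct:
        return ""
    r = len(distinct[0])
    if any(len(s) != r for s in distinct):
        raise ValueError("r-SCS requires all strings to have the same length")
    best = None
    for order in permutations(distinct):
        candidate = order[0]
        for nxt in order[1:]:
            # Append only the part of `nxt` not covered by the overlap.
            candidate += nxt[overlap(candidate, nxt):]
        if best is None or len(candidate) < len(best):
            best = candidate
    return best


# Hypothetical 3-SCS instance (r = 3).
optimum = exact_r_scs(["ACG", "CGT", "GTA"])
print(optimum, len(optimum))
```

The approximation ratio of any heuristic on an instance is the length of its output divided by the length of such an optimum, which is why an exact baseline is useful on small test cases.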