5 research outputs found

    A note on the shortest common superstring of NGS reads

    Full text link
    The Shortest Superstring Problem (SSP) consists, for a set of strings S = {s_1,...,s_n}, to find a minimum length string that contains all s_i, 1 <= i <= k, as substrings. This problem is proved to be NP-Complete and APX-hard. Guaranteed approximation algorithms have been proposed, the current best ratio being 2+11/23, which has been achieved following a long and difficult quest. However, SSP is highly used in practice on next generation sequencing (NGS) data, which plays an increasingly important role in sequencing. In this note, we show that the SSP approximation ratio can be improved on NGS reads by assuming specific characteristics of NGS data that are experimentally verified on a very large sampling set

    Cautionary Tales of Inapproximability

    Get PDF
    Modeling biology as classical problems in computer science allows researchers to leverage the wealth of theoretical advancements in this field. Despite countless studies presenting heuristics that report improvement on specific benchmarking data, there has been comparatively little focus on exploring the theoretical bounds on the performance of practical (polynomial-time) algorithms. Conversely, theoretical studies tend to overstate the generalizability of their conclusions to physical biological processes. In this article we provide a fresh perspective on the concepts of NP-hardness and inapproximability in the computational biology domain, using popular sequence assembly and alignment (mapping) algorithms as illustrative examples. These algorithms exemplify how computer science theory can both (a) lead to substantial improvement in practical performance and (b) highlight areas ripe for future innovation. Importantly, we discuss caveats that seemingly allow the performance of heuristics to exceed their provable bounds

    On improving the approximation ratio of the r-shortest common superstring problem

    Full text link
    The Shortest Common Superstring problem (SCS) consists, for a set of strings S = {s_1,...,s_n}, in finding a minimum length string that contains all s_i, 1<= i <= n, as substrings. While a 2+11/30 approximation ratio algorithm has recently been published, the general objective is now to break the conceptual lower bound barrier of 2. This paper is a step ahead in this direction. Here we focus on a particular instance of the SCS problem, meaning the r-SCS problem, which requires all input strings to be of the same length, r. Golonev et al. proved an approximation ratio which is better than the general one for r<= 6. Here we extend their approach and improve their approximation ratio, which is now better than the general one for r<= 7, and less than or equal to 2 up to r = 6

    On Approximability of Bounded Degree Instances of Selected Optimization Problems

    Get PDF
    In order to cope with the approximation hardness of an underlying optimization problem, it is advantageous to consider specific families of instances with properties that can be exploited to obtain efficient approximation algorithms for the restricted version of the problem with improved performance guarantees. In this thesis, we investigate the approximation complexity of selected NP-hard optimization problems restricted to instances with bounded degree, occurrence or weight parameter. Specifically, we consider the family of dense instances, where typically the average degree is bounded from below by some function of the size of the instance. Complementarily, we examine the family of sparse instances, in which the average degree is bounded from above by some fixed constant. We focus on developing new methods for proving explicit approximation hardness results for general as well as for restricted instances. The fist part of the thesis contributes to the systematic investigation of the VERTEX COVER problem in k-hypergraphs and k-partite k-hypergraphs with density and regularity constraints. We design efficient approximation algorithms for the problems with improved performance guarantees as compared to the general case. On the other hand, we prove the optimality of our approximation upper bounds under the Unique Games Conjecture or a variant. In the second part of the thesis, we study mainly the approximation hardness of restricted instances of selected global optimization problems. We establish improved or in some cases the first inapproximability thresholds for the problems considered in this thesis such as the METRIC DIMENSION problem restricted to graphs with maximum degree 3 and the (1,2)-STEINER TREE problem. We introduce a new reductions method for proving explicit approximation lower bounds for problems that are related to the TRAVELING SALESPERSON (TSP) problem. In particular, we prove the best up to now inapproximability thresholds for the general METRIC TSP problem, the ASYMMETRIC TSP problem, the SHORTEST SUPERSTRING problem, the MAXIMUM TSP problem and TSP problems with bounded metrics
    corecore