234 research outputs found

    Collapsing Superstring Conjecture

    Get PDF
    In the Shortest Common Superstring (SCS) problem, one is given a collection of strings, and needs to find a shortest string containing each of them as a substring. SCS admits 2 11/23-approximation in polynomial time (Mucha, SODA\u2713). While this algorithm and its analysis are technically involved, the 30 years old Greedy Conjecture claims that the trivial and efficient Greedy Algorithm gives a 2-approximation for SCS. We develop a graph-theoretic framework for studying approximation algorithms for SCS. The framework is reminiscent of the classical 2-approximation for Traveling Salesman: take two copies of an optimal solution, apply a trivial edge-collapsing procedure, and get an approximate solution. In this framework, we observe two surprising properties of SCS solutions, and we conjecture that they hold for all input instances. The first conjecture, that we call Collapsing Superstring conjecture, claims that there is an elementary way to transform any solution repeated twice into the same graph G. This conjecture would give an elementary 2-approximate algorithm for SCS. The second conjecture claims that not only the resulting graph G is the same for all solutions, but that G can be computed by an elementary greedy procedure called Greedy Hierarchical Algorithm. While the second conjecture clearly implies the first one, perhaps surprisingly we prove their equivalence. We support these equivalent conjectures by giving a proof for the special case where all input strings have length at most 3 (which until recently had been the only case where the Greedy Conjecture was proven). We also tested our conjectures on millions of instances of SCS. We prove that the standard Greedy Conjecture implies Greedy Hierarchical Conjecture, while the latter is sufficient for an efficient greedy 2-approximate approximation of SCS. Except for its (conjectured) good approximation ratio, the Greedy Hierarchical Algorithm provably finds a 3.5-approximation, and finds exact solutions for the special cases where we know polynomial time (not greedy) exact algorithms: (1) when the input strings form a spectrum of a string (2) when all input strings have length at most 2

    Computational Molecular Biology

    No full text
    Computational Biology is a fairly new subject that arose in response to the computational problems posed by the analysis and the processing of biomolecular sequence and structure data. The field was initiated in the late 60's and early 70's largely by pioneers working in the life sciences. Physicists and mathematicians entered the field in the 70's and 80's, while Computer Science became involved with the new biological problems in the late 1980's. Computational problems have gained further importance in molecular biology through the various genome projects which produce enormous amounts of data. For this bibliography we focus on those areas of computational molecular biology that involve discrete algorithms or discrete optimization. We thus neglect several other areas of computational molecular biology, like most of the literature on the protein folding problem, as well as databases for molecular and genetic data, and genetic mapping algorithms. Due to the availability of review papers and a bibliography this bibliography

    Superstrings with multiplicities

    Get PDF
    A superstring of a set of words P = s1, · · · , sp is a string that contains each word of P as substring. Given P, the well known Shortest Linear Superstring problem (SLS), asks for a shortest superstring of P. In a variant of SLS, called Multi-SLS, each word si comes with an integer m(i), its multiplicity, that sets a constraint on its number of occurrences, and the goal is to find a shortest superstring that contains at least m(i) occurrences of si. Multi-SLS generalizes SLS and is obviously as hard to solve, but it has been studied only in special cases (with words of length 2 or with a fixed number of words). The approximability of Multi-SLS in the general case remains open. Here, we study the approximability of Multi-SLS and that of the companion problem Multi-SCCS, which asks for a shortest cyclic cover instead of shortest superstring. First, we investigate the approximation of a greedy algorithm for maximizing the compression offered by a superstring or by a cyclic cover: the approximation ratio is 1/2 for Multi-SLS and 1 for Multi-SCCS. Then, we exhibit a linear time approximation algorithm, Concat-Greedy, and show it achieves a ratio of 4 regarding the superstring length. This demonstrates that for both measures Multi-SLS belongs to the class of APX problems. © 2018 Yoshifumi Sakai; licensed under Creative Commons License CC-BY.Peer reviewe

    Cosmic String Loop Microlensing

    Get PDF
    Cosmic superstring loops within the galaxy microlens background point sources lying close to the observer-string line of sight. For suitable alignments, multiple paths coexist and the (achromatic) flux enhancement is a factor of two. We explore this unique type of lensing by numerically solving for geodesics that extend from source to observer as they pass near an oscillating string. We characterize the duration of the flux doubling and the scale of the image splitting. We probe and confirm the existence of a variety of fundamental effects predicted from previous analyses of the static infinite straight string: the deficit angle, the Kaiser-Stebbins effect, and the scale of the impact parameter required to produce microlensing. Our quantitative results for dynamical loops vary by O(1) factors with respect to estimates based on infinite straight strings for a given impact parameter. A number of new features are identified in the computed microlensing solutions. Our results suggest that optical microlensing can offer a new and potentially powerful methodology for searches for superstring loop relics of the inflationary era.Comment: 20 pages, 19 figure

    On Approximability of Bounded Degree Instances of Selected Optimization Problems

    Get PDF
    In order to cope with the approximation hardness of an underlying optimization problem, it is advantageous to consider specific families of instances with properties that can be exploited to obtain efficient approximation algorithms for the restricted version of the problem with improved performance guarantees. In this thesis, we investigate the approximation complexity of selected NP-hard optimization problems restricted to instances with bounded degree, occurrence or weight parameter. Specifically, we consider the family of dense instances, where typically the average degree is bounded from below by some function of the size of the instance. Complementarily, we examine the family of sparse instances, in which the average degree is bounded from above by some fixed constant. We focus on developing new methods for proving explicit approximation hardness results for general as well as for restricted instances. The fist part of the thesis contributes to the systematic investigation of the VERTEX COVER problem in k-hypergraphs and k-partite k-hypergraphs with density and regularity constraints. We design efficient approximation algorithms for the problems with improved performance guarantees as compared to the general case. On the other hand, we prove the optimality of our approximation upper bounds under the Unique Games Conjecture or a variant. In the second part of the thesis, we study mainly the approximation hardness of restricted instances of selected global optimization problems. We establish improved or in some cases the first inapproximability thresholds for the problems considered in this thesis such as the METRIC DIMENSION problem restricted to graphs with maximum degree 3 and the (1,2)-STEINER TREE problem. We introduce a new reductions method for proving explicit approximation lower bounds for problems that are related to the TRAVELING SALESPERSON (TSP) problem. In particular, we prove the best up to now inapproximability thresholds for the general METRIC TSP problem, the ASYMMETRIC TSP problem, the SHORTEST SUPERSTRING problem, the MAXIMUM TSP problem and TSP problems with bounded metrics

    Shortest common superstring approximaation nopea toteutus sekä soveltaminen relative lempel-ziv pakkaukseen

    Get PDF
    The objective of the shortest common superstring problem is to find a string of minimum length that contains all keywords in the given input as substrings. Shortest common superstrings have many applications in the fields of data compression and bioinformatics. For example, a common superstring can be seen as a compressed form of the keywords it is generated from. Since the shortest common superstring problem is NP-hard, we focus on the approximation algorithms that implement a so-called greed heuristic. It turns out that the actual shortest common superstring is not always needed. Instead, it is often enough to find an approximate solution of sufficient quality. We provide an implementation of the Ukkonen's linear time algorithm for the greedy heuristic. The practical performance of this implementation is measured by comparing it to another implementation of the same heuristic. We also hypothesize that shortest common superstrings can be potentially used to improve the compression ratio of the Relative Lempel-Ziv data compression algorithm. This hypothesis is examined and shown to be valid

    Approximating (k,â„“)(k,\ell)-center clustering for curves

    Get PDF
    The Euclidean kk-center problem is a classical problem that has been extensively studied in computer science. Given a set G\mathcal{G} of nn points in Euclidean space, the problem is to determine a set C\mathcal{C} of kk centers (not necessarily part of G\mathcal{G}) such that the maximum distance between a point in G\mathcal{G} and its nearest neighbor in C\mathcal{C} is minimized. In this paper we study the corresponding (k,ℓ)(k,\ell)-center problem for polygonal curves under the Fr\'echet distance, that is, given a set G\mathcal{G} of nn polygonal curves in Rd\mathbb{R}^d, each of complexity mm, determine a set C\mathcal{C} of kk polygonal curves in Rd\mathbb{R}^d, each of complexity ℓ\ell, such that the maximum Fr\'echet distance of a curve in G\mathcal{G} to its closest curve in C\mathcal{C} is minimized. In this paper, we substantially extend and improve the known approximation bounds for curves in dimension 22 and higher. We show that, if ℓ\ell is part of the input, then there is no polynomial-time approximation scheme unless P=NP\mathsf{P}=\mathsf{NP}. Our constructions yield different bounds for one and two-dimensional curves and the discrete and continuous Fr\'echet distance. In the case of the discrete Fr\'echet distance on two-dimensional curves, we show hardness of approximation within a factor close to 2.5982.598. This result also holds when k=1k=1, and the NP\mathsf{NP}-hardness extends to the case that ℓ=∞\ell=\infty, i.e., for the problem of computing the minimum-enclosing ball under the Fr\'echet distance. Finally, we observe that a careful adaptation of Gonzalez' algorithm in combination with a curve simplification yields a 33-approximation in any dimension, provided that an optimal simplification can be computed exactly. We conclude that our approximation bounds are close to being tight.Comment: 24 pages; results on minimum-enclosing ball added, additional author added, general revisio

    Minimum-weight Cycle Covers and Their Approximability

    Get PDF
    A cycle cover of a graph is a set of cycles such that every vertex is part of exactly one cycle. An L-cycle cover is a cycle cover in which the length of every cycle is in the set L. We investigate how well L-cycle covers of minimum weight can be approximated. For undirected graphs, we devise a polynomial-time approximation algorithm that achieves a constant approximation ratio for all sets L. On the other hand, we prove that the problem cannot be approximated within a factor of 2-eps for certain sets L. For directed graphs, we present a polynomial-time approximation algorithm that achieves an approximation ratio of O(n), where nn is the number of vertices. This is asymptotically optimal: We show that the problem cannot be approximated within a factor of o(n). To contrast the results for cycle covers of minimum weight, we show that the problem of computing L-cycle covers of maximum weight can, at least in principle, be approximated arbitrarily well.Comment: To appear in the Proceedings of the 33rd Workshop on Graph-Theoretic Concepts in Computer Science (WG 2007). Minor change
    • …
    corecore