549 research outputs found

    A 2-2/3 Approximation for the Shortest Superstring Problem

    Given a collection of strings S={s_1, ..., s_n} over an alphabet \Sigma, a superstring \alpha of S is a string containing each s_i as a substring; that is, for each i, 1<=i<=n, \alpha contains a block of |s_i| consecutive characters that match s_i exactly. The shortest superstring problem is the problem of finding a superstring \alpha of minimum length. The shortest superstring problem has applications in both data compression and computational biology. In data compression, the problem is a part of a general model of string compression proposed by Gallant, Maier and Storer (JCSS '80). Much of the recent interest in the problem is due to its application to DNA sequence assembly. The problem has been shown to be NP-hard; in fact, it was shown by Blum et al. (JACM '94) to be MAX SNP-hard. The first O(1)-approximation was also due to Blum et al., who gave an algorithm that always returns a superstring no more than 3 times the length of an optimal solution. Several researchers have published results that improve on the approximation ratio; of these, the best previous result is our algorithm ShortString, which achieves a 2 3/4-approximation (WADS '95). We present our new algorithm, G-ShortString, which achieves a ratio of 2 2/3. It generalizes the ShortString algorithm, but the analysis differs substantially from that of ShortString. Our previous work identified classes of strings that have a nested periodic structure, and which must be present in the worst case for our algorithms. We introduced machinery to describe these strings and proved strong structural properties about them. In this paper we extend this study to strings that exhibit a more relaxed form of the same structure, and we use this understanding to obtain our improved result.
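
    To make the superstring definition above concrete, here is a minimal Python sketch of the classic greedy merge heuristic: repeatedly merge the two strings with the largest suffix-prefix overlap. This is not the ShortString or G-ShortString algorithm from the abstract, and it carries no claim about their approximation ratios; the function names `overlap`, `is_superstring`, and `greedy_superstring` are hypothetical and chosen only for illustration.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def is_superstring(alpha: str, strings) -> bool:
    """True if every s_i occurs in alpha as a block of consecutive characters."""
    return all(s in alpha for s in strings)


def greedy_superstring(strings) -> str:
    """Greedy merge heuristic (illustrative only, not the paper's algorithm)."""
    unique = set(strings)
    # Strings contained in another input string never need separate treatment.
    items = sorted(s for s in unique
                   if not any(s != t and s in t for t in unique))
    while len(items) > 1:
        best_k, best_i, best_j = -1, 0, 1
        for i, a in enumerate(items):
            for j, b in enumerate(items):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        # Merge the best pair, writing the shared overlap only once.
        merged = items[best_i] + items[best_j][best_k:]
        items = [s for idx, s in enumerate(items)
                 if idx not in (best_i, best_j)] + [merged]
    return items[0] if items else ""
```

    For the input `["abc", "bcd", "cde"]` the sketch merges `abc` with `bcd` and then the result with `cde`, giving a superstring of length 5 instead of the 9 characters of plain concatenation.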

    Scheduling in a Ring with Unit Capacity Links

    We consider the problem of scheduling unit-sized jobs on a ring of processors with the objective of minimizing the completion time of the last job. Unlike much previous work, we place restrictions on the capacity of the network links connecting processors. We give a polynomial-time centralized algorithm that produces optimal-length schedules. We also give a simple distributed 2-approximation algorithm.

    A New Approach to the Minimum Cut Problem

    Approximating Disjoint-Path Problems Using Greedy Algorithms and Packing Integer Programs

    In the edge(vertex)-disjoint path problem we are given a graph $G$ and a set $\mathcal{T}$ of connection requests. Every connection request in $\mathcal{T}$ is a vertex pair $(s_i,t_i)$, $1 \leq i \leq K$. The objective is to connect a maximum number of the pairs via edge(vertex)-disjoint paths. The edge-disjoint path problem can be generalized to the multiple-source unsplittable flow problem where connection request $i$ has a demand $\rho_i$ and every edge $e$ a capacity $u_e$. All these problems are NP-hard and have a multitude of applications in areas such as routing, scheduling and bin packing. Given the hardness of the problem, we study polynomial-time approximation algorithms. In this context, a $\rho$-approximation algorithm is able to route at least a $1/\rho$ fraction of the connection requests. Although the edge- and vertex-disjoint path problems, and more recently the unsplittable flow generalization, have been extensively studied, they remain notoriously hard to approximate with a bounded performance guarantee. For example, even for the simple edge-disjoint path problem, no $o(\sqrt{|E|})$-approximation algorithm is known. Moreover, some of the best existing approximation ratios are obtained through sophisticated and non-standard randomized rounding schemes. In this paper we introduce techniques which yield algorithms for a wide range of disjoint-path and unsplittable flow problems. For the general unsplittable flow problem, even with weights on the commodities, our techniques lead to the first approximation algorithm and obtain an approximation ratio that matches, to within logarithmic factors, the $O(\sqrt{|E|})$ approximation ratio for the simple edge-disjoint path problem. In addition to this result and to improved bounds for several disjoint-path problems, our techniques simplify and unify the derivation of many existing approximation results. We use two basic techniques. First, we propose simple greedy algorithms for edge- and vertex-disjoint paths and second, we propose the use of a framework based on packing integer programs for more general problems such as unsplittable flow. A packing integer program is of the form maximize $c^{T}\cdot x$, subject to $Ax \leq b$, $A,b,c \geq 0$. As part of our tools we develop improved approximation algorithms for a class of packing integer programs, a result that we believe is of independent interest.
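
    The abstract does not spell out its greedy rule, so the following Python sketch shows one common greedy scheme for edge-disjoint paths as an assumption: among the unrouted requests, route the one whose current shortest path is shortest, then delete that path's edges. The graph is assumed to be undirected and the requests distinct; `bfs_path` and `greedy_edge_disjoint_paths` are hypothetical names, and no approximation guarantee is claimed for this sketch.

```python
from collections import deque


def bfs_path(adj, s, t):
    """Shortest s-t path (by edge count) in the current graph, or None."""
    parent = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in adj.get(u, ()):
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return None


def greedy_edge_disjoint_paths(edges, requests):
    """Route requests greedily, shortest available path first (a sketch)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    remaining = list(requests)
    routed = {}
    while remaining:
        candidates = [(r, bfs_path(adj, r[0], r[1])) for r in remaining]
        candidates = [(r, p) for r, p in candidates if p is not None]
        if not candidates:
            break  # no remaining request can be connected
        request, path = min(candidates, key=lambda c: len(c[1]))
        routed[request] = path
        remaining.remove(request)
        # Remove the used edges so later paths stay edge-disjoint.
        for u, v in zip(path, path[1:]):
            adj[u].discard(v)
            adj[v].discard(u)
    return routed
```

    On the path graph 1-2-3-4 with requests (1, 4) and (2, 3), the sketch routes (2, 3) first because its path is shorter, after which (1, 4) can no longer be connected. Preferring short paths is a design choice that limits how many edges each routed request takes away from the others.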

    Finding Real-Valued Single-Source Shortest Paths in o(n^3) Expected Time

    Given an $n$-vertex directed network $G$ with real costs on the edges and a designated source vertex $s$, we give a new algorithm to compute shortest paths from $s$. Our algorithm is a simple deterministic one with $O(n^2 \log n)$ expected running time over a large class of input distributions. The shortest path problem is an old and fundamental problem with a host of applications. Our algorithm is the first strongly-polynomial algorithm in over 35 years to improve upon some aspect of the running time of the celebrated Bellman-Ford algorithm for arbitrary networks, with any type of cost assignments.
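
    For reference, the Bellman-Ford baseline mentioned above can be sketched in a few lines of Python. This is the textbook algorithm, not the new algorithm from the abstract; it runs in O(nm) time, which is O(n^3) on dense networks. The function name `bellman_ford` and the `(u, v, cost)` edge-tuple format are assumptions made for this illustration.

```python
import math


def bellman_ford(n, edges, source):
    """Single-source shortest paths with real (possibly negative) edge costs.

    n      -- number of vertices, labelled 0 .. n-1
    edges  -- iterable of (u, v, cost) tuples for directed edges u -> v
    source -- index of the source vertex
    """
    edges = list(edges)
    dist = [math.inf] * n
    dist[source] = 0.0
    # At most n - 1 rounds of relaxing every edge.
    for _ in range(n - 1):
        changed = False
        for u, v, cost in edges:
            if dist[u] + cost < dist[v]:
                dist[v] = dist[u] + cost
                changed = True
        if not changed:
            break  # distances are final; stop early
    # One more pass: any further improvement means a negative-cost cycle.
    for u, v, cost in edges:
        if dist[u] + cost < dist[v]:
            raise ValueError("negative-cost cycle reachable from the source")
    return dist
```

    Vertices that cannot be reached from the source keep the distance `math.inf`.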

    Log Diameter Rounds Algorithms for 2-Vertex and 2-Edge Connectivity

    Many modern parallel systems, such as MapReduce, Hadoop and Spark, can be modeled well by the MPC model. The MPC model captures coarse-grained computation on large data well: data is distributed to processors, each of which has a sublinear (in the input data) amount of memory, and we alternate between rounds of computation and rounds of communication, where each machine can communicate an amount of data as large as the size of its memory. This model is stronger than the classical PRAM model, and it is an intriguing question to design algorithms whose running time is smaller than in the PRAM model. In this paper, we study two fundamental problems, 2-edge connectivity and 2-vertex connectivity (biconnectivity). PRAM algorithms which run in O(log n) time have been known for many years. We give algorithms using roughly log diameter rounds in the MPC model. Our main results are, for an n-vertex, m-edge graph of diameter D and bi-diameter D', 1) an O(log D log log_{m/n} n) parallel time 2-edge connectivity algorithm, 2) an O(log D log^2 log_{m/n} n + log D' log log_{m/n} n) parallel time biconnectivity algorithm, where the bi-diameter D' is the largest cycle length over all the vertex pairs in the same biconnected component. Our results are fully scalable, meaning that the memory per processor can be O(n^{delta}) for arbitrary constant delta>0, and the total memory used is linear in the problem size. Our 2-edge connectivity algorithm achieves the same parallel time as the connectivity algorithm of [Andoni et al., 2018]. We also show an Omega(log D') conditional lower bound for the biconnectivity problem.

    Task Scheduling in Networks

    Scheduling a set of tasks on a set of machines so as to yield an efficient schedule is a basic problem in computer science and operations research. Most of the research on this problem incorporates the potentially unrealistic assumption that communication between the different machines is instantaneous. In this paper we remove this assumption and study the problem of network scheduling, where each job originates at some node of a network, and in order to be processed at another node must take the time to travel through the network to that node. Our main contribution is to give approximation algorithms and hardness proofs for fully general forms of the fundamental problems in network scheduling. We consider two basic scheduling objectives: minimizing the makespan and minimizing the average completion time. For the makespan, we prove small constant-factor hardness-of-approximation and approximation results. For the average completion time, we give a log-squared approximation algorithm for the most general form of the problem. The techniques used in this approximation are fairly general and have several other applications. For example, we give the first nontrivial approximation algorithm to minimize the average weighted completion time of a set of jobs on related or unrelated machines, with or without a network.