1,121 research outputs found

    Approximating the double-cut-and-join distance between unsigned genomes

    In this paper we study the problem of sorting unsigned genomes by double-cut-and-join operations, where genomes allow a mix of linear and circular chromosomes to be present. First, we formulate an equivalent optimization problem, called maximum cycle/path decomposition, which is aimed at finding a largest collection of edge-disjoint cycles/AA-paths/AB-paths in a breakpoint graph. Then, we show that the problem of finding a largest collection of edge-disjoint cycles/AA-paths/AB-paths of length no more than l can be reduced to the well-known degree-bounded k-set packing problem with k = 2l. Finally, a polynomial-time approximation algorithm for the problem of sorting unsigned genomes by double-cut-and-join operations is devised, which achieves the approximation ratio for any positive ε. For the restricted variation where each genome contains only one linear chromosome, the approximation ratio can be further improved t

    Exact and Approximation Algorithms for Computing Reversal Distances in Genome Rearrangement

    Genome rearrangement is a research area capturing wide attention in molecular biology. The reversal distance problem is one of the most widely studied models of genome rearrangements in inferring the evolutionary relationship between two genomes at chromosome level. The problem of estimating reversal distance between two genomes is modeled as sorting by reversals. While the problem of sorting signed permutations can have polynomial time solutions, the problem of sorting unsigned permutations has been proven to be NP-hard [4]. This work introduces an exact greedy algorithm for sorting by reversals focusing on unsigned permutations. An improved method of producing cycle decompositions for a 3/2-approximation algorithm and the consideration of 3-cycles for reversal sequences are also presented in this paper
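As a rough illustration of the sorting-by-reversals model, the sketch below counts breakpoints of an unsigned permutation and computes the exact reversal distance by brute-force breadth-first search. This is not the greedy or 3/2-approximation method of the paper; the function names `breakpoints` and `reversal_distance` are hypothetical, and the search is exponential, so it is only usable on very small inputs.

```python
from collections import deque


def breakpoints(perm):
    """Count breakpoints of an unsigned permutation of 1..n, framed by 0 and n+1."""
    ext = [0] + list(perm) + [len(perm) + 1]
    # A breakpoint is a pair of neighbours that are not consecutive integers.
    return sum(1 for a, b in zip(ext, ext[1:]) if abs(a - b) != 1)


def reversal_distance(perm):
    """Exact unsigned reversal distance via BFS over permutations (tiny inputs only)."""
    start = tuple(perm)
    target = tuple(sorted(perm))
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    n = len(start)
    while queue:
        cur, dist = queue.popleft()
        # Try every reversal of a segment cur[i:j] with at least two elements.
        for i in range(n):
            for j in range(i + 2, n + 1):
                nxt = cur[:i] + cur[i:j][::-1] + cur[j:]
                if nxt == target:
                    return dist + 1
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
    raise ValueError("target permutation is unreachable")
```

Because a single reversal removes at most two breakpoints, half the breakpoint count is a lower bound on the distance. For `[3, 1, 2]`, which has three breakpoints, the search confirms that two reversals are needed.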

    Faster Shortest Paths in Dense Distance Graphs, with Applications

    We show how to combine two techniques for efficiently computing shortest paths in directed planar graphs. The first is the linear-time shortest-path algorithm of Henzinger, Klein, Subramanian, and Rao [STOC'94]. The second is Fakcharoenphol and Rao's algorithm [FOCS'01] for emulating Dijkstra's algorithm on the dense distance graph (DDG). A DDG is defined for a decomposition of a planar graph $G$ into regions of at most $r$ vertices each, for some parameter $r < n$. The vertex set of the DDG is the set of $\Theta(n/\sqrt r)$ vertices of $G$ that belong to more than one region (boundary vertices). The DDG has $\Theta(n)$ arcs, such that distances in the DDG are equal to the distances in $G$. Fakcharoenphol and Rao's implementation of Dijkstra's algorithm on the DDG (nicknamed FR-Dijkstra) runs in $O(n\log(n) r^{-1/2} \log r)$ time, and is a key component in many state-of-the-art planar graph algorithms for shortest paths, minimum cuts, and maximum flows. By combining these two techniques we remove the $\log n$ dependency in the running time of the shortest-path algorithm, making it $O(n r^{-1/2} \log^2 r)$. This work is part of a research agenda that aims to develop new techniques that would lead to faster, possibly linear-time, algorithms for problems such as minimum-cut, maximum-flow, and shortest paths with negative arc lengths. As immediate applications, we show how to compute maximum flow in directed weighted planar graphs in $O(n \log p)$ time, where $p$ is the minimum number of edges on any path from the source to the sink. We also show how to compute any part of the DDG that corresponds to a region with $r$ vertices and $k$ boundary vertices in $O(r \log k)$ time, which is faster than has been previously known for small values of $k$.

    Total variation on a tree

    We consider the problem of minimizing the continuous-valued total variation subject to different unary terms on trees, and propose fast direct algorithms based on dynamic programming to solve these problems. We treat both the convex and the non-convex case and derive worst-case complexities that are equal to or better than those of existing methods. We show applications to total-variation-based 2D image processing and computer vision problems based on a Lagrangian decomposition approach. The resulting algorithms are very efficient, offer a high degree of parallelism, and have memory requirements that are only on the order of the number of image pixels. Comment: accepted to SIAM Journal on Imaging Sciences (SIIMS)

    Hitting and Harvesting Pumpkins

    The "c-pumpkin" is the graph with two vertices linked by c>0 parallel edges. A c-pumpkin-model in a graph G is a pair A,B of disjoint subsets of vertices of G, each inducing a connected subgraph of G, such that there are at least c edges in G between A and B. We focus on covering and packing c-pumpkin-models in a given graph. On the one hand, we provide an FPT algorithm running in time 2^O(k) n^O(1) deciding, for any fixed c>0, whether all c-pumpkin-models can be covered by at most k vertices. This generalizes known single-exponential FPT algorithms for Vertex Cover and Feedback Vertex Set, which correspond to the cases c=1,2 respectively. On the other hand, we present an O(log n)-approximation algorithm for both the problems of covering all c-pumpkin-models with a smallest number of vertices, and packing a maximum number of vertex-disjoint c-pumpkin-models. Comment: v2: several minor changes
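For the special case c=1 mentioned above (Vertex Cover), a single-exponential FPT algorithm can be sketched with the textbook bounded search tree: pick any uncovered edge and branch on which endpoint joins the cover. This is only the classical c=1 technique, not the algorithm of the paper for general c; the function name `vertex_cover` and the edge-list input format are assumptions made for the illustration.

```python
def vertex_cover(edges, k):
    """Return a vertex cover of size at most k as a set, or None if none exists.

    edges is a list of (u, v) pairs. The recursion depth is at most k and each
    call branches twice, so the search tree has at most 2**k leaves.
    """
    if not edges:
        return set()          # nothing left to cover
    if k == 0:
        return None           # edges remain but the budget is exhausted
    u, v = edges[0]
    # Any cover must contain u or v, so try both choices.
    for pick in (u, v):
        remaining = [e for e in edges if pick not in e]
        sub = vertex_cover(remaining, k - 1)
        if sub is not None:
            return sub | {pick}
    return None
```

On a triangle the call with `k=1` returns `None`, while `k=2` returns a cover with two vertices.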

    LP Relaxation and Tree Packing for Minimum k-cuts

    Karger used spanning tree packings [Karger, 2000] to derive a near linear-time randomized algorithm for the global minimum cut problem as well as a bound on the number of approximate minimum cuts. This is a different approach from his well-known random contraction algorithm [Karger, 1995; Karger and Stein, 1996]. Thorup developed a fast deterministic algorithm for the minimum k-cut problem via greedy recursive tree packings [Thorup, 2008]. In this paper we revisit properties of an LP relaxation for k-cut proposed by Naor and Rabani [Naor and Rabani, 2001], and analyzed in [Chekuri et al., 2006]. We show that the dual of the LP yields a tree packing, that when combined with an upper bound on the integrality gap for the LP, easily and transparently extends Karger's analysis for mincut to the k-cut problem. In addition to the simplicity of the algorithm and its analysis, this allows us to improve the running time of Thorup's algorithm by a factor of n. We also improve the bound on the number of alpha-approximate k-cuts. Second, we give a simple proof that the integrality gap of the LP is 2(1-1/n). Third, we show that an optimum solution to the LP relaxation, for all values of k, is fully determined by the principal sequence of partitions of the input graph. This allows us to relate the LP relaxation to the Lagrangean relaxation approach of Barahona [Barahona, 2000] and Ravi and Sinha [Ravi and Sinha, 2008]; it also shows that the idealized recursive tree packing considered by Thorup gives an optimum dual solution to the LP. This work arose from an effort to understand and simplify the results of Thorup [Thorup, 2008].

    On genome rearrangement models (Sobre modelos de rearranjo de genomas)

    Advisor: João Meidanis. Thesis (doctorate) - Universidade Estadual de Campinas, Instituto de Computação. Abstract: Genome rearrangements are events where large blocks of DNA exchange places during evolution. With the growing availability of whole genome data, the analysis of these events can be a very important and promising tool for understanding evolutionary genomics. Several mathematical models of genome rearrangement have been proposed in the last 20 years. In this thesis, we propose two new rearrangement models. The first was introduced as an alternative definition of the breakpoint distance. The breakpoint distance is one of the most straightforward genome comparison measures, but when it comes to defining it precisely for multichromosomal genomes, there is more than one way to go about it. Pevzner and Tesler gave a definition in a 2003 paper, and Tannier et al. defined it differently in 2008. In this thesis we provide yet another alternative, calling it single-cut-or-join (SCJ). We show that several genome rearrangement problems, such as genome median, genome halving and small parsimony, become easy for SCJ, and provide polynomial time algorithms for them. The second model we introduce is the Adjacency Algebraic Theory, an extension of the Algebraic Formalism proposed by Meidanis and Dias that allows the modeling of linear chromosomes, the main limitation of the original formalism, which could deal with circular chromosomes only. We believe that the algebraic formalism is an interesting alternative for solving rearrangement problems, with a different perspective that could complement the more commonly used combinatorial graph-theoretic approach. We present polynomial time algorithms to compute the algebraic distance and find rearrangement scenarios between two genomes. We show how to compute the rearrangement distance from the adjacency graph, for an easier comparison with other rearrangement distances. Finally, we show how all classic rearrangement operations can be modeled using the algebraic theory. Degree: Doctorate, Computer Science (Doutor em Ciência da Computação).
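To make the single-cut-or-join (SCJ) model concrete, the sketch below represents each genome as a set of adjacencies between gene extremities and computes the SCJ distance as the number of adjacencies that belong to exactly one of the two genomes (each such adjacency costs one cut or one join). The input encoding, a list of `(genes, circular)` pairs with signed integers for oriented genes, and the function names `adjacencies` and `scj_distance` are assumptions made for this illustration, not notation taken from the thesis.

```python
def adjacencies(genome):
    """Return the set of adjacencies of a genome.

    genome is a list of (genes, circular) pairs; genes is a list of nonzero
    signed integers and circular is a bool. Gene g has a tail (g, 't') and a
    head (g, 'h'); a positive gene is read tail to head, a negative one head
    to tail.
    """
    adj = set()
    for genes, circular in genome:
        pairs = list(zip(genes, genes[1:]))
        if circular and genes:
            pairs.append((genes[-1], genes[0]))  # close the circle
        for x, y in pairs:
            right_of_x = (abs(x), 'h' if x > 0 else 't')
            left_of_y = (abs(y), 't' if y > 0 else 'h')
            adj.add(frozenset((right_of_x, left_of_y)))
    return adj


def scj_distance(genome_a, genome_b):
    """SCJ distance: adjacencies to cut from A plus adjacencies to join to reach B."""
    a, b = adjacencies(genome_a), adjacencies(genome_b)
    return len(a - b) + len(b - a)
```

For example, reversing the middle gene of the linear chromosome `[1, 2, 3]` to obtain `[1, -2, 3]` destroys two adjacencies and creates two new ones, so the SCJ distance between the two genomes is 4.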

    Kernelization of Whitney Switches

    A fundamental theorem of Whitney from 1933 asserts that 2-connected graphs G and H are 2-isomorphic, or equivalently, their cycle matroids are isomorphic, if and only if G can be transformed into H by a series of operations called Whitney switches. In this paper we consider the quantitative question arising from Whitney's theorem: Given two 2-isomorphic graphs, can we transform one into another by applying at most k Whitney switches? This problem is already NP-complete for cycles, and we investigate its parameterized complexity. We show that the problem admits a kernel of size O(k), and thus, is fixed-parameter tractable when parameterized by k. Comment: To appear at ESA 202

    The Alternating Stock Size Problem and the Gasoline Puzzle

    Given a set S of integers whose sum is zero, consider the problem of finding a permutation of these integers such that: (i) all prefix sums of the ordering are nonnegative, and (ii) the maximum value of a prefix sum is minimized. Kellerer et al. referred to this problem as the "Stock Size Problem" and showed that it can be approximated to within 3/2. They also showed that an approximation ratio of 2 can be achieved via several simple algorithms. We consider a related problem, which we call the "Alternating Stock Size Problem", where the number of positive and negative integers in the input set S are equal. The problem is the same as above, but we are additionally required to alternate the positive and negative numbers in the output ordering. This problem also has several simple 2-approximations. We show that it can be approximated to within 1.79. Then we show that this problem is closely related to an optimization version of the gasoline puzzle due to Lovász, in which we want to minimize the size of the gas tank necessary to go around the track. We present a 2-approximation for this problem, using a natural linear programming relaxation whose feasible solutions are doubly stochastic matrices. Our novel rounding algorithm is based on a transformation that yields another doubly stochastic matrix with special properties, from which we can extract a suitable permutation.
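The objective of the Stock Size Problem can be made concrete with a small brute-force sketch: evaluate the maximum prefix sum of an ordering, reject orderings with a negative prefix sum, and enumerate all permutations to find the optimum. This is an exhaustive baseline for tiny inputs only, not any of the approximation algorithms mentioned above; the function names `stock_size` and `best_ordering` are hypothetical.

```python
from itertools import accumulate, permutations


def stock_size(order):
    """Return the maximum prefix sum of order, or None if some prefix sum is negative."""
    prefix = list(accumulate(order))
    if any(p < 0 for p in prefix):
        return None  # violates condition (i): stock would go below zero
    return max(prefix, default=0)


def best_ordering(values):
    """Exhaustively find an ordering minimizing the maximum prefix sum.

    Returns (size, ordering), or None if no ordering is feasible.
    Runs in factorial time, so it is only meant for very small inputs.
    """
    best = None
    for order in permutations(values):
        size = stock_size(order)
        if size is not None and (best is None or size < best[0]):
            best = (size, order)
    return best
```

For the input `[3, -3, 2, -2]` the optimum stock size is 3, since the first positive prefix must at some point include the largest element, and an ordering such as `2, -2, 3, -3` attains it.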