246 research outputs found

    The rise and fall of breakpoint reuse depending on genome resolution

    Background: During evolution, large-scale genome rearrangements of chromosomes shuffle the order of homologous genome sequences ("synteny blocks") across species. Some years ago, a controversy erupted in genome rearrangement studies over whether rearrangements recur, causing breakpoints to be reused. Methods: We investigate this controversial issue using the synteny blocks for human-mouse-rat reported by Bourque et al. and a series of synteny blocks we generated using Mauve at resolutions ranging from coarse to very fine-scale. We conducted analyses to test how resolution affects the traditional measure of the breakpoint reuse rate. Results: We found that the inversion-based breakpoint reuse rate is low at fine-scale synteny block resolution and that it rises and eventually falls as synteny block resolution decreases. By analyzing the cycle structure of the breakpoint graph of human-mouse-rat synteny blocks for human-mouse and comparing it with theoretically derived distributions for random genome rearrangements, we showed that the implied genome rearrangements at each level of resolution become more “random” as synteny block resolution diminishes. At the highest synteny block resolutions the Hannenhalli-Pevzner inversion distance deviates from the Double Cut and Join distance, possibly due to small-scale transpositions or simply due to inclusion of erroneous synteny blocks. At synteny block resolutions as coarse as the Bourque et al. blocks, we show the breakpoint graph cycle structure has already converged to the pattern expected for a random distribution of synteny blocks. Conclusions: The inferred breakpoint reuse rate depends on synteny block resolution in human-mouse genome comparisons. At fine-scale resolution, the cycle structure for the transformation appears less random than at coarse resolution. Small synteny blocks may contain critical information for accurate reconstruction of genome rearrangement history and parameters.

    Evolution of whole genomes through inversions: models and algorithms for duplicates, ancestors, and edit scenarios

    Advances in sequencing technology are yielding DNA sequence data at an alarming rate, one reminiscent of Moore's law. Biologists' abilities to analyze this data, however, have not kept pace. On the other hand, the discrete and mechanical nature of the cell life-cycle has been tantalizing to computer scientists. Thus in the 1980s, pioneers of the field now called Computational Biology began to uncover a wealth of computer science problems, some confronting modern biologists and some hidden in the annals of the biological literature. In particular, many interesting twists were introduced to classical string matching, sorting, and graph problems. One such problem, first posed in 1941 but rediscovered in the early 1980s, is that of sorting by inversions (also called reversals): given two permutations, find the minimum number of inversions required to transform one into the other, where an inversion inverts the order of a subpermutation. Indeed, many genomes have evolved mostly or only through inversions. Thus it becomes possible to trace evolutionary histories by inferring sequences of such inversions that led to today's genomes from a distant common ancestor. But unlike the classic edit distance problem, where string editing was relatively simple, editing permutations in this way has proved to be more complex. In this dissertation, we extend the theory so as to make these edit distances more broadly applicable and faster to compute, and work towards more powerful tools that can accurately infer evolutionary histories. In particular, we present work that for the first time considers genomic distances between any pair of genomes, with no limitation on the number of occurrences of a gene. Next we show that there are conditions under which an ancestral genome (or one close to the true ancestor) can be reliably reconstructed. Finally we present new methodology that computes a minimum-length sequence of inversions to transform one permutation into another in, on average, O(n log n) steps, whereas the best worst-case algorithm to compute such a sequence uses O(n√n log n) steps.
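
The sorting-by-inversions problem stated in this abstract can be illustrated with a minimal brute-force sketch. It is not the dissertation's O(n log n) method: it is a plain breadth-first search over unsigned permutations, practical only for very small n, and the function name `inversion_distance` is hypothetical.

```python
from collections import deque


def inversion_distance(source, target):
    """Minimum number of unsigned inversions that turn source into target.

    Brute-force breadth-first search; illustrative only, since the number
    of states grows factorially with the permutation length.
    """
    source, target = tuple(source), tuple(target)
    if sorted(source) != sorted(target):
        raise ValueError("inputs must be arrangements of the same elements")
    n = len(source)
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        perm, dist = queue.popleft()
        if perm == target:
            return dist
        # Try every inversion of a segment perm[i..j] with i < j.
        for i in range(n):
            for j in range(i + 1, n):
                nxt = perm[:i] + perm[i:j + 1][::-1] + perm[j + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
    raise AssertionError("target is always reachable by inversions")
```

For example, `inversion_distance((2, 3, 1), (1, 2, 3))` explores single inversions first and only then pairs of inversions, so the first time the target is dequeued its depth is the minimum.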

    Approximating reversal distance for strings with bounded number of duplicates

    For a string $A = a_1 \ldots a_n$, a reversal $\rho(i,j)$, $1 \le i \le j \le n$, transforms the string $A$ into a string $A' = a_1 \ldots a_{i-1} a_j a_{j-1} \ldots a_i a_{j+1} \ldots a_n$, that is, the reversal $\rho(i,j)$ reverses the order of symbols in the substring $a_i \ldots a_j$ of $A$. In the case of signed strings, where each symbol is given a sign + or -, the reversal operation also flips the sign of each symbol in the reversed substring. Given two strings, $A$ and $B$, signed or unsigned, sorting by reversals (SBR) is the problem of finding the minimum number of reversals that transform the string $A$ into the string $B$. Traditionally, the problem was studied for permutations, that is, for strings in which every symbol appears exactly once. We consider a generalization of the problem, $k$-SBR, and allow each symbol to appear at most $k$ times in each string, for some $k \ge 1$. The main result of the paper is an $O(k^2)$-approximation algorithm running in time $O(n)$. For instances with $3 < k \le O(\log n \log^* n)$, this is the best known approximation algorithm for $k$-SBR and, moreover, it is faster than the previous best approximation algorithm.
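
The reversal operation defined in this abstract can be sketched in a few lines. This is a minimal illustration, not code from the paper: the function name `reversal` is hypothetical, and representing signed symbols as nonzero integers (so that negation flips the sign) is an assumption made for the example.

```python
def reversal(symbols, i, j, signed=False):
    """Apply the reversal rho(i, j) using 1-based inclusive indices.

    Reverses the order of symbols i..j. If signed is True, symbols are
    assumed to be nonzero integers and each reversed symbol is negated.
    """
    n = len(symbols)
    if not (1 <= i <= j <= n):
        raise ValueError("need 1 <= i <= j <= n")
    segment = list(symbols[i - 1:j])[::-1]
    if signed:
        segment = [-s for s in segment]
    return list(symbols[:i - 1]) + segment + list(symbols[j:])
```

With unsigned input, `reversal(list("abcde"), 2, 4)` reverses the middle three symbols; with `signed=True`, the reversed symbols also change sign, as in the signed variant of the problem.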

    Genome dedoubling by DCJ and reversal

    Background: Segmental duplications in genomes have been studied for many years. Recently, several studies have highlighted a biological phenomenon called breakpoint-duplication that apparently associates a significant proportion of segmental duplications in Mammals, and the Drosophila species group, to breakpoints in rearrangement events. Results: In this paper, we introduce and study a combinatorial problem, inspired by the breakpoint-duplication phenomenon, called the Genome Dedoubling Problem. It consists of finding a minimum length rearrangement scenario required to transform a genome with duplicated segments into a non-duplicated genome such that duplications are caused by rearrangement breakpoints. We show that the problem, in the Double-Cut-and-Join (DCJ) and the reversal rearrangement models, can be reduced to an APX-complete problem, and we provide algorithms for the Genome Dedoubling Problem with 2-approximable parts. We apply the methods for the reconstruction of a non-duplicated ancestor of Drosophila yakuba. Conclusions: We present the Genome Dedoubling Problem, and describe two algorithms solving the problem in the DCJ model and the reversal model. The usefulness of the problems and the methods is shown through an application to real Drosophila data.

    Degenerate crossing number and signed reversal distance

    The degenerate crossing number of a graph is the minimum number of transverse crossings among all its drawings, where edges are represented as simple arcs and multiple edges passing through the same point are counted as a single crossing. Interpreting each crossing as a cross-cap induces an embedding into a non-orientable surface. In 2007, Mohar showed that the degenerate crossing number of a graph is at most its non-orientable genus and he conjectured that these quantities are equal for every graph. He also made the stronger conjecture that this also holds for any loopless pseudotriangulation with a fixed embedding scheme. In this paper, we prove a structure theorem that almost completely classifies the loopless 2-vertex embedding schemes for which the degenerate crossing number equals the non-orientable genus. In particular, we provide a counterexample to Mohar's stronger conjecture, but show that in the vast majority of the 2-vertex cases, the conjecture does hold. The reversal distance between two signed permutations is the minimum number of reversals that transform one permutation into the other. If we represent the trajectory of each element of a signed permutation under successive reversals by a simple arc, we obtain a drawing of a 2-vertex embedding scheme with degenerate crossings. Our main result is proved by leveraging this connection and a classical result in genome rearrangement (the Hannenhalli-Pevzner algorithm) and can also be understood as an extension of this algorithm when the reversals do not necessarily happen in a monotone order. Comment: Appears in the Proceedings of the 31st International Symposium on Graph Drawing and Network Visualization (GD 2023)

    Effect of scale on long-range random graphs and chromosomal inversions

    We consider bond percolation on $n$ vertices on a circle where edges are permitted between vertices whose spacing is at most some number $L = L(n)$. We show that the resulting random graph gets a giant component when $L \gg (\log n)^2$ (when the mean degree exceeds 1) but not when $L \ll \log n$. The proof uses comparisons to branching random walks. We also consider a related process of random transpositions of $n$ particles on a circle, where transpositions only occur again if the spacing is at most $L$. Then the process exhibits the mean-field behavior described by Berestycki and Durrett if and only if $L(n)$ tends to infinity, no matter how slowly. Thus there are regimes where the random graph has no giant component but the random walk nevertheless has a phase transition. We discuss possible relevance of these results for a dataset coming from D. repleta and D. melanogaster and for the typical length of chromosomal inversions. Comment: Published at http://dx.doi.org/10.1214/11-AAP793 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org)
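
The long-range percolation model in this abstract can be explored numerically with a small simulation. The sketch below is illustrative only and rests on stated assumptions: each permitted edge (spacing at most L) is taken to be open independently with probability `mean_degree / (2 * L)`, a parametrization chosen here so that the expected degree equals `mean_degree`, and the function name `largest_component_fraction` is hypothetical.

```python
import random


def largest_component_fraction(n, L, mean_degree, seed=None):
    """Fraction of the n circle vertices in the largest open cluster.

    Assumption: every pair of vertices at circular spacing <= L is joined
    independently with probability p = mean_degree / (2 * L).
    """
    if not (1 <= L and 2 * L < n):
        raise ValueError("need 1 <= L and 2 * L < n")
    p = mean_degree / (2 * L)
    if not (0.0 <= p <= 1.0):
        raise ValueError("mean_degree must lie between 0 and 2 * L")
    rng = random.Random(seed)
    parent = list(range(n))
    size = [1] * n

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for v in range(n):
        # Each pair {v, v + d} with 1 <= d <= L is visited exactly once
        # because 2 * L < n.
        for d in range(1, L + 1):
            if rng.random() < p:
                a, b = find(v), find((v + d) % n)
                if a != b:
                    if size[a] < size[b]:
                        a, b = b, a
                    parent[b] = a
                    size[a] += size[b]
    return max(size[find(v)] for v in range(n)) / n
```

Running this for a fixed mean degree above 1 with different choices of L relative to log n gives a rough empirical feel for the regimes the abstract distinguishes; it does not reproduce the paper's proofs.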