31 research outputs found

    Genome Halving by Block Interchange

    We address the problem of finding the minimal number of block interchanges (exchange of two intervals) required to transform a duplicated linear genome into a tandem duplicated linear genome. We provide a formula for the distance as well as a polynomial time algorithm for the sorting problem.

    On the PATHGROUPS approach to rapid small phylogeny

    We present a data structure enabling rapid heuristic solution to the ancestral genome reconstruction problem for given phylogenies under genomic rearrangement metrics. The efficiency of the greedy algorithm is due to fast updating of the structure during run time and a simple priority scheme for choosing the next step. Since accuracy deteriorates for sets of highly divergent genomes, we investigate strategies for improving accuracy and expanding the range of data sets where accurate reconstructions can be expected. These include a more refined priority system and a two-step look-ahead, as well as iterative local improvements based on the median version of the problem, incorporating simulated annealing. We apply this to a set of yeast genomes to corroborate a recent gene sequence-based phylogeny.

    Sorting genomes with rearrangements and segmental duplications through trajectory graphs

    We study the problem of sorting genomes under an evolutionary model that includes genomic rearrangements and segmental duplications. We propose an iterative algorithm to improve any initial evolutionary trajectory between two genomes in terms of parsimony. Our algorithm is based on a new graphical model, the trajectory graph, which models not only the final states of two genomes but also an existing evolutionary trajectory between them. We show that redundant rearrangements in the trajectory correspond to certain cycles in the trajectory graph, and prove that our algorithm converges to an optimal trajectory for any initial trajectory involving only rearrangements.

    Fractionation statistics

    Background: Paralog reduction, the loss of duplicate genes after whole genome duplication (WGD), is a pervasive process. Whether this loss proceeds gene by gene or through deletion of multi-gene DNA segments is controversial, as is the question of fractionation bias, namely whether one homeologous chromosome is more vulnerable to gene deletion than the other. Results: As a null hypothesis, we first assume deletion events, on one homeolog only, excise a geometrically distributed number of genes with unknown mean µ, and these events combine to produce deleted runs of length l, distributed approximately as a negative binomial with unknown parameter r, itself a random variable with distribution π(·). A more realistic model requires deletion events on both homeologs distributed as a truncated geometric. We simulate the distribution of run lengths l in both models, as well as the underlying π(r), as a function of µ, and show how sampling l allows us to estimate µ. We apply this to data on a total of 15 genomes descended from 6 distinct WGD events and show how to correct the bias towards shorter runs caused by genome rearrangements. Because of the difficulty in deriving π(·) analytically, we develop a deterministic recurrence to calculate each π(r) as a function of µ and the proportion of unreduced paralog pairs. Conclusions: The parameter µ can be estimated based on run lengths of single-copy regions. Estimates of µ in real data do not exclude the possibility that duplicate gene deletion is largely gene by gene, although it may sometimes involve longer segments.

    The collapse of gene complement following whole genome duplication

    Background: Genome amplification through duplication or proliferation of transposable elements has its counterpart in genome reduction, by elimination of DNA or by gene inactivation. Whether loss is primarily due to excision of random length DNA fragments or the inactivation of one gene at a time is controversial. Reduction after whole genome duplication (WGD) represents an inexorable collapse in gene complement. Results: We compare fifteen genomes descending from six eukaryotic WGD events 20-450 Mya. We characterize the collapse over time through the distribution of runs of reduced paralog pairs in duplicated segments. Descendant genomes of the same WGD event behave as replicates. Choice of paralog pairs to be reduced is random except for some resistant regions of contiguous pairs. For those paralog pairs that are reduced, conserved copies tend to concentrate on one chromosome. Conclusions: Both the contiguous regions of reduction-resistant pairs and the concentration of runs of single copy genes on a single chromosome are evidence of transcriptional co-regulation, dosage sensitivity or other functional interaction constraining the reduction process. These constraints and their evolution over time show a consistent pattern across evolutionary domains and a highly reproducible pattern, as replicates, for the several descendants of a single WGD.

    Genome aliquoting with double cut and join

    Background: The genome aliquoting problem is, given an observed genome A with n copies of each gene, presumed to descend from an n-way polyploidization event from an ordinary diploid genome B, followed by a history of chromosomal rearrangements, to reconstruct the identity of the original genome B'. The idea is to construct B', containing exactly one copy of each gene, so as to minimize the number of rearrangements d(A, B' ⊕ B' ⊕ ... ⊕ B') necessary to convert the observed genome B' ⊕ B' ⊕ ... ⊕ B' into A. Results: In this paper we make the first attempt to define and solve the genome aliquoting problem. We present a heuristic algorithm for the problem as well as the data from our experiments demonstrating its validity. Conclusion: The heuristic performs well, consistently giving a non-trivial result. The question as to the existence or non-existence of an exact solution to this problem remains open.

    Sorting by reversals, block interchanges, tandem duplications, and deletions

    Background: Finding sequences of evolutionary operations that transform one genome into another is a classic problem in comparative genomics. While most genome rearrangement algorithms assume that there is exactly one copy of each gene in both genomes, this does not reflect the biological reality very well: most of the studied genomes contain duplicated gene content, which has to be removed before applying those algorithms. However, dealing with unequal gene content is a very challenging task, and only a few algorithms allow operations like duplications and deletions. Almost all of these algorithms restrict these operations to have a fixed size. Results: In this paper, we present a heuristic algorithm to sort an ancestral genome (with unique gene content) into a genome of a descendant (with arbitrary gene content) by reversals, block interchanges, tandem duplications, and deletions, where tandem duplications and deletions are of arbitrary size. Conclusion: Experimental results show that our algorithm finds sorting sequences that are close to an optimal sorting sequence when the ancestor and the descendant are closely related. The quality of the results decreases when the genomes get more diverged or the genome size increases. Nevertheless, the calculated distances give a good approximation of the true evolutionary distances.

    Guided genome halving: hardness, heuristics and the history of the Hemiascomycetes

    Motivation: Some present day species have incurred a whole genome doubling event in their evolutionary history, and this is reflected today in patterns of duplicated segments scattered throughout their chromosomes. These duplications may be used as data to ‘halve’ the genome, i.e. to reconstruct the ancestral genome at the moment of doubling, but the solution is often highly nonunique. To resolve this problem, we take account of outgroups, external reference genomes, to guide and narrow down the search