200 research outputs found

    Genome Halving by Block Interchange

    Get PDF
    We address the problem of finding the minimal number of block interchanges (exchange of two intervals) required to transform a duplicated linear genome into a tandem duplicated linear genome. We provide a formula for the distance as well as a polynomial time algorithm for the sorting problem

    Genome Halving by Block Interchange

    No full text
    International audienceWe address the problem of finding the minimal number of block interchanges (exchange of two intervals) required to transform a duplicated linear genome into a tandem duplicated linear genome. We provide a formula for the distance as well as a polynomial time algorithm for the sorting problem

    Sobre modelos de rearranjo de genomas

    Get PDF
    Orientador: João MeidanisTese (doutorado) - Universidade Estadual de Campinas, Instituto de ComputaçãoResumo: Rearranjo de genomas é o nome dado a eventos onde grandes blocos de DNA trocam de posição durante o processo evolutivo. Com a crescente disponibilidade de sequências completas de DNA, a análise desse tipo de eventos pode ser uma importante ferramenta para o entendimento da genômica evolutiva. Vários modelos matemáticos de rearranjo de genomas foram propostos ao longo dos últimos vinte anos. Nesta tese, desenvolvemos dois novos modelos. O primeiro foi proposto como uma definição alternativa ao conceito de distância de breakpoint. Essa distância é uma das mais simples medidas de rearranjo, mas ainda não há um consenso quanto à sua definição para o caso de genomas multi-cromossomais. Pevzner e Tesler deram uma definição em 2003 e Tannier et al. a definiram de forma diferente em 2008. Nesta tese, nós desenvolvemos uma outra alternativa, chamada de single-cut-or-join (SCJ). Nós mostramos que, no modelo SCJ, além da distância, vários problemas clássicos de rearranjo, como a mediana de rearranjo, genome halving e pequena parcimônia são fáceis, e apresentamos algoritmos polinomiais para eles. O segundo modelo que apresentamos é o formalismo algébrico por adjacências, uma extensão do formalismo algébrico proposto por Meidanis e Dias, que permite a modelagem de cromossomos lineares. Esta era a principal limitação do formalismo original, que só tratava de cromossomos circulares. Apresentamos algoritmos polinomiais para o cálculo da distância algébrica e também para encontrar cenários de rearranjo entre dois genomas. Também mostramos como calcular a distância algébrica através do grafo de adjacências, para facilitar a comparação com outras distâncias de rearranjo. Por fim, mostramos como modelar todas as operações clássicas de rearranjo de genomas utilizando o formalismo algébricoAbstract: Genome rearrangements are events where large blocks of DNA exchange places during evolution. With the growing availability of whole genome data, the analysis of these events can be a very important and promising tool for understanding evolutionary genomics. Several mathematical models of genome rearrangement have been proposed in the last 20 years. In this thesis, we propose two new rearrangement models. The first was introduced as an alternative definition of the breakpoint distance. The breakpoint distance is one of the most straightforward genome comparison measures, but when it comes to defining it precisely for multichromosomal genomes, there is more than one way to go about it. Pevzner and Tesler gave a definition in a 2003 paper, and Tannier et al. defined it differently in 2008. In this thesis we provide yet another alternative, calling it single-cut-or-join (SCJ). We show that several genome rearrangement problems, such as genome median, genome halving and small parsimony, become easy for SCJ, and provide polynomial time algorithms for them. The second model we introduce is the Adjacency Algebraic Theory, an extension of the Algebraic Formalism proposed by Meidanis and Dias that allows the modeling of linear chromosomes, the main limitation of the original formalism, which could deal with circular chromosomes only. We believe that the algebraic formalism is an interesting alternative for solving rearrangement problems, with a different perspective that could complement the more commonly used combinatorial graph-theoretic approach. We present polynomial time algorithms to compute the algebraic distance and find rearrangement scenarios between two genomes. We show how to compute the rearrangement distance from the adjacency graph, for an easier comparison with other rearrangement distances. Finally, we show how all classic rearrangement operations can be modeled using the algebraic theoryDoutoradoCiência da ComputaçãoDoutor em Ciência da Computaçã

    Sorting by reversals, block interchanges, tandem duplications, and deletions

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Finding sequences of evolutionary operations that transform one genome into another is a classic problem in comparative genomics. While most of the genome rearrangement algorithms assume that there is exactly one copy of each gene in both genomes, this does not reflect the biological reality very well – most of the studied genomes contain duplicated gene content, which has to be removed before applying those algorithms. However, dealing with unequal gene content is a very challenging task, and only few algorithms allow operations like duplications and deletions. Almost all of these algorithms restrict these operations to have a fixed size.</p> <p>Results</p> <p>In this paper, we present a heuristic algorithm to sort an ancestral genome (with unique gene content) into a genome of a descendant (with arbitrary gene content) by reversals, block interchanges, tandem duplications, and deletions, where tandem duplications and deletions are of arbitrary size.</p> <p>Conclusion</p> <p>Experimental results show that our algorithm finds sorting sequences that are close to an optimal sorting sequence when the ancestor and the descendant are closely related. The quality of the results decreases when the genomes get more diverged or the genome size increases. Nevertheless, the calculated distances give a good approximation of the true evolutionary distances.</p

    Multichromosomal median and halving problems under different genomic distances

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Genome median and genome halving are combinatorial optimization problems that aim at reconstructing ancestral genomes as well as the evolutionary events leading from the ancestor to extant species. Exploring complexity issues is a first step towards devising efficient algorithms. The complexity of the median problem for unichromosomal genomes (permutations) has been settled for both the breakpoint distance and the reversal distance. Although the multichromosomal case has often been assumed to be a simple generalization of the unichromosomal case, it is also a relaxation so that complexity in this context does not follow from existing results, and is open for all distances.</p> <p>Results</p> <p>We settle here the complexity of several genome median and halving problems, including a surprising polynomial result for the breakpoint median and guided halving problems in genomes with circular and linear chromosomes, showing that the multichromosomal problem is actually easier than the unichromosomal problem. Still other variants of these problems are NP-complete, including the DCJ double distance problem, previously mentioned as an open question. We list the remaining open problems.</p> <p>Conclusion</p> <p>This theoretical study clears up a wide swathe of the algorithmical study of genome rearrangements with multiple multichromosomal genomes.</p

    Genome aliquoting with double cut and join

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>The <it>genome aliquoting probem </it>is, given an observed genome <it>A </it>with <it>n </it>copies of each gene, presumed to descend from an <it>n</it>-way polyploidization event from an ordinary diploid genome <it>B</it>, followed by a history of chromosomal rearrangements, to reconstruct the identity of the original genome <it>B'</it>. The idea is to construct <it>B'</it>, containing exactly one copy of each gene, so as to minimize the number of rearrangements <it>d</it>(<it>A, B' </it>⊕ <it>B' </it>⊕ ... ⊕ <it>B'</it>) necessary to convert the observed genome <it>B' </it>⊕ <it>B' </it>⊕ ... ⊕ <it>B' </it>into <it>A</it>.</p> <p>Results</p> <p>In this paper we make the first attempt to define and solve the genome aliquoting problem. We present a heuristic algorithm for the problem as well the data from our experiments demonstrating its validity.</p> <p>Conclusion</p> <p>The heuristic performs well, consistently giving a non-trivial result. The question as to the existence or non-existence of an exact solution to this problem remains open.</p

    On pairwise distances and median score of three genomes under DCJ

    Get PDF
    In comparative genomics, the rearrangement distance between two genomes (equal the minimal number of genome rearrangements required to transform them into a single genome) is often used for measuring their evolutionary remoteness. Generalization of this measure to three genomes is known as the median score (while a resulting genome is called median genome). In contrast to the rearrangement distance between two genomes which can be computed in linear time, computing the median score for three genomes is NP-hard. This inspires a quest for simpler and faster approximations for the median score, the most natural of which appears to be the halved sum of pairwise distances which in fact represents a lower bound for the median score. In this work, we study relationship and interplay of pairwise distances between three genomes and their median score under the model of Double-Cut-and-Join (DCJ) rearrangements. Most remarkably we show that while a rearrangement may change the sum of pairwise distances by at most 2 (and thus change the lower bound by at most 1), even the most "powerful" rearrangements in this respect that increase the lower bound by 1 (by moving one genome farther away from each of the other two genomes), which we call strong, do not necessarily affect the median score. This observation implies that the two measures are not as well-correlated as one's intuition may suggest. We further prove that the median score attains the lower bound exactly on the triples of genomes that can be obtained from a single genome with strong rearrangements. While the sum of pairwise distances with the factor 2/3 represents an upper bound for the median score, its tightness remains unclear. Nonetheless, we show that the difference of the median score and its lower bound is not bounded by a constant.Comment: Proceedings of the 10-th Annual RECOMB Satellite Workshop on Comparative Genomics (RECOMB-CG), 2012. (to appear
    • …
    corecore