2,482 research outputs found

    Comparative genomics: multiple genome rearrangement and efficient algorithm development

    Get PDF
    Multiple genome rearrangement by signed reversal is discussed: For a collection of genomes represented by signed permutations, reconstruct their evolutionary history by using signed reversals, i.e. find a bifurcating tree where sampled genomes are assigned to leaf nodes and ancestral genomes (i.e. signed permutations) are hypothesized at internal nodes such that the total reversal distance summed over all edges of the tree is minimized. It is equivalent to finding an optimal Steiner tree that connects the given genomes by signed reversal paths. The key for the problem is to reconstruct all optimal Steiner nodes/ancestral genomes.;The problem is NP-hard and can only be solved by efficient approximation algorithms. Various algorithms/programs have been designed to solve the problem, such as BPAnalysis, GRAPPA, grid search algorithm, MGR greedy split algorithm (Chapter 1). However, they may have expensive computational costs or low inference accuracy. In this thesis, several new algorithms are developed, including nearest path search algorithm (Chapter 2), neighbor-perturbing algorithm (Chapter 3), branch-and-bound algorithm (Chapter 3), perturbing-improving algorithm (Chapter 4), partitioning algorithm (Chapter 5), etc. With theoretical proofs, computer simulations, and biological applications, these algorithms are shown to be 2-approximation algorithms and more efficient than the existing algorithms

    Sobre modelos de rearranjo de genomas

    Get PDF
    Orientador: JoĂŁo MeidanisTese (doutorado) - Universidade Estadual de Campinas, Instituto de ComputaçãoResumo: Rearranjo de genomas Ă© o nome dado a eventos onde grandes blocos de DNA trocam de posição durante o processo evolutivo. Com a crescente disponibilidade de sequĂȘncias completas de DNA, a anĂĄlise desse tipo de eventos pode ser uma importante ferramenta para o entendimento da genĂŽmica evolutiva. VĂĄrios modelos matemĂĄticos de rearranjo de genomas foram propostos ao longo dos Ășltimos vinte anos. Nesta tese, desenvolvemos dois novos modelos. O primeiro foi proposto como uma definição alternativa ao conceito de distĂąncia de breakpoint. Essa distĂąncia Ă© uma das mais simples medidas de rearranjo, mas ainda nĂŁo hĂĄ um consenso quanto Ă  sua definição para o caso de genomas multi-cromossomais. Pevzner e Tesler deram uma definição em 2003 e Tannier et al. a definiram de forma diferente em 2008. Nesta tese, nĂłs desenvolvemos uma outra alternativa, chamada de single-cut-or-join (SCJ). NĂłs mostramos que, no modelo SCJ, alĂ©m da distĂąncia, vĂĄrios problemas clĂĄssicos de rearranjo, como a mediana de rearranjo, genome halving e pequena parcimĂŽnia sĂŁo fĂĄceis, e apresentamos algoritmos polinomiais para eles. O segundo modelo que apresentamos Ă© o formalismo algĂ©brico por adjacĂȘncias, uma extensĂŁo do formalismo algĂ©brico proposto por Meidanis e Dias, que permite a modelagem de cromossomos lineares. Esta era a principal limitação do formalismo original, que sĂł tratava de cromossomos circulares. Apresentamos algoritmos polinomiais para o cĂĄlculo da distĂąncia algĂ©brica e tambĂ©m para encontrar cenĂĄrios de rearranjo entre dois genomas. TambĂ©m mostramos como calcular a distĂąncia algĂ©brica atravĂ©s do grafo de adjacĂȘncias, para facilitar a comparação com outras distĂąncias de rearranjo. Por fim, mostramos como modelar todas as operaçÔes clĂĄssicas de rearranjo de genomas utilizando o formalismo algĂ©bricoAbstract: Genome rearrangements are events where large blocks of DNA exchange places during evolution. With the growing availability of whole genome data, the analysis of these events can be a very important and promising tool for understanding evolutionary genomics. Several mathematical models of genome rearrangement have been proposed in the last 20 years. In this thesis, we propose two new rearrangement models. The first was introduced as an alternative definition of the breakpoint distance. The breakpoint distance is one of the most straightforward genome comparison measures, but when it comes to defining it precisely for multichromosomal genomes, there is more than one way to go about it. Pevzner and Tesler gave a definition in a 2003 paper, and Tannier et al. defined it differently in 2008. In this thesis we provide yet another alternative, calling it single-cut-or-join (SCJ). We show that several genome rearrangement problems, such as genome median, genome halving and small parsimony, become easy for SCJ, and provide polynomial time algorithms for them. The second model we introduce is the Adjacency Algebraic Theory, an extension of the Algebraic Formalism proposed by Meidanis and Dias that allows the modeling of linear chromosomes, the main limitation of the original formalism, which could deal with circular chromosomes only. We believe that the algebraic formalism is an interesting alternative for solving rearrangement problems, with a different perspective that could complement the more commonly used combinatorial graph-theoretic approach. We present polynomial time algorithms to compute the algebraic distance and find rearrangement scenarios between two genomes. We show how to compute the rearrangement distance from the adjacency graph, for an easier comparison with other rearrangement distances. Finally, we show how all classic rearrangement operations can be modeled using the algebraic theoryDoutoradoCiĂȘncia da ComputaçãoDoutor em CiĂȘncia da Computaçã

    Can a permutation be sorted by best short swaps?

    Get PDF
    A short swap switches two elements with at most one element caught between them. Sorting permutation by short swaps asks to find a shortest short swap sequence to transform a permutation into another. A short swap can eliminate at most three inversions. It is still open for whether a permutation can be sorted by short swaps each of which can eliminate three inversions. In this paper, we present a polynomial time algorithm to solve the problem, which can decide whether a permutation can be sorted by short swaps each of which can eliminate 3 inversions in O(n) time, and if so, sort the permutation by such short swaps in O(n^2) time, where n is the number of elements in the permutation. A short swap can cause the total length of two element vectors to decrease by at most 4. We further propose an algorithm to recognize a permutation which can be sorted by short swaps each of which can cause the element vector length sum to decrease by 4 in O(n) time, and if so, sort the permutation by such short swaps in O(n^2) time. This improves upon the O(n^2) algorithm proposed by Heath and Vergara to decide whether a permutation is so called lucky

    New Algorithms for Structure Informed Genome Rearrangement

    Get PDF

    Estimating true evolutionary distances under the DCJ model

    Get PDF
    Motivation: Modern techniques can yield the ordering and strandedness of genes on each chromosome of a genome; such data already exists for hundreds of organisms. The evolutionary mechanisms through which the set of the genes of an organism is altered and reordered are of great interest to systematists, evolutionary biologists, comparative genomicists and biomedical researchers. Perhaps the most basic concept in this area is that of evolutionary distance between two genomes: under a given model of genomic evolution, how many events most likely took place to account for the difference between the two genomes? Results: We present a method to estimate the true evolutionary distance between two genomes under the ‘double-cut-and-join' (DCJ) model of genome rearrangement, a model under which a single multichromosomal operation accounts for all genomic rearrangement events: inversion, transposition, translocation, block interchange and chromosomal fusion and fission. Our method relies on a simple structural characterization of a genome pair and is both analytically and computationally tractable. We provide analytical results to describe the asymptotic behavior of genomes under the DCJ model, as well as experimental results on a wide variety of genome structures to exemplify the very high accuracy (and low variance) of our estimator. Our results provide a tool for accurate phylogenetic reconstruction from multichromosomal gene rearrangement data as well as a theoretical basis for refinements of the DCJ model to account for biological constraints. Availability: All of our software is available in source form under GPL at http://lcbb.epfl.ch Contact: [email protected]

    Estimating true evolutionary distances under the DCJ model

    Get PDF
    Motivation: Modern techniques can yield the ordering and strandedness of genes on each chromosome of a genome; such data already exists for hundreds of organisms. The evolutionary mechanisms through which the set of the genes of an organism is altered and reordered are of great interest to systematists, evolutionary biologists, comparative genomicists and biomedical researchers. Perhaps the most basic concept in this area is that of evolutionary distance between two genomes: under a given model of genomic evolution, how many events most likely took place to account for the difference between the two genomes
    • 

    corecore