255 research outputs found

Sorting signed permutations by reversals, revisited

Author: Kaplan Haim
Verbin Elad
Publication venue: Elsevier Inc.
Publication date: 31/05/2005

Abstract: The problem of sorting signed permutations by reversals (SBR) is a fundamental problem in computational molecular biology. The goal is, given a signed permutation, to find a shortest sequence of reversals that transforms it into the positive identity permutation, where a reversal is the operation of taking a segment of the permutation, reversing it, and flipping the signs of its elements. In this paper we describe a randomized algorithm for SBR. The algorithm tries to sort the permutation by repeatedly performing a random oriented reversal. This process is in fact a random walk on the graph where permutations are the nodes and an arc from π to π′ corresponds to an oriented reversal that transforms π to π′. We show that if this random walk stops at the identity permutation, then we have found a shortest sequence. We give empirical evidence that this process indeed succeeds with high probability on a random permutation. To implement our algorithm we describe a data structure for maintaining a permutation that allows drawing an oriented reversal uniformly at random and performing it in sub-linear time. With this data structure we can implement the random walk in O(n^(3/2) log n) time, thus obtaining an algorithm for SBR that almost always runs in sub-quadratic time. The data structures we present may also be of independent interest for developing other algorithms for SBR, and for other problems. Finally, we present the first efficient parallel algorithm for SBR. We obtain this result by developing a fast implementation of the recent algorithm of Bergeron (Proceedings of CPM, 2001, pp. 106–117) for sorting signed permutations by reversals that is parallelizable. Our implementation runs in O(n^2 log n) time on a regular RAM, and in O(n log n) time on a PRAM using n processors.
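To make the reversal operation and the random walk from this abstract concrete, here is a minimal Python sketch. It is not the paper's sub-linear data structure: it enumerates oriented reversals naively, assuming the oriented-pair rule from Bergeron's elementary presentation (two elements of consecutive absolute value and opposite sign, with the permutation framed by 0 and n+1). The function names `reversal`, `oriented_reversals`, and `random_oriented_walk` are illustrative assumptions, not names from the paper.

```python
import random


def reversal(perm, i, j):
    """Reverse perm[i..j] (inclusive, 0-based) and flip the sign of each element."""
    return perm[:i] + [-x for x in reversed(perm[i:j + 1])] + perm[j + 1:]


def oriented_reversals(perm):
    """List the (i, j) reversals induced by oriented pairs (naive quadratic scan)."""
    n = len(perm)
    framed = [0] + perm + [n + 1]  # frame with 0 and n+1, both treated as positive
    found = set()
    for a in range(len(framed)):
        for b in range(a + 1, len(framed)):
            x, y = framed[a], framed[b]
            # Oriented pair: consecutive absolute values, opposite signs.
            if abs(abs(x) - abs(y)) == 1 and (x < 0) != (y < 0):
                # x + y == +1: reverse framed[a..b-1]; x + y == -1: reverse framed[a+1..b].
                lo, hi = (a, b - 1) if x + y == 1 else (a + 1, b)
                found.add((lo - 1, hi - 1))  # shift back to unframed indices
    return sorted(found)


def random_oriented_walk(perm, rng=random):
    """Apply random oriented reversals until none remain; return (final, steps)."""
    current, steps = list(perm), []
    while True:
        candidates = oriented_reversals(current)
        if not candidates:
            return current, steps
        i, j = rng.choice(candidates)
        current = reversal(current, i, j)
        steps.append((i, j))


final, steps = random_oriented_walk([1, -3, -2])
print(final, steps)
```

If the walk ends at the identity, the abstract's claim is that the recorded steps form a shortest sorting sequence. Otherwise the walk has stopped at an all-positive, unsorted permutation, and this sketch simply returns that permutation.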

Elsevier - Publisher Connector

Sobre modelos de rearranjo de genomas (On genome rearrangement models)

Author: Feijão Pedro Cipriano, 1975-
Publication venue: [s.n.]
Publication date: 21/08/2018

Advisor: João Meidanis. Thesis (doctorate), Universidade Estadual de Campinas, Instituto de Computação.
Abstract: Genome rearrangements are events where large blocks of DNA exchange places during evolution. With the growing availability of whole genome data, the analysis of these events can be a very important and promising tool for understanding evolutionary genomics. Several mathematical models of genome rearrangement have been proposed in the last 20 years. In this thesis, we propose two new rearrangement models. The first was introduced as an alternative definition of the breakpoint distance. The breakpoint distance is one of the most straightforward genome comparison measures, but when it comes to defining it precisely for multichromosomal genomes, there is more than one way to go about it. Pevzner and Tesler gave a definition in a 2003 paper, and Tannier et al. defined it differently in 2008. In this thesis we provide yet another alternative, calling it single-cut-or-join (SCJ). We show that several genome rearrangement problems, such as genome median, genome halving and small parsimony, become easy for SCJ, and provide polynomial time algorithms for them. The second model we introduce is the Adjacency Algebraic Theory, an extension of the Algebraic Formalism proposed by Meidanis and Dias that allows the modeling of linear chromosomes, the main limitation of the original formalism, which could deal with circular chromosomes only. We believe that the algebraic formalism is an interesting alternative for solving rearrangement problems, with a different perspective that could complement the more commonly used combinatorial graph-theoretic approach. We present polynomial time algorithms to compute the algebraic distance and find rearrangement scenarios between two genomes. We show how to compute the rearrangement distance from the adjacency graph, for an easier comparison with other rearrangement distances. Finally, we show how all classic rearrangement operations can be modeled using the algebraic theory.
Level: Doctorate. Area: Computer Science. Degree: Doctor of Computer Science.
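The abstract names the single-cut-or-join (SCJ) model without giving a formula. The following is a minimal sketch, assuming the usual formulation in which a genome is a set of adjacencies between gene extremities and the SCJ distance counts the adjacencies present in exactly one of the two genomes (each one costs a cut or a join). It handles linear chromosomes only and assumes both genomes have the same genes with no duplicates. The names `extremities`, `adjacencies`, and `scj_distance` are hypothetical.

```python
def extremities(gene):
    """Return the (left, right) extremities of a signed gene; 't' = tail, 'h' = head."""
    g = abs(gene)
    return ((g, "t"), (g, "h")) if gene > 0 else ((g, "h"), (g, "t"))


def adjacencies(genome):
    """Collect adjacencies of a genome given as a list of linear chromosomes."""
    adj = set()
    for chromosome in genome:
        for left_gene, right_gene in zip(chromosome, chromosome[1:]):
            _, right_end = extremities(left_gene)   # right end of the left gene
            left_end, _ = extremities(right_gene)   # left end of the right gene
            adj.add(frozenset((right_end, left_end)))
    return adj


def scj_distance(genome_a, genome_b):
    """Count adjacencies to cut from A plus adjacencies to join to reach B."""
    adj_a, adj_b = adjacencies(genome_a), adjacencies(genome_b)
    return len(adj_a - adj_b) + len(adj_b - adj_a)


# Inverting gene 2 costs two cuts and two joins under this model.
print(scj_distance([[1, 2, 3]], [[1, -2, 3]]))
```

Because the distance is a symmetric set difference, it can be computed in time linear in the number of adjacencies, which is consistent with the abstract's point that problems become easy under SCJ.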

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio da Producao Cientifica e Intelectual da Unicamp

Sorting by reversals, block interchanges, tandem duplications, and deletions

Author: D Bader
D Bertrand
D Christie
D Sankoff
E Tannier
G Blanc
H Nagamochi
I Elias
J Mixtacki
K Swenson
M Marron
M Ozery-Flato
Martin Bader
N El-Mabrouk
R Warren
S Hannenhalli
S Yancopoulos
T Hartman
V Bafna
X Chen
Y Han
Z Fu
Publication venue: BioMed Central
Publication date: 01/01/2009

Abstract. Background: Finding sequences of evolutionary operations that transform one genome into another is a classic problem in comparative genomics. While most genome rearrangement algorithms assume that there is exactly one copy of each gene in both genomes, this does not reflect biological reality very well: most of the studied genomes contain duplicated gene content, which has to be removed before applying those algorithms. However, dealing with unequal gene content is a very challenging task, and only a few algorithms allow operations like duplications and deletions. Almost all of these algorithms restrict these operations to have a fixed size. Results: In this paper, we present a heuristic algorithm to sort an ancestral genome (with unique gene content) into a genome of a descendant (with arbitrary gene content) by reversals, block interchanges, tandem duplications, and deletions, where tandem duplications and deletions are of arbitrary size. Conclusion: Experimental results show that our algorithm finds sorting sequences that are close to an optimal sorting sequence when the ancestor and the descendant are closely related. The quality of the results decreases when the genomes get more diverged or the genome size increases. Nevertheless, the calculated distances give a good approximation of the true evolutionary distances.
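For readers unfamiliar with the operations named in this abstract, the sketch below shows their standard definitions on a genome stored as a Python list of gene identifiers. It only illustrates the operations themselves, not the paper's heuristic. The function names and the half-open index convention are assumptions made for this example.

```python
def block_interchange(genome, i, j, k, l):
    """Swap the non-overlapping segments genome[i:j] and genome[k:l] (i < j <= k < l)."""
    return genome[:i] + genome[k:l] + genome[j:k] + genome[i:j] + genome[l:]


def tandem_duplication(genome, i, j):
    """Insert a copy of genome[i:j] immediately after the original segment."""
    return genome[:j] + genome[i:j] + genome[j:]


def deletion(genome, i, j):
    """Remove the segment genome[i:j]."""
    return genome[:i] + genome[j:]


ancestor = [1, 2, 3, 4, 5]
print(block_interchange(ancestor, 0, 1, 3, 5))  # swaps [1] with [4, 5]
print(tandem_duplication([1, 2, 3], 1, 3))      # duplicates the segment [2, 3]
print(deletion([1, 2, 3, 2, 3], 1, 3))          # removes one copy of [2, 3]
```

Segments here may have any length, matching the abstract's statement that tandem duplications and deletions are of arbitrary size.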

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

On the Inversion-Indel Distance

Author: Dias Vieira Braga Marília
Stoye Jens
Willing Eyla
Zaccaria Simone
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2013

Willing E, Zaccaria S, Dias Vieira Braga M, Stoye J. On the Inversion-Indel Distance. BMC Bioinformatics. 2013;14(Suppl 15: Proc. of RECOMB-CG 2013): S3.
Background: The inversion distance, that is, the distance between two unichromosomal genomes with the same content allowing only inversions of DNA segments, can be computed thanks to a pioneering approach of Hannenhalli and Pevzner in 1995. In 2000, El-Mabrouk extended the inversion model to allow the comparison of unichromosomal genomes with unequal contents, thus allowing insertions and deletions of DNA segments besides inversions. However, an exact algorithm was presented only for the case in which we have insertions alone and no deletions (or vice versa), while a heuristic was provided for the symmetric case, which allows both insertions and deletions and is called the inversion-indel distance. In 2005, Yancopoulos, Attie and Friedberg started a new branch of research by introducing the generic double cut and join (DCJ) operation, which can represent several genome rearrangements (including inversions). Among others, the DCJ model gave rise to two important results. First, it has been shown that the inversion distance can be computed in a simpler way with the help of the DCJ operation. Second, the DCJ operation originated the DCJ-indel distance, which allows the comparison of genomes with unequal contents, considering DCJ, insertions and deletions, and can be computed in linear time. Results: In the present work we put these two results together to solve an open problem, showing that, when the graph that represents the relation between the two compared genomes has no bad components, the inversion-indel distance is equal to the DCJ-indel distance. We also give a lower and an upper bound for the inversion-indel distance in the presence of bad components.

Crossref

Springer - Publisher Connector

Publications at Bielefeld University

Reconstructing the Genomic Architecture of Mammalian Ancestors Using Multispecies Comparative Maps

Author: Bourque Guillaume
Murphy William J.
O'Brien Stephen J.
Pevzner Pavel
Tesler Glenn
Publication venue: NSUWorks
Publication date: 01/11/2003

Rapidly developing comparative gene maps in selected mammal species are providing an opportunity to reconstruct the genomic architecture of mammalian ancestors and study rearrangements that transformed this ancestral genome into existing mammalian genomes. Here, the recently developed Multiple Genome Rearrangement (MGR) algorithm is applied to human, mouse, cat and cattle comparative maps (with 311-470 shared markers) to impute the ancestral mammalian genome. Reconstructed ancestors consist of 70-100 conserved segments shared across the genomes that have been exchanged by rearrangement events along the ordinal lineages leading to modern species genomes. Genomic distances between species, dominated by inversions (reversals) and translocations, are presented in a first multispecies attempt using ordered mapping data to reconstruct the evolutionary exchanges that preceded modern placental mammal genomes

PubMed Central

NSU Works

A Phylogenomic Study of Human, Dog, and Mouse

Author: Adrian Schneider
Gaston Gonnet
Gina Cannarozzi
International Human Genome Consortium
Mouse Genome Sequencing Consortium
Philip E Bourne
Publication venue: Public Library of Science
Publication date: 01/01/2007

In recent years the phylogenetic relationship of mammalian orders has been addressed in a number of molecular studies. These analyses have frequently yielded inconsistent results with respect to some basal ordinal relationships. For example, the relative placement of primates, rodents, and carnivores has differed in various studies. Here, we attempt to resolve this phylogenetic problem by using data from completely sequenced nuclear genomes to base the analyses on the largest possible amount of data. To minimize the risk of reconstruction artifacts, the trees were reconstructed under different criteria—distance, parsimony, and likelihood. For the distance trees, distance metrics that measure independent phenomena (amino acid replacement, synonymous substitution, and gene reordering) were used, as it is highly improbable that all of the trees would be affected the same way by any reconstruction artifact. In contradiction to the currently favored classification, our results based on full-genome analysis of the phylogenetic relationship between human, dog, and mouse yielded overwhelming support for a primate–carnivore clade with the exclusion of rodents

Repository for Publications and Research Data

Crossref

Directory of Open Access Journals

PubMed Central

Applications of heuristic search on phylogeny reconstruction problems

Author: Mutluergil Süha Orhun
Publication venue
Publication date: 01/01/2012

Phylogenies, or evolutionary trees, for a given family of species show the evolutionary relationships between these species. The leaves denote the given species, the internal nodes denote their common ancestors, and the edges denote the genetic relationships. Species can be identified by their whole genomes, and the evolutionary relations between species can be measured by the number of rearrangement events (i.e. mutations) that transform one genome into another. One approach to inferring phylogeny from genomic data is to solve median genome problems for three genomes, or the genome rearrangement problem for pairs of genomes, while trying to minimize the total evolutionary distance among the given species. In this thesis, we have developed and implemented two search-based algorithms for the phylogeny reconstruction problem, based on solving median genome problems for circular genomes of the same length without gene duplication. To show the applicability and effectiveness of our algorithms, we tested them on randomly generated instances and two real data sets: mitochondrial genomes of Metazoa and chloroplast genomes of Campanulaceae.

Sabanci University Research Database