
    Sobre modelos de rearranjo de genomas (On genome rearrangement models)

    Advisor: João Meidanis. Doctoral thesis, Universidade Estadual de Campinas, Instituto de Computação.
    Abstract: Genome rearrangements are events where large blocks of DNA exchange places during evolution. With the growing availability of whole-genome data, the analysis of these events can be an important tool for understanding evolutionary genomics. Several mathematical models of genome rearrangement have been proposed in the last 20 years. In this thesis, we propose two new rearrangement models. The first was introduced as an alternative definition of the breakpoint distance. The breakpoint distance is one of the simplest genome comparison measures, but there is still no consensus on how to define it for multichromosomal genomes. Pevzner and Tesler gave a definition in a 2003 paper, and Tannier et al. defined it differently in 2008. In this thesis we provide yet another alternative, called single-cut-or-join (SCJ). We show that several classic genome rearrangement problems, such as genome median, genome halving and small parsimony, become easy under SCJ, and we provide polynomial-time algorithms for them. The second model we introduce is the Adjacency Algebraic Theory, an extension of the Algebraic Formalism proposed by Meidanis and Dias that allows linear chromosomes to be modeled. This was the main limitation of the original formalism, which could deal with circular chromosomes only. We believe that the algebraic formalism is an interesting alternative for solving rearrangement problems, with a different perspective that could complement the more commonly used combinatorial graph-theoretic approach. We present polynomial-time algorithms to compute the algebraic distance and to find rearrangement scenarios between two genomes. We also show how to compute the algebraic distance from the adjacency graph, for easier comparison with other rearrangement distances. Finally, we show how all classic rearrangement operations can be modeled using the algebraic theory.
    Degree: Doctorate in Computer Science.
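    To give a feel for why the SCJ distance mentioned in this abstract is easy to compute, here is a minimal Python sketch. It assumes the usual formulation in which a genome is a set of adjacencies between gene extremities and, for two genomes over the same genes, the SCJ distance is the number of adjacencies found in exactly one of them. The genome encoding and the function names (`extremities`, `adjacencies`, `scj_distance`) are illustrative choices, not taken from the thesis.

```python
def extremities(gene):
    """Return the two extremities of a signed gene in reading order."""
    g = abs(gene)
    # A forward gene is read tail -> head, a reversed gene head -> tail.
    return ((g, "t"), (g, "h")) if gene > 0 else ((g, "h"), (g, "t"))


def adjacencies(genome):
    """Collect adjacencies of a genome.

    Assumed encoding: a genome is a list of (genes, is_circular) pairs,
    where genes is a list of signed integers.
    """
    adj = set()
    for genes, circular in genome:
        ends = [extremities(g) for g in genes]
        # Join the right end of each gene to the left end of the next one.
        for (_, right), (left, _) in zip(ends, ends[1:]):
            adj.add(frozenset((right, left)))
        if circular and genes:
            adj.add(frozenset((ends[-1][1], ends[0][0])))
    return adj


def scj_distance(a, b):
    """Count adjacencies to cut in a plus adjacencies to join to reach b."""
    adj_a, adj_b = adjacencies(a), adjacencies(b)
    return len(adj_a - adj_b) + len(adj_b - adj_a)


linear = [([1, 2, 3], False)]
inverted_gene = [([1, -2, 3], False)]
split = [([1, 2], False), ([3], False)]
```

    In this sketch, inverting gene 2 destroys two adjacencies and creates two new ones, while splitting the chromosome only requires one cut.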

    On Weighting Schemes for Gene Order Analysis

    Gene order analysis aims at extracting phylogenetic information from the comparison of the order and orientation of the genes on the genomes of different species. This can be achieved by computing parsimonious rearrangement scenarios, that is, by determining a sequence of rearrangement events that transforms one given gene order into another such that the sum of the weights of the included rearrangement events is minimal. In this sequence only certain types of rearrangements, given by the rearrangement model, are admissible, and weights are assigned according to the rearrangement type. The choice of a suitable rearrangement model and of corresponding weights for the included rearrangement types is important for a meaningful reconstruction. So far, weighting schemes for gene order analysis have not been studied sufficiently. In this paper, weighting schemes for gene order analysis are considered for two rearrangement models: 1) inversions, transpositions, and inverse transpositions; 2) inversions, block interchanges, and inverse transpositions. For both rearrangement models we determine properties of the weighting functions that exclude certain types of rearrangements from parsimonious rearrangement scenarios.
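    The following Python sketch illustrates, under simple assumptions, how a weighting function can exclude a rearrangement type from parsimonious scenarios. It relies on the elementary fact that a transposition of two adjacent blocks can be simulated by three inversions, so a transposition weight above three times the inversion weight makes any scenario containing a transposition more expensive than an inversion-only one. The weights and the helper names (`inversion`, `transposition`, `scenario_weight`) are hypothetical and are not the paper's own conditions.

```python
def inversion(perm, i, j):
    """Reverse the segment perm[i:j] and flip the sign of each gene."""
    return perm[:i] + [-g for g in reversed(perm[i:j])] + perm[j:]


def transposition(perm, i, j, k):
    """Exchange the adjacent blocks perm[i:j] and perm[j:k]."""
    return perm[:i] + perm[j:k] + perm[i:j] + perm[k:]


def scenario_weight(events, weights):
    """Sum the weights of the event types used in a scenario."""
    return sum(weights[event] for event in events)


perm = [1, 2, 3, 4, 5]

# One transposition moves the block [2, 3] behind the block [4, 5].
by_transposition = transposition(perm, 1, 3, 5)

# The same result with three inversions: flip both blocks together,
# then flip each block back on its own.
step = inversion(perm, 1, 5)
step = inversion(step, 1, 3)
by_inversions = inversion(step, 3, 5)

# Hypothetical weights: a transposition costs more than three inversions.
weights = {"inversion": 1, "transposition": 4}
cost_transposition = scenario_weight(["transposition"], weights)
cost_inversions = scenario_weight(["inversion"] * 3, weights)
```

    With these assumed weights the single transposition costs 4 and the three inversions cost 3, so the transposition would never appear in a minimum-weight scenario.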

    Rearranjo de genomas: uma coletânea de artigos (Genome rearrangement: a collection of articles)

    Advisor: João Meidanis. Doctoral thesis, Universidade Estadual de Campinas, Instituto de Computação.
    Abstract: Nowadays, a huge amount of genetic information is publicly available. The current challenge of genomics is to process this information in order to obtain relevant biological conclusions. One possible way of structuring this information is through genome comparison, where we seek similarities and differences among the genomes of two or more organisms. In this context, the area of Genome Rearrangements has received considerable attention lately. One way of comparing genomes is given by the rearrangement distance, which is determined by the minimum number of rearrangement events that explain the differences between two genomes. The main studies in rearrangement distance involve reversal and transposition events. The present collection is composed of eight articles, containing several important results on Genome Rearrangements. These papers were presented at six conferences, one with Brazilian scope and five with international scope. Two of these works will be published in important international journals, and one other work appeared as a book chapter. Our main contributions can be divided into two groups: a new algebraic formalism and a series of results involving the transposition event. The new algebraic theory relates genome rearrangement theory to the theory of permutation groups. Our intention was to establish an algebraic formalism that simplifies the derivation of new results, which up to now have relied heavily on the construction of diagrams. We studied the transposition event in several ways. Besides presenting results on the transposition distance between a permutation and its inverse, we also studied the rearrangement problem involving transpositions and reversals simultaneously, constructing approximation algorithms and proposing a conjecture on the diameter. We used the algebraic formalism to show that it is possible to determine the distance of fusion, fission, and transposition in polynomial time. This is the first known polynomial-time result for a rearrangement problem involving the transposition event. Finally, we introduced two new rearrangement problems: the syntenic distance problem involving fissions and fusions, and the prefix transposition problem. For each of these problems we present significant results that advance knowledge in this area.
    Degree: Doctorate in Computer Science.

    Phylogenetic reconstruction from transpositions

    Background: Because of the advent of high-throughput sequencing and the consequent reduction in the cost of sequencing, many organisms have been completely sequenced and most of their genes identified. It has thus become possible to represent whole genomes as ordered lists of gene identifiers and to study the rearrangement of these entities through computational means. As a result, genome rearrangement data has attracted increasing attention from both biologists and computer scientists as a new type of data for phylogenetic analysis. The main events of genome rearrangements include inversions, transpositions and transversions. To date, GRAPPA and MGR are the most accurate methods for rearrangement phylogeny, both assuming inversion as the only event. However, due to the complexity of computing transposition distance, it is very difficult to analyze datasets when transpositions are dominant.
    Results: We extend GRAPPA to handle transpositions. The new method is named GRAPPA-TP, with two major extensions: a heuristic method to estimate transposition distance, and a new transposition median solver for three genomes. Although GRAPPA-TP uses a greedy approach to compute the transposition distance, it is very accurate when genomes are relatively close. The new GRAPPA-TP is available from http://phylo.cse.sc.edu/
    Conclusion: Our extensive testing using simulated datasets shows that GRAPPA-TP is very accurate in terms of ancestor genome inference and phylogenetic reconstruction. Simulation results also suggest that model match is critical in genome rearrangement analysis: it is not accurate to simulate transpositions with other events including inversions.

    Sorting permutations by cut-circularize-linearize-and-paste operations

    Abstract
    Background: Genome rearrangements are studied on the basis of genome-wide analysis of gene orders and are important in the evolution of species. In the last two decades, a variety of rearrangement operations, such as reversals, transpositions, block-interchanges, translocations, fusions and fissions, have been proposed to evaluate the differences between gene orders in two or more genomes. Usually, the computational studies of genome rearrangements are formulated as problems of sorting permutations by rearrangement operations.
    Result: In this article, we study a sorting problem by cut-circularize-linearize-and-paste (CCLP) operations, which aims to find a minimum number of CCLP operations to sort a signed permutation representing a chromosome. The CCLP is a genome rearrangement operation that cuts a segment out of a chromosome, circularizes the segment into a temporary circle, linearizes the temporary circle as a linear segment, and possibly inverts the linearized segment and pastes it into the remaining chromosome. The CCLP operation can model many well-known rearrangements, such as reversals, transpositions and block-interchanges, and others not reported in the biological literature. In addition, it actually occurs in the immune response of higher animals. To distinguish the remaining CCLP operations from the reversal, we call them non-reversal CCLP operations. In this study, we use permutation groups in algebra to design an O(δn) time algorithm for solving the weighted sorting problem by CCLP operations when the weight ratio between reversals and non-reversal CCLP operations is 1:2, where n is the number of genes in the given chromosome and δ is the number of needed CCLP operations.
    Conclusion: The algorithm we propose in this study is simple enough to be implemented easily with 1-dimensional arrays, and it is useful in studies of phylogenetic tree reconstruction and of the human immune response to tumors.
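    The CCLP operation described in this abstract can be sketched in a few lines of Python. This is only one reading of the verbal description, not the authors' algorithm: the chromosome is a list of signed integers, "circularize then linearize" is modeled as a rotation of the cut segment, and the function name `cclp` and its parameters are hypothetical.

```python
def cclp(chrom, i, j, cut, invert, pos):
    """Apply one cut-circularize-linearize-and-paste step (assumed model).

    chrom  : list of signed genes
    i, j   : the segment chrom[i:j] is cut out
    cut    : where the temporary circle is reopened (a rotation offset)
    invert : whether the linearized segment is reversed with signs flipped
    pos    : index in the remaining chromosome where the segment is pasted
    """
    segment = chrom[i:j]
    rest = chrom[:i] + chrom[j:]
    # Closing the segment into a circle and reopening it elsewhere
    # amounts to rotating the segment.
    linear = segment[cut:] + segment[:cut]
    if invert:
        linear = [-g for g in reversed(linear)]
    return rest[:pos] + linear + rest[pos:]


chrom = [1, 2, 3, 4, 5]

# Reversal: invert the segment and paste it back where it was cut.
as_reversal = cclp(chrom, 1, 4, 0, True, 1)

# Transposition: move the segment [2, 3] behind gene 4.
as_transposition = cclp(chrom, 1, 3, 0, False, 2)

# Block-interchange of genes 2 and 4: rotate [2, 3] into [3, 2],
# then paste it behind gene 4.
as_block_interchange = cclp(chrom, 1, 3, 1, False, 2)
```

    Under this reading, the three classic operations named in the abstract appear as special cases of a single parameterized step.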

    Algorithmic approaches for genome rearrangement: a review


    Are There Rearrangement Hotspots in the Human Genome?

    In a landmark paper, Nadeau and Taylor [18] formulated the random breakage model (RBM) of chromosome evolution, which postulates that there are no rearrangement hotspots in the human genome. In the next two decades, numerous studies with progressively increasing levels of resolution made RBM the de facto theory of chromosome evolution. Although RBM had prophetic predictive power, it was recently refuted by Pevzner and Tesler [4], who introduced the fragile breakage model (FBM), postulating that the human genome is a mosaic of solid regions (with low propensity for rearrangements) and fragile regions (rearrangement hotspots). However, the rebuttal of RBM caused a controversy and led to a split among researchers studying genome evolution. In particular, it remains unclear whether some complex rearrangements (e.g., transpositions) can create an appearance of rearrangement hotspots. We contribute to the ongoing debate by analyzing multi-break rearrangements that break a genome into multiple fragments and then glue them together in a new order. In particular, we demonstrate that (1) even if transpositions were a dominant force in mammalian evolution, the arguments in favor of FBM still stand, and (2) the "gene deletion" argument against FBM is flawed.