8 research outputs found

    Analysis of local genome rearrangement improves resolution of ancestral genomic maps in plants

    Rubert D, Martinez FHV, Stoye J, Dörr D. Analysis of local genome rearrangement improves resolution of ancestral genomic maps in plants. BMC Genomics. 2020;21(Suppl. 2): 273.
    Background: Computationally inferred ancestral genomes play an important role in many areas of genome research. We present an improved workflow for the reconstruction from highly diverged genomes such as those of plants. Results: Our work relies on an established workflow in the reconstruction of ancestral plants, but improves several steps of this process. Instead of using gene annotations for inferring the genome content of the ancestral sequence, we identify genomic markers through a process called genome segmentation. This enables us to reconstruct the ancestral genome from hundreds of thousands of markers rather than the tens of thousands of annotated genes. We also introduce the concept of local genome rearrangement, through which we refine syntenic blocks before they are used in the reconstruction of contiguous ancestral regions. With the enhanced workflow at hand, we reconstruct the ancestral genome of eudicots, a major sub-clade of flowering plants, using whole genome sequences of five modern plants. Conclusions: Our reconstructed genome is highly detailed, yet its layout agrees well with that reported in Badouin et al. (2017). Using local genome rearrangement, not only the marker-based, but also the gene-based reconstruction of the eudicot ancestor exhibited increased genome content, evidencing the power of this novel concept.

    On the family-free DCJ distance and similarity

    Viduani Martinez FH, Feijão P, Dias Vieira Braga M, Stoye J. On the family-free DCJ distance and similarity. Algorithms for Molecular Biology. 2015;10(1): 13.
    Structural variation in genomes can be revealed by many (dis)similarity measures. Rearrangement operations, such as the so-called double-cut-and-join (DCJ), are large-scale mutations that can create complex changes and produce such variations in genomes. A basic task in comparative genomics is to find the rearrangement distance between two given genomes, i.e., the minimum number of rearrangement operations that transform one given genome into the other. In a family-based setting, genes are grouped into gene families, and efficient algorithms have already been presented to compute the DCJ distance between two given genomes. In this work we propose the problem of computing the DCJ distance of two given genomes without prior gene family assignment, directly using the pairwise similarities between genes. We prove that this new family-free DCJ distance problem is APX-hard and provide an integer linear program for its solution. We also study a family-free DCJ similarity and prove that its computation is NP-hard.
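
As a point of reference for the family-based setting mentioned in this abstract, the sketch below computes the DCJ distance in the simplest case: two genomes made only of circular chromosomes, sharing the same genes, each gene occurring exactly once. In that case the distance is known to equal n - c, where n is the number of genes and c is the number of cycles in the adjacency graph. This is not the family-free method of the paper. The signed-integer encoding and the function names `adjacencies` and `dcj_distance_circular` are assumptions made for illustration.

```python
def adjacencies(chromosomes):
    """Map each gene extremity to its neighbour in a genome of circular chromosomes.

    A chromosome is a list of signed integers (hypothetical encoding):
    +g is read tail-to-head, -g is read head-to-tail.
    """
    partner = {}
    for chrom in chromosomes:
        m = len(chrom)
        for i in range(m):
            g, h = chrom[i], chrom[(i + 1) % m]
            right = (abs(g), "h" if g > 0 else "t")  # extremity where g ends
            left = (abs(h), "t" if h > 0 else "h")   # extremity where h starts
            partner[right] = left
            partner[left] = right
    return partner


def dcj_distance_circular(a, b):
    """DCJ distance n - c for circular genomes with identical, duplicate-free gene content."""
    pa, pb = adjacencies(a), adjacencies(b)
    if pa.keys() != pb.keys():
        raise ValueError("genomes must contain exactly the same genes")
    seen = set()
    cycles = 0
    for start in pa:
        if start in seen:
            continue
        cycles += 1
        x = start
        # Alternate between an adjacency of A and an adjacency of B
        # until the walk closes on its starting extremity.
        while x not in seen:
            seen.add(x)
            y = pa[x]
            seen.add(y)
            x = pb[y]
    n_genes = len(pa) // 2  # each gene contributes two extremities
    return n_genes - cycles


if __name__ == "__main__":
    # One inversion of gene 2 separates these two circular genomes.
    print(dcj_distance_circular([[1, 2, 3]], [[1, -2, 3]]))
```

Linear chromosomes need an extra term for paths in the adjacency graph, and duplicated genes make the problem hard, which is what motivates the approaches described in these abstracts.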

    Approximating the DCJ distance of balanced genomes in linear time

    Rubert D, Feijão P, Dias Vieira Braga M, Stoye J, Martinez FHV. Approximating the DCJ distance of balanced genomes in linear time. Algorithms for Molecular Biology. 2017;12(1): 3.
    Background: Rearrangements are large-scale mutations in genomes, responsible for complex changes and structural variations. Most rearrangements that modify the organization of a genome can be represented by the double cut and join (DCJ) operation. Given two balanced genomes, i.e., two genomes that have exactly the same number of occurrences of each gene in each genome, we are interested in the problem of computing the rearrangement distance between them, i.e., finding the minimum number of DCJ operations that transform one genome into the other. This problem is known to be NP-hard. Results: We propose a linear time approximation algorithm with approximation factor O(k) for the DCJ distance problem, where k is the maximum number of occurrences of any gene in the input genomes. Our algorithm works for linear and circular unichromosomal balanced genomes and uses as an intermediate step an O(k)-approximation for the minimum common string partition problem, which is closely related to the DCJ distance problem. Conclusions: Experiments on simulated data sets show that our approximation algorithm is very competitive both in efficiency and in quality of the solutions.
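
The two quantities this abstract depends on, whether the genomes are balanced and the value of k, are easy to compute. The sketch below is only an illustration of those definitions, not part of the authors' algorithm. It assumes a unichromosomal genome is given as a list of signed integers, and the helper name `balance_info` is hypothetical.

```python
from collections import Counter


def balance_info(a, b):
    """Return (balanced, k) for two unichromosomal genomes.

    Genomes are lists of signed integers (assumed encoding); the sign is
    the gene orientation and is ignored when counting occurrences.
    """
    count_a = Counter(abs(g) for g in a)
    count_b = Counter(abs(g) for g in b)
    # Balanced: every gene occurs the same number of times in both genomes.
    balanced = count_a == count_b
    # k: the maximum number of occurrences of any gene in either genome.
    k = max(list(count_a.values()) + list(count_b.values()), default=0)
    return balanced, k


if __name__ == "__main__":
    print(balance_info([1, -2, 1, 3], [3, 1, 2, -1]))
```

With k = 1 the genomes have no duplicates, and the O(k) factor in the abstract becomes a constant.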

    Computing the family-free DCJ similarity

    Rubert D, Hoshino EA, Dias Vieira Braga M, Stoye J, Martinez FHV. Computing the family-free DCJ similarity. BMC Bioinformatics. 2018;19(Suppl. 6): 152.
    Background: The genomic similarity is a large-scale measure for comparing two given genomes. In this work we study the (NP-hard) problem of computing the genomic similarity under the DCJ model in a setting that does not assume that the genes of the compared genomes are grouped into gene families. This problem is called family-free DCJ similarity. Results: We propose an exact ILP algorithm to solve the family-free DCJ similarity problem, then we show its APX-hardness and present four combinatorial heuristics with computational experiments comparing their results to the ILP. Conclusions: We show that the family-free DCJ similarity can be computed in reasonable time, although for larger genomes it is necessary to resort to heuristics. This provides a basis for further studies on the applicability and model refinement of family-free whole genome similarity measures.

    not available

    In the Steiner problem in graphs, we are given a complete graph with edge costs and a subset of vertices called terminals, and we want to find a minimum-cost tree that connects all terminals. This work addresses restricted versions of this problem. The problems addressed have applications in the construction of phylogenetic trees in biology, local or global routing in VLSI design, transportation, and telecommunications, and they are also useful for establishing the computational complexity of the unrestricted problems. The first problem addressed is the Steiner tree problem with leaf terminals, in which we require the terminals to be leaves of the resulting tree. The second problem is the Steiner tree problem with leaf terminals and edge costs 1 or 2. We present approximation algorithms that improve the previously known approximation ratios for these problems. We also propose a new variant of the problem, in which a permutation of the terminal vertices is also given as input. We now want to find a minimum-cost Steiner tree that respects the given permutation. We say that the tree respects the permutation if, whenever terminals r_1, r_2, r_3, and r_4 appear in this order in the permutation, the paths in the tree between r_1 and r_3 and between r_2 and r_4 have at least one vertex in common. We show that k-restricted trees approximate this problem with the same ratio as they do in general for the Steiner problem in graphs. We also present an algorithm that finds, in polynomial time, an optimal k-restricted tree for this version of the problem.

    On the family-free DCJ distance

    Viduani Martinez FH, Feijão P, Dias Vieira Braga M, Stoye J. On the family-free DCJ distance. In: Brown D, Morgenstern B, eds. Algorithms in Bioinformatics. WABI 2014. Lecture Notes in Bioinformatics. Vol 8701. Berlin; Heidelberg: Springer Verlag; 2014: 174-186.

    A Linear Time Approximation Algorithm for the DCJ Distance for Genomes with Bounded Number of Duplicates

    Rubert D, Feijão P, Dias Vieira Braga M, Stoye J, Martinez FHV. A Linear Time Approximation Algorithm for the DCJ Distance for Genomes with Bounded Number of Duplicates. In: Frith M, Storm Pedersen CN, eds. Algorithms in Bioinformatics. WABI 2016. Lecture Notes in Bioinformatics. Vol 9838. Cham: Springer; 2016: 293-306

    Algorithms for Computing the Family-Free Genomic Similarity under DCJ

    Rubert D, Medeiros GL, Hoshino EA, Dias Vieira Braga M, Stoye J, Martinez FHV. Algorithms for Computing the Family-Free Genomic Similarity under DCJ. In: Meidanis J, Nakhleh L, eds. Comparative Genomics. RECOMB-CG 2017. Lecture Notes in Bioinformatics. Vol 10562. Cham: Springer Verlag; 2017: 76-100