64 research outputs found

    A representation of a compressed de Bruijn graph for pan-genome analysis that enables search

    Recently, Marcus et al. (Bioinformatics 2014) proposed to use a compressed de Bruijn graph to describe the relationship between the genomes of many individuals/strains of the same or closely related species. They devised an $O(n \log g)$ time algorithm called splitMEM that constructs this graph directly (i.e., without using the uncompressed de Bruijn graph) based on a suffix tree, where $n$ is the total length of the genomes and $g$ is the length of the longest genome. In this paper, we present a construction algorithm that outperforms their algorithm in theory and in practice. Moreover, we propose a new space-efficient representation of the compressed de Bruijn graph that adds the possibility to search for a pattern (e.g. an allele, a variant form of a gene) within the pan-genome. Comment: Submitted to Algorithmica special issue of CPM201
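    To make the term concrete, the following is a minimal sketch of what a compressed de Bruijn graph contains. It is not splitMEM and not the construction algorithm from the paper: it deliberately builds the uncompressed graph first and then merges non-branching paths, which is exactly the detour the abstract says both algorithms avoid. The function name `compressed_de_bruijn_nodes` and the toy input are assumptions made for illustration, and the sketch ignores details such as sentinel characters between genomes.

```python
def compressed_de_bruijn_nodes(sequences, k):
    """Return the node labels of a compressed de Bruijn graph (naive sketch).

    Nodes of the uncompressed graph are the distinct k-mers; an edge a -> b
    exists when b follows a with an overlap of k-1 characters in some input.
    Maximal non-branching paths are then merged into single labelled nodes.
    """
    succ, pred, nodes = {}, {}, set()
    for s in sequences:
        for i in range(len(s) - k + 1):
            kmer = s[i:i + k]
            nodes.add(kmer)
            succ.setdefault(kmer, set())
            pred.setdefault(kmer, set())
        for i in range(len(s) - k):
            a, b = s[i:i + k], s[i + 1:i + k + 1]
            succ[a].add(b)
            pred[b].add(a)

    def starts_path(v):
        # v can be merged into its predecessor only if that predecessor is
        # unique and has v as its only successor; otherwise v starts a path.
        if len(pred[v]) != 1:
            return True
        (p,) = pred[v]
        return len(succ[p]) != 1

    def walk(v, visited):
        # Extend the label one character at a time along the unique path.
        label, cur = v, v
        visited.add(v)
        while len(succ[cur]) == 1:
            (nxt,) = succ[cur]
            if nxt in visited or starts_path(nxt):
                break
            label += nxt[-1]
            visited.add(nxt)
            cur = nxt
        return label

    visited = set()
    labels = [walk(v, visited) for v in sorted(nodes) if starts_path(v)]
    # Any k-mers still unvisited lie on isolated cycles without a start node.
    for v in sorted(nodes):
        if v not in visited:
            labels.append(walk(v, visited))
    return sorted(labels)


# Two toy "genomes" that share the prefix ACGT and then diverge.
print(compressed_de_bruijn_nodes(["ACGTT", "ACGTA"], 3))
```

    In this toy case the shared region collapses into one node and each variant ending gets its own node, which is the property that makes the structure useful for comparing closely related genomes.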

    Edge minimization in de Bruijn graphs

    This paper introduces the de Bruijn graph edge minimization problem, which is related to the compression of de Bruijn graphs: find the order-k de Bruijn graph with minimum edge count among all orders. We describe an efficient algorithm that solves this problem. Since the edge minimization problem is connected to the BWT compression technique called "tunneling", the paper also describes a way to minimize the length of a tunneled BWT in such a way that useful properties for sequence analysis are preserved. Although this setting is a restriction, it is significant progress towards a solution to the open problem of finding optimal disjoint blocks that minimize space, as stated in Alanko et al. (DCC 2019). Comment: Accepted for Data Compression Conference 202

    A fast algorithm for the multiple genome rearrangement problem with weighted reversals and transpositions

    Background: Due to recent progress in genome sequencing, more and more data for phylogenetic reconstruction based on rearrangement distances between genomes become available. However, this phylogenetic reconstruction is a very challenging task. For the simplest distance measures (the breakpoint distance and the reversal distance), the problem is NP-hard even if one considers only three genomes.
    Results: In this paper, we present a new heuristic algorithm that directly constructs a phylogenetic tree w.r.t. the weighted reversal and transposition distance. Experimental results on previously published datasets show that constructing phylogenetic trees in this way results in better trees than constructing the trees w.r.t. the reversal distance and then recalculating the weight of the trees with the weighted reversal and transposition distance. An implementation of the algorithm can be obtained from the authors.
    Conclusion: The possibility of creating phylogenetic trees directly w.r.t. the weighted reversal and transposition distance results in biologically more realistic scenarios. Our algorithm can solve today's most challenging biological datasets in a reasonable amount of time.

    Implementing conditional term rewriting by graph rewriting

    For reasons of efficiency, term rewriting is usually implemented by graph rewriting. Barendregt et al. showed that graph rewriting is a sound and complete implementation of (almost) orthogonal term rewriting systems. Their result was strengthened by Kennaway et al. who showed that graph rewriting is adequate for simulating term rewriting. In this paper, we extend the aforementioned results to a class of conditional term rewriting systems which plays a key role in the integration of functional and logic programming. In these systems extra variables are allowed in conditions and right-hand sides of rules. Moreover, it is shown that orthogonal conditional rules give rise to a subcommutative conditional graph rewrite relation. This implies that the graph rewrite relation is level-confluent.

    Modular properties of composable term rewriting systems

    Ohlebusch E. Modular properties of composable term rewriting systems. Forschungsberichte der Technischen Fakultät, Abteilung Informationstechnik / Universität Bielefeld ; 94-01. Bielefeld; 1994