83,078 research outputs found

    An Even Faster and More Unifying Algorithm for Comparing Trees via Unbalanced Bipartite Matchings

    Full text link
    A widely used method for determining the similarity of two labeled trees is to compute a maximum agreement subtree of the two trees. Previous work on this similarity measure is only concerned with the comparison of labeled trees of two special kinds, namely, uniformly labeled trees (i.e., trees with all their nodes labeled by the same symbol) and evolutionary trees (i.e., leaf-labeled trees with distinct symbols for distinct leaves). This paper presents an algorithm for comparing trees that are labeled in an arbitrary manner. In addition to this generality, this algorithm is faster than the previous algorithms. Another contribution of this paper is on maximum weight bipartite matchings. We show how to speed up the best known matching algorithms when the input graphs are node-unbalanced or weight-unbalanced. Based on these enhancements, we obtain an efficient algorithm for a new matching problem called the hierarchical bipartite matching problem, which is at the core of our maximum agreement subtree algorithm.Comment: To appear in Journal of Algorithm

    Cavity Matchings, Label Compressions, and Unrooted Evolutionary Trees

    Get PDF
    We present an algorithm for computing a maximum agreement subtree of two unrooted evolutionary trees. It takes O(n^{1.5} log n) time for trees with unbounded degrees, matching the best known time complexity for the rooted case. Our algorithm allows the input trees to be mixed trees, i.e., trees that may contain directed and undirected edges at the same time. Our algorithm adopts a recursive strategy exploiting a technique called label compression. The backbone of this technique is an algorithm that computes the maximum weight matchings over many subgraphs of a bipartite graph as fast as it takes to compute a single matching

    Forest Density Estimation

    Full text link
    We study graph estimation and density estimation in high dimensions, using a family of density estimators based on forest structured undirected graphical models. For density estimation, we do not assume the true distribution corresponds to a forest; rather, we form kernel density estimates of the bivariate and univariate marginals, and apply Kruskal's algorithm to estimate the optimal forest on held out data. We prove an oracle inequality on the excess risk of the resulting estimator relative to the risk of the best forest. For graph estimation, we consider the problem of estimating forests with restricted tree sizes. We prove that finding a maximum weight spanning forest with restricted tree size is NP-hard, and develop an approximation algorithm for this problem. Viewing the tree size as a complexity parameter, we then select a forest using data splitting, and prove bounds on excess risk and structure selection consistency of the procedure. Experiments with simulated data and microarray data indicate that the methods are a practical alternative to Gaussian graphical models.Comment: Extended version of earlier paper titled "Tree density estimation

    Hierarchical and High-Girth QC LDPC Codes

    Full text link
    We present a general approach to designing capacity-approaching high-girth low-density parity-check (LDPC) codes that are friendly to hardware implementation. Our methodology starts by defining a new class of "hierarchical" quasi-cyclic (HQC) LDPC codes that generalizes the structure of quasi-cyclic (QC) LDPC codes. Whereas the parity check matrices of QC LDPC codes are composed of circulant sub-matrices, those of HQC LDPC codes are composed of a hierarchy of circulant sub-matrices that are in turn constructed from circulant sub-matrices, and so on, through some number of levels. We show how to map any class of codes defined using a protograph into a family of HQC LDPC codes. Next, we present a girth-maximizing algorithm that optimizes the degrees of freedom within the family of codes to yield a high-girth HQC LDPC code. Finally, we discuss how certain characteristics of a code protograph will lead to inevitable short cycles, and show that these short cycles can be eliminated using a "squashing" procedure that results in a high-girth QC LDPC code, although not a hierarchical one. We illustrate our approach with designed examples of girth-10 QC LDPC codes obtained from protographs of one-sided spatially-coupled codes.Comment: Submitted to IEEE Transactions on Information THeor

    Spanning Trees with Many Leaves in Graphs without Diamonds and Blossoms

    Full text link
    It is known that graphs on n vertices with minimum degree at least 3 have spanning trees with at least n/4+2 leaves and that this can be improved to (n+4)/3 for cubic graphs without the diamond K_4-e as a subgraph. We generalize the second result by proving that every graph with minimum degree at least 3, without diamonds and certain subgraphs called blossoms, has a spanning tree with at least (n+4)/3 leaves, and generalize this further by allowing vertices of lower degree. We show that it is necessary to exclude blossoms in order to obtain a bound of the form n/3+c. We use the new bound to obtain a simple FPT algorithm, which decides in O(m)+O^*(6.75^k) time whether a graph of size m has a spanning tree with at least k leaves. This improves the best known time complexity for MAX LEAF SPANNING TREE.Comment: 25 pages, 27 Figure
    • …
    corecore