
    BOOL-AN: A method for comparative sequence analysis and phylogenetic reconstruction

    A novel discrete mathematical approach is proposed as an additional tool for molecular systematics which does not require prior statistical assumptions concerning the evolutionary process. The method is based on algorithms generating mathematical representations directly from DNA/RNA or protein sequences, followed by the output of numerical (scalar or vector) and visual characteristics (graphs). The binary encoded sequence information is transformed into a compact analytical form, called the Iterative Canonical Form (or ICF) of Boolean functions, which can then be used as a generalized molecular descriptor. The method provides raw vector data for calculating different distance matrices, which in turn can be analyzed by neighbor-joining or UPGMA to derive a phylogenetic tree, or by principal coordinates analysis to get an ordination scattergram. The new method and the associated software for inferring phylogenetic trees are called Boolean analysis, or BOOL-AN.
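The last stage of the pipeline described in this abstract (vector data, then a distance matrix, then UPGMA) can be illustrated with a minimal sketch. It does not reproduce BOOL-AN or the Iterative Canonical Form: the 2-bit nucleotide code, the normalized Hamming distance, and the function names `encode`, `hamming`, `distance_matrix`, and `upgma` are all assumptions chosen for illustration.

```python
from itertools import combinations

# Assumed 2-bit code per nucleotide; NOT the BOOL-AN encoding or its ICF.
CODE = {"A": (0, 0), "C": (0, 1), "G": (1, 0), "T": (1, 1)}


def encode(seq):
    """Turn a DNA string into a flat tuple of bits."""
    return tuple(bit for base in seq.upper() for bit in CODE[base])


def hamming(u, v):
    """Fraction of positions at which two equal-length bit vectors differ."""
    if len(u) != len(v):
        raise ValueError("vectors must have equal length")
    return sum(a != b for a, b in zip(u, v)) / len(u)


def distance_matrix(seqs):
    """Pairwise distances keyed by frozenset({name_a, name_b})."""
    vecs = {name: encode(s) for name, s in seqs.items()}
    return {
        frozenset((a, b)): hamming(vecs[a], vecs[b])
        for a, b in combinations(vecs, 2)
    }


def upgma(names, dist):
    """Average-linkage (UPGMA) clustering; returns a nested-tuple tree."""
    # Each cluster id maps to (subtree, number of leaves).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    d = {
        (i, j): dist[frozenset((names[i], names[j]))]
        for i, j in combinations(range(len(names)), 2)
    }
    next_id = len(names)
    while len(clusters) > 1:
        i, j = min(d, key=d.get)  # closest pair of clusters
        (ti, ni), (tj, nj) = clusters.pop(i), clusters.pop(j)
        merged = {}
        for k in clusters:
            dik = d[(min(i, k), max(i, k))]
            djk = d[(min(j, k), max(j, k))]
            # Size-weighted average of the distances to the merged clusters.
            merged[(k, next_id)] = (ni * dik + nj * djk) / (ni + nj)
        d = {p: v for p, v in d.items() if i not in p and j not in p}
        d.update(merged)
        clusters[next_id] = ((ti, tj), ni + nj)
        next_id += 1
    (tree, _), = clusters.values()
    return tree


seqs = {"s1": "AAAA", "s2": "AAAC", "s3": "TTTT"}  # toy data, not from the paper
tree = upgma(list(seqs), distance_matrix(seqs))
print(tree)
```

On this toy input the two near-identical sequences are merged first and the third is attached last. Any other distance on the vector data could be substituted for `hamming` without changing the clustering step.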

    Computational Complexity of the Interleaving Distance

    The interleaving distance is arguably the most prominent distance measure in topological data analysis. In this paper, we provide bounds on the computational complexity of determining the interleaving distance in several settings. We show that the interleaving distance is NP-hard to compute for persistence modules valued in the category of vector spaces. In the specific setting of multidimensional persistent homology we show that the problem is at least as hard as a matrix invertibility problem. Furthermore, this allows us to conclude that the interleaving distance of interval decomposable modules depends on the characteristic of the field. Persistence modules valued in the category of sets are also studied. As a corollary, we obtain that the isomorphism problem for Reeb graphs is graph isomorphism complete. Comment: Discussion related to the characteristic of the field added. Paper accepted to the 34th International Symposium on Computational Geometry

    Persistent Homology Over Directed Acyclic Graphs

    We define persistent homology groups over any set of spaces which have inclusions defined so that the corresponding directed graph between the spaces is acyclic, as well as along any subgraph of this directed graph. This method simultaneously generalizes standard persistent homology, zigzag persistence and multidimensional persistence to arbitrary directed acyclic graphs, and it also allows the study of more general families of topological spaces or point-cloud data. We give an algorithm to compute the persistent homology groups simultaneously for all subgraphs which contain a single source and a single sink in $O(n^4)$ arithmetic operations, where $n$ is the number of vertices in the graph. We then demonstrate as an application of these tools a method to overlay two distinct filtrations of the same underlying space, which allows us to detect the most significant barcodes using considerably fewer points than standard persistence. Comment: Revised version

    An approximate version of Sidorenko's conjecture

    A beautiful conjecture of Erdős-Simonovits and Sidorenko states that if H is a bipartite graph, then the random graph with edge density p has in expectation asymptotically the minimum number of copies of H over all graphs of the same order and edge density. This conjecture also has an equivalent analytic form and has connections to a broad range of topics, such as matrix theory, Markov chains, graph limits, and quasirandomness. Here we prove the conjecture if H has a vertex complete to the other part, and deduce an approximate version of the conjecture for all H. Furthermore, for a large class of bipartite graphs, we prove a stronger stability result which answers a question of Chung, Graham, and Wilson on quasirandomness for these graphs. Comment: 12 pages
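For orientation, the conjecture is commonly written in terms of homomorphism densities. The notation below is the standard one and is an assumption here, since the abstract does not fix any: $t(H, G)$ denotes the fraction of maps $V(H) \to V(G)$ that send every edge of $H$ to an edge of $G$, and $e(H)$ is the number of edges of $H$.

```latex
% Sidorenko's conjecture for a bipartite graph H and an arbitrary graph G:
\[
  t(H, G) \;\ge\; t(K_2, G)^{e(H)} .
\]
% Equivalent analytic form, for H with parts {u_1, ..., u_a} and {v_1, ..., v_b},
% and any bounded, non-negative, measurable W on [0,1]^2:
\[
  \int_{[0,1]^{a+b}} \prod_{u_i v_j \in E(H)} W(x_i, y_j)
    \, dx_1 \cdots dx_a \, dy_1 \cdots dy_b
  \;\ge\;
  \left( \int_{[0,1]^2} W(x, y) \, dx \, dy \right)^{e(H)} .
\]
```

Here $t(K_2, G)$ plays the role of the edge density $p$ in the abstract, and the right-hand side $p^{e(H)}$ is the asymptotic density of copies of $H$ expected in a random graph with that edge density.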