    Computing Persistent Homology within Coq/SSReflect

    Persistent homology is one of the most active branches of Computational Algebraic Topology, with applications in several contexts such as optical character recognition and the analysis of point cloud data. In this paper, we report on the formal development of certified programs to compute persistent Betti numbers, an instrumental tool of persistent homology, using the Coq proof assistant together with the SSReflect extension. To this end, it has been necessary to formalize the mathematical theory underlying these algorithms. This is another example showing that interactive theorem provers are now mature enough to tackle the formalization of nontrivial mathematical theories.

    Representability of algebraic topology for biomolecules in machine learning based scoring and virtual screening

    This work introduces a number of algebraic topology approaches, such as multicomponent persistent homology, multi-level persistent homology, and electrostatic persistence, for the representation, characterization, and description of small molecules and biomolecular complexes. Multicomponent persistent homology retains critical chemical and biological information during the topological simplification of biomolecular geometric complexity. Multi-level persistent homology enables a tailored topological description of inter- and/or intra-molecular interactions of interest. Electrostatic persistence incorporates partial charge information into topological invariants. These topological methods are paired with the Wasserstein distance to characterize similarities between molecules and are further integrated with a variety of machine learning algorithms, including k-nearest neighbors, ensembles of trees, and deep convolutional neural networks, to demonstrate their descriptive and predictive power for chemical and biological problems. Extensive numerical experiments involving more than 4,000 protein-ligand complexes from the PDBBind database and nearly 100,000 ligands and decoys in the DUD database are performed to test, respectively, the scoring power and the virtual screening power of the proposed topological approaches. It is demonstrated that the present approaches outperform modern machine learning based methods in protein-ligand binding affinity prediction and ligand-decoy discrimination.

    Computing Multipersistence by Means of Spectral Systems

    In their original setting, both spectral sequences and persistent homology are algebraic topology tools defined from filtrations of objects (e.g. topological spaces or simplicial complexes) indexed over the set Z of integers. Recently, generalizations of both concepts have been proposed that arise from a different choice of the index set of the filtration, producing the new notions of multipersistence and spectral system. In this paper, we show that these notions are related, generalizing results valid in the case of filtrations over Z. Using this relation and some previous programs for computing spectral systems, we have developed a new module for the Kenzo system that computes multipersistence. We also present a new invariant providing information on multifiltrations, as well as applications of our algorithms to spaces of infinite type.


    Given a locally finite X ⊆ ℝ^d and a radius r ≥ 0, the k-fold cover of X and r consists of all points in ℝ^d that have k or more points of X within distance r. We consider two filtrations (one in scale, obtained by fixing k and increasing r, and the other in depth, obtained by fixing r and decreasing k) and we compute the persistence diagrams of both. While standard methods suffice for the filtration in scale, we need novel geometric and topological concepts for the filtration in depth. In particular, we introduce a rhomboid tiling in ℝ^{d+1} whose horizontal integer slices are the order-k Delaunay mosaics of X, and construct a zigzag module from Delaunay mosaics that is isomorphic to the persistence module of the multi-covers.
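
    The definition of the k-fold cover can be made concrete with a short Python sketch. It is only an illustration of the definition, not the method of the paper: it assumes a finite point set X given as coordinate tuples, and the function names (in_k_fold_cover, depth, entry_radius) are hypothetical. It does not compute persistence diagrams, the rhomboid tiling, or Delaunay mosaics.

```python
import math


def in_k_fold_cover(p, X, r, k):
    """Return True if at least k points of X lie within distance r of p."""
    return sum(1 for x in X if math.dist(p, x) <= r) >= k


def depth(p, X, r):
    """Largest k such that p lies in the k-fold cover of X at radius r.

    This is the number of points of X within distance r of p.
    """
    return sum(1 for x in X if math.dist(p, x) <= r)


def entry_radius(p, X, k):
    """Smallest r at which p enters the k-fold cover of X.

    This is the distance from p to its k-th nearest point of X.
    Assumes X is finite and 1 <= k <= len(X).
    """
    if not 1 <= k <= len(X):
        raise ValueError("k must satisfy 1 <= k <= len(X)")
    return sorted(math.dist(p, x) for x in X)[k - 1]


if __name__ == "__main__":
    X = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]  # hypothetical sample points
    p = (0.0, 0.0)
    print(in_k_fold_cover(p, X, r=1.0, k=2))
    print(depth(p, X, r=1.0))
    print(entry_radius(p, X, k=3))
```

    In this sketch, fixing k and letting r grow corresponds to the filtration in scale (a point enters at entry_radius), while fixing r and lowering k corresponds to the filtration in depth (a point belongs to every cover with k at most depth).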