
    Hotspot identification for Mapper graphs

    The Mapper algorithm can be used to build graph-based representations of high-dimensional data that capture structurally interesting features such as loops, flares or clusters. The graph can be further annotated by colouring its vertices, which helps locate regions of special interest. For instance, in many applications, such as precision medicine, the Mapper graph has been used to identify unknown, compactly localized subareas of the dataset that demonstrate unique or unusual behaviours. This task, so far performed by a researcher, can be automated using hotspot analysis. In this work we propose a new algorithm for detecting hotspots in Mapper graphs, which automates the hotspot detection process. We demonstrate the performance of the algorithm on a number of artificial and real-world datasets. We further demonstrate how our algorithm can be used for the automatic selection of Mapper lens functions.
    Comment: Topological Data Analysis and Beyond Workshop at the 34th Conference on Neural Information Processing Systems (NeurIPS 2020)

    Sheaf-Theoretic Stratification Learning from Geometric and Topological Perspectives

    In this paper, we investigate a sheaf-theoretic interpretation of stratification learning from geometric and topological perspectives. Our main result is the construction of stratification learning algorithms framed in terms of a sheaf on a partially ordered set with the Alexandroff topology. We prove that the resulting decomposition is the unique minimal stratification for which the strata are homogeneous and the given sheaf is constructible. In particular, when we choose to work with the local homology sheaf, our algorithm gives an alternative to the local homology transfer algorithm given in Bendich et al. (2012) and the cohomology stratification algorithm given in Nanda (2017). Additionally, we give examples of stratifications based on the geometric techniques of Breiding et al. (2018), illustrating how the sheaf-theoretic approach can be used to study stratifications from both topological and geometric perspectives. This approach also points toward future applications of sheaf theory in topological data analysis by illustrating the utility of the language of sheaf theory in generalizing existing algorithms.

    A fast approximate skeleton with guarantees for any cloud of points in a Euclidean space

    The tree reconstruction problem is to find an embedded straight-line tree that approximates a given cloud of unorganized points in $\mathbb{R}^m$ up to a certain error. A practical solution to this problem will accelerate the discovery of new colloidal products with desired physical properties such as viscosity. We define the Approximate Skeleton of any finite point cloud $C$ in a Euclidean space with theoretical guarantees. The Approximate Skeleton $\mathrm{ASk}(C)$ always belongs to a given offset of $C$, i.e. the maximum distance from $C$ to $\mathrm{ASk}(C)$ does not exceed a given maximum error. The number of vertices in the Approximate Skeleton is within a factor of 2 of the minimum number in an optimal tree. The new Approximate Skeleton of any unorganized point cloud $C$ is computed in near-linear time in the number of points in $C$. Finally, the Approximate Skeleton outperforms past skeletonization algorithms on the size and accuracy of reconstruction for a large dataset of real micelles and random clouds.

    Mapper on Graphs for Network Visualization

    Networks are an exceedingly popular type of data for representing relationships between individuals, businesses, proteins, brain regions, telecommunication endpoints, etc. Network or graph visualization provides an intuitive way to explore the node-link structures of network data for instant sense-making. However, naive node-link diagrams can fail to convey insights regarding network structures, even for moderately sized data of a few hundred nodes. We propose to apply the mapper construction, a popular tool in topological data analysis, to graph visualization, which provides a strong theoretical basis for summarizing network data while preserving their core structures. We develop a variation of the mapper construction targeting weighted, undirected graphs, called mapper on graphs, which generates property-preserving summaries of graphs. We provide a software tool that enables interactive exploration of such summaries, and we demonstrate the effectiveness of our method on synthetic and real-world data. The mapper on graphs approach we propose represents a new class of techniques that leverages tools from topological data analysis to address challenges in graph visualization.

    Statistical analysis of Mapper for stochastic and multivariate filters

    Reeb spaces, as well as their discretized versions called Mappers, are common descriptors used in Topological Data Analysis, with plenty of applications in various fields of science, such as computational biology and data visualization, among others. The stability of the Mapper and its rate of convergence to the Reeb space have been studied extensively in recent works [BBMW19, CO17, CMO18, MW16], focusing on the case where a scalar-valued filter is used for the computation of Mapper. On the other hand, much less is known in the multivariate case, when the codomain of the filter is $\mathbb{R}^p$, and in the general case, when it is a general metric space $(Z, d_Z)$ instead of $\mathbb{R}$. The few results that are available in this setting [DMW17, MW16] can only handle continuous topological spaces and cannot be used as is for finite metric spaces representing data, such as point clouds and distance matrices. In this article, we introduce a slight modification of the usual Mapper construction and we give risk bounds for estimating the Reeb space using this estimator. Our approach applies in particular to the setting where the filter function used to compute Mapper is also estimated from data, such as the eigenfunctions of PCA. Our results are given with respect to the Gromov-Hausdorff distance, computed with specific filter-based pseudometrics for Mappers and Reeb spaces defined in [DMW17]. We finally provide applications of this setting in statistics and machine learning for different kinds of target filters, as well as numerical experiments that demonstrate the relevance of our approach.