Visualizing Sparse Filtrations
Over the last few years, there have been several approaches to building sparser complexes that still give good approximations to persistent homology. In this video, we illustrate a geometric perspective on sparse filtrations that leads to simpler proofs, more general theorems, and a more visual explanation. We hope that as these techniques become easier to understand, they will also become easier to use.
A Geometric Perspective on Sparse Filtrations
We present a geometric perspective on sparse filtrations used in topological
data analysis. This new perspective leads to much simpler proofs, while also
being more general, applying equally to Rips and Čech filtrations
for any convex metric. We also give an algorithm for finding the simplices in
such a filtration and prove that vertex removal can be implemented as a
sequence of elementary edge collapses.
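Sparse filtration constructions of this kind are typically driven by a greedy permutation (farthest-point sampling) of the input, whose insertion radii govern when each vertex may be removed. Below is a minimal sketch of that one ingredient; its connection to this paper's specific construction is our assumption, and all names are illustrative:

```python
import numpy as np

def greedy_permutation(points):
    """Farthest-point sampling: returns an ordering of the points and the
    insertion radius of each point, i.e. its distance to the nearest
    earlier point in the ordering. Sparse filtration constructions use
    such radii to schedule vertex removals."""
    n = len(points)
    order = [0]
    radii = [float("inf")]  # the first point is never removed
    # dist[i] = distance from point i to the closest point chosen so far
    dist = np.linalg.norm(points - points[0], axis=1)
    for _ in range(n - 1):
        nxt = int(np.argmax(dist))
        radii.append(float(dist[nxt]))
        order.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return order, radii

rng = np.random.default_rng(0)
pts = rng.random((50, 2))
order, radii = greedy_permutation(pts)
```

The insertion radii are non-increasing after the first (infinite) entry, which is what makes them usable as removal times in a filtration.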
Visual Detection of Structural Changes in Time-Varying Graphs Using Persistent Homology
Topological data analysis is an emerging area in exploratory data analysis
and data mining. Its main tool, persistent homology, has become a popular
technique to study the structure of complex, high-dimensional data. In this
paper, we propose a novel method using persistent homology to quantify
structural changes in time-varying graphs. Specifically, we transform each
instance of the time-varying graph into a metric space, extract topological
features using persistent homology, and compare those features over time. We
provide a visualization that assists in time-varying graph exploration and
helps to identify patterns of behavior within the data. To validate our
approach, we conduct several case studies on real world data sets and show how
our method can find cyclic patterns, deviations from those patterns, and
one-time events in time-varying graphs. We also examine whether a
persistence-based similarity measure, used as a graph metric, satisfies a set of
well-established, desirable properties for graph metrics.
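Restricted to 0-dimensional homology, the pipeline described above can be sketched with a union-find pass over the edges of each snapshot's distance matrix. Everything here is an illustrative sketch rather than the paper's implementation; in particular, the snapshot comparison below is a crude stand-in for a proper persistence-diagram distance such as the bottleneck distance:

```python
import numpy as np
from itertools import combinations

def h0_deaths(dist):
    """0-dimensional persistence of the Rips filtration of a finite metric
    space: every point is born at 0, and components merge at the MST edge
    weights, which are exactly the finite death times (Kruskal/union-find)."""
    n = len(dist)
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    deaths = []
    for w, i, j in sorted((dist[i][j], i, j) for i, j in combinations(range(n), 2)):
        ri, rj = find(i), find(j)
        if ri != rj:          # a merge event: one component dies at weight w
            parent[ri] = rj
            deaths.append(w)
    return deaths             # n - 1 finite deaths; one class lives forever

def snapshot_distance(d1, d2):
    """Crude dissimilarity between two snapshots: L-infinity distance
    between their sorted death vectors (a stand-in for bottleneck distance)."""
    a, b = np.sort(d1), np.sort(d2)
    m = max(len(a), len(b))
    a = np.pad(a, (0, m - len(a)))
    b = np.pad(b, (0, m - len(b)))
    return float(np.max(np.abs(a - b)))

# Two toy snapshots given as distance matrices of points on a line.
pts1 = [0.0, 1.0, 2.0, 10.0]
pts2 = [0.0, 1.0, 2.0, 3.0]
d1 = [[abs(a - b) for b in pts1] for a in pts1]
d2 = [[abs(a - b) for b in pts2] for a in pts2]
deaths1 = h0_deaths(d1)
deaths2 = h0_deaths(d2)
```

Tracking `snapshot_distance` over consecutive snapshots is one simple way to surface the cyclic patterns and one-time events the abstract mentions.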
Reconstructing forest canopy from the 3D triangulations of airborne laser scanning point data for the visualization and planning of forested landscapes
Key message: We present a data-driven technique to visualize forest landscapes and simulate their future development according to alternative management scenarios. Gentle harvesting intensities were preferred for maintaining scenic values in a test eliciting the public's preferences based on the simulated landscapes.
Context: Visualizations of future forest landscapes under alternative management scenarios are useful for eliciting stakeholders' preferences among the alternatives. However, conventional computer visualizations require laborious tree-wise measurements or simulators to generate these observations.
Aims: We describe and evaluate an alternative approach in which the visualization is based on reconstructing the forest canopy from sparse-density, leaf-off airborne laser scanning data.
Methods: Computational geometry was employed to generate filtrations, i.e., ordered sets of simplices belonging to the three-dimensional triangulations of the point data. An appropriate degree of filtering was determined by analyzing the topological persistence of the filtrations. The topology was further utilized to simulate changes to canopy biomass, resembling harvests with varying retention levels. Relative priorities of the recreational and scenic values of the harvests were estimated based on pairwise comparisons and the analytic hierarchy process (AHP).
Results: The canopy elements were co-located with the tree stems measured in the field, and the visualizations derived from the entire landscape appeared reasonably realistic, despite a low numerical correspondence with plot-level forest attributes. The potential and limitations of improving the proposed parameterization are discussed.
Conclusion: Although the criteria used to evaluate the landscape visualization and simulation models were not conclusive, the results suggest that forest scenes may be feasibly reconstructed from data that already cover broad areas and are readily available for practical applications.
Exact Computation of a Manifold Metric, via Lipschitz Embeddings and Shortest Paths on a Graph
Data-sensitive metrics adapt distances locally based on the density of data
points with the goal of aligning distances and some notion of similarity. In
this paper, we give the first exact algorithm for computing a data-sensitive
metric called the nearest neighbor metric. In fact, we prove the surprising
result that a previously published approximation algorithm is exact.
The nearest neighbor metric can be viewed as a special case of a
density-based distance used in machine learning, or it can be seen as an
example of a manifold metric. Previous computational research on such metrics
despaired of computing exact distances on account of the apparent difficulty of
minimizing over all continuous paths between a pair of points. We leverage the
exact computation of the nearest neighbor metric to compute sparse spanners and
persistent homology. We also explore the behavior of the metric built from
point sets drawn from an underlying distribution and consider the more general
case of inputs that are finite collections of path-connected compact sets.
The main results connect several classical theories such as the conformal
change of Riemannian metrics, the theory of positive definite functions of
Schoenberg, and screw function theory of Schoenberg and Von Neumann. We develop
novel proof techniques based on the combination of screw functions and
Lipschitz extensions that may be of independent interest.
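The graph shortest-path flavor of the algorithm can be illustrated with Dijkstra on the complete graph over the sample. The specific edge weight used below, squared Euclidean length (the "edge-squared" metric), is our assumption for illustration and may differ from the paper's exact weighting:

```python
import heapq
import numpy as np

def edge_squared_distance(points, src, dst):
    """Shortest path in the complete graph on `points` with edge weight
    equal to squared Euclidean length (the 'edge-squared' metric).
    Squaring edge lengths rewards paths that hop through dense regions
    of the sample, which is the density-sensitive behavior the abstract
    describes."""
    n = len(points)
    diffs = points[:, None, :] - points[None, :, :]
    w = np.sum(diffs * diffs, axis=-1)   # pairwise squared distances
    dist = [float("inf")] * n
    dist[src] = 0.0
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            return d
        if d > dist[u]:
            continue
        for v in range(n):
            nd = d + w[u][v]
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist[dst]

# Hopping through intermediate samples costs 1 + 1 + 4 = 6,
# beating the direct edge of cost 16.
pts = np.array([[0.0], [1.0], [2.0], [4.0]])
```

Note how, unlike the Euclidean metric, the shortest path here prefers many short hops over one long jump.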
A Sparse Delaunay Filtration
We show how a filtration of Delaunay complexes can be used to approximate the persistence diagram of the distance to a point set in ℝ^d. Whereas the full Delaunay complex can be used to compute this persistence diagram exactly, it may have size O(n^⌈d/2⌉). In contrast, our construction uses only O(n) simplices. The central idea is to connect Delaunay complexes on progressively denser subsamples by considering the flips in an incremental construction as simplices in d+1 dimensions. This approach leads to a very simple and straightforward proof of correctness in geometric terms, because the final filtration is dual to a (d+1)-dimensional Voronoi construction similar to the standard Delaunay filtration. We also show how this complex can be efficiently constructed.
Visualizing Topological Importance: A Class-Driven Approach
This paper presents the first approach to visualize the importance of
topological features that define classes of data. Topological features, with
their ability to abstract the fundamental structure of complex data, are an
integral component of visualization and analysis pipelines. However, not all
topological features present in data are of equal importance; to date, the
default definition of feature importance is often assumed and fixed. This work
shows how proven explainable deep learning approaches can be adapted for use in
topological classification. In doing so, it provides the first technique that
illuminates what topological structures are important in each dataset in
regards to their class label. In particular, the approach uses a learned metric
classifier with a density estimator of the points of a persistence diagram as
input. This metric learns how to reweigh this density such that classification
accuracy is high. By extracting this weight, an importance field on persistent
point density can be created. This provides an intuitive representation of
persistence point importance that can be used to drive new visualizations. This
work provides two examples: Visualization on each diagram directly and, in the
case of sublevel set filtrations on images, directly on the images themselves.
This work highlights real-world examples of this approach visualizing the
important topological features in graph, 3D shape, and medical image data.
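The diagram representation the abstract describes, a density estimate of persistence points reweighted by an importance field, can be sketched as follows. The weights here are hand-set stand-ins for the learned ones, and all names are illustrative:

```python
import numpy as np

def diagram_density(points, grid, sigma=0.1):
    """Gaussian kernel density estimate of persistence points
    (birth, death), evaluated at a set of grid locations -- the kind of
    input representation the abstract describes feeding to a metric
    classifier."""
    d2 = np.sum((grid[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2 * sigma ** 2)).sum(axis=1)

def importance(points, grid, sigma=0.1):
    """Density reweighted by a hypothetical importance field. In the
    paper the weights are learned so that classification accuracy is
    high; here we hand-set them to emphasize high-persistence points
    (death - birth large), purely for illustration."""
    dens = diagram_density(points, grid, sigma)
    weight = grid[:, 1] - grid[:, 0]  # persistence at each grid location
    return dens * np.clip(weight, 0.0, None)

# A toy diagram with one low-persistence and one high-persistence point,
# evaluated at the points themselves.
diag = np.array([[0.1, 0.2], [0.1, 0.9]])
imp = importance(diag, diag.copy())
```

Under this hand-set field, the high-persistence point receives the larger importance value, which is the kind of signal the visualizations in the paper are built from.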
K-Nearest-Neighbors Induced Topological PCA for scRNA Sequence Data Analysis
Single-cell RNA sequencing (scRNA-seq) is widely used to reveal heterogeneity
in cells, which has given us insights into cell-cell communication, cell
differentiation, and differential gene expression. However, analyzing scRNA-seq
data is a challenge due to sparsity and the large number of genes involved.
Therefore, dimensionality reduction and feature selection are important for
removing spurious signals and enhancing downstream analysis. Traditional PCA, a
main workhorse in dimensionality reduction, lacks the ability to capture
geometrical structure information embedded in the data, and previous graph
Laplacian regularizations are limited by the analysis of only a single scale.
We propose a topological Principal Components Analysis (tPCA) method that
combines the persistent Laplacian (PL) technique with L-norm
regularization to address multiscale and multiclass heterogeneity issues in
data. We further introduce a k-Nearest-Neighbor (kNN) persistent Laplacian
technique to improve the robustness of our persistent Laplacian method. The
proposed kNN-PL is a new algebraic topology technique which addresses the many
limitations of traditional persistent homology. Rather than inducing a
filtration by varying a distance threshold, we introduce kNN-tPCA,
in which filtrations are obtained by varying the number of neighbors in a kNN
network at each step, and we find that this framework has significant implications
for hyper-parameter tuning. We validate the efficacy of our proposed tPCA and
kNN-tPCA methods on 11 diverse benchmark scRNA-seq datasets, and showcase that
our methods outperform other unsupervised PCA enhancements from the literature,
as well as the popular Uniform Manifold Approximation and Projection (UMAP),
t-Distributed Stochastic Neighbor Embedding (tSNE), and Non-Negative Matrix
Factorization (NMF) methods by significant margins.
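The kNN-induced filtration idea, varying the number of neighbors k rather than a distance threshold, can be sketched for 0-dimensional information (connected components) as follows. This illustrates only the filtration mechanism, not the persistent Laplacian or the tPCA method itself:

```python
import numpy as np

def knn_component_counts(points):
    """For each k = 1 .. n-1, connect every point to its k nearest
    neighbors (symmetrized) and count connected components via
    union-find. The filtration parameter is k, not a distance
    threshold, as in the kNN filtration idea described above."""
    n = len(points)
    d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    nbrs = np.argsort(d, axis=1)[:, 1:]  # neighbors by distance, excluding self
    counts = []
    for k in range(1, n):
        parent = list(range(n))
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x
        for i in range(n):
            for j in nbrs[i, :k]:
                ri, rj = find(i), find(int(j))
                if ri != rj:
                    parent[ri] = rj
        counts.append(len({find(i) for i in range(n)}))
    return counts

# Two well-separated 1-D clusters: small k sees two components,
# large k connects everything.
pts = np.array([0.0, 0.1, 0.2, 10.0, 10.1, 10.2]).reshape(-1, 1)
counts = knn_component_counts(pts)
```

The k at which the component count drops is scale-free: it depends on neighborhood ranks rather than absolute distances, which is what makes the kNN filtration attractive for heterogeneous data.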