Statistical Analysis and Parameter Selection for Mapper
In this article, we study the question of the statistical convergence of the
1-dimensional Mapper to its continuous analogue, the Reeb graph. We show that
the Mapper is an optimal estimator of the Reeb graph, which gives, as a
byproduct, a method to automatically tune its parameters and compute confidence
regions on its topological features, such as its loops and flares. This makes
it possible to circumvent the issue of testing a large grid of parameters and
keeping the most stable ones in the brute-force setting, which is widely used
in visualization, clustering and feature selection with the Mapper.
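The 1-dimensional Mapper studied above can be sketched in a few lines. This is a minimal illustration with names of our own choosing; the crude fixed-scale single-linkage clustering stands in for whatever clusterer one prefers, and the paper's automatic parameter tuning is not reproduced here:

```python
import numpy as np

def _components(pts, eps):
    """Connected components of the eps-neighborhood graph (single linkage)."""
    n = len(pts)
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(pts[i] - pts[j]) <= eps:
                parent[find(i)] = find(j)
    comps = {}
    for i in range(n):
        comps.setdefault(find(i), []).append(i)
    return list(comps.values())

def mapper_1d(points, filt, n_intervals=8, gain=0.3, eps=0.3):
    """Mapper graph of a point cloud under a scalar filter: cover the
    filter range with overlapping intervals, cluster each preimage,
    and connect clusters that share data points."""
    lo, hi = filt.min(), filt.max()
    step = (hi - lo) / n_intervals
    nodes, edges = [], set()
    for i in range(n_intervals):
        a, b = lo + (i - gain) * step, lo + (i + 1 + gain) * step
        idx = np.flatnonzero((filt >= a) & (filt <= b))
        if idx.size == 0:
            continue
        for comp in _components(points[idx], eps):
            cluster = frozenset(idx[comp])
            for j, other in enumerate(nodes):
                if cluster & other:          # shared points -> Mapper edge
                    edges.add((j, len(nodes)))
            nodes.append(cluster)
    return nodes, edges
```

On a densely sampled circle with the x-coordinate as filter, the resulting graph should contain a loop, which is the kind of topological feature the paper's confidence regions apply to; the parameters `n_intervals`, `gain` and `eps` are exactly the knobs the paper proposes to tune automatically.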
Avoiding the Global Sort: A Faster Contour Tree Algorithm
We revisit the classical problem of computing the \emph{contour tree} of a
scalar field $f : \mathbb{M} \to \mathbb{R}$, where $\mathbb{M}$ is a
triangulated simplicial mesh in $\mathbb{R}^d$. The contour tree is a
fundamental topological structure that tracks the evolution of level sets of
$f$ and has numerous applications in data analysis and visualization.
All existing algorithms begin with a global sort of at least all critical
values of $f$, which can require (roughly) $\Omega(n \log n)$ time. Existing
lower bounds show that there are pathological instances where this sort is
required. We present the first algorithm whose time complexity depends on the
contour tree structure, and avoids the global sort for non-pathological inputs.
If $C$ denotes the set of critical points in $\mathbb{M}$, the running time is
roughly $O\bigl(\sum_{v \in C} \log \ell(v)\bigr)$, where $\ell(v)$ is the depth of $v$ in
the contour tree. This matches all existing upper bounds, but is a significant
improvement when the contour tree is short and fat. Specifically, our approach
ensures that any comparison made is between nodes in the same descending path
in the contour tree, allowing us to argue strong optimality properties of our
algorithm.
Our algorithm requires several novel ideas: partitioning $\mathbb{M}$ into
well-behaved portions, a local growing procedure to iteratively build contour
trees, and the use of heavy path decompositions for the time complexity
analysis.
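For contrast with the approach above, the classical sweep that the paper improves on can be sketched as follows: globally sort all vertices, then grow components with a union-find, creating a join-tree arc whenever components merge. This is illustrative code for the textbook construction, not the paper's algorithm; leaves of the resulting tree are exactly the local maxima:

```python
def join_tree(values, edges):
    """Classical join-tree construction: process vertices from highest to
    lowest (a global sort -- the very step the paper avoids for
    non-pathological inputs) and merge components with a union-find.
    Returns arcs (child, parent) with f(child) > f(parent)."""
    n = len(values)
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    parent = {}                      # union-find over processed vertices
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    low = {}                         # component root -> lowest vertex so far
    arcs = []
    seen = set()
    # ties broken by index, mimicking simulation of simplicity
    for v in sorted(range(n), key=lambda w: (-values[w], w)):
        parent[v] = v
        low[v] = v
        for u in adj[v]:
            if u in seen:
                ru, rv = find(u), find(v)
                if ru != rv:
                    arcs.append((low[ru], v))   # arc into the join at v
                    parent[ru] = rv
        seen.add(v)
        low[find(v)] = v
    return arcs
```

On the 1-D field `[0, 3, 1, 4, 0, 2, 1]` over a path graph, the vertices that never appear as an arc target are the three local maxima, i.e. the leaves of the join tree.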
Computational Complexity of the Interleaving Distance
The interleaving distance is arguably the most prominent distance measure in
topological data analysis. In this paper, we provide bounds on the
computational complexity of determining the interleaving distance in several
settings. We show that the interleaving distance is NP-hard to compute for
persistence modules valued in the category of vector spaces. In the specific
setting of multidimensional persistent homology we show that the problem is at
least as hard as a matrix invertibility problem. Furthermore, this allows us to
conclude that the interleaving distance of interval decomposable modules
depends on the characteristic of the field. Persistence modules valued in the
category of sets are also studied. As a corollary, we obtain that the
isomorphism problem for Reeb graphs is graph isomorphism complete.
Comment: Discussion related to the characteristic of the field added. Paper
accepted to the 34th International Symposium on Computational Geometry.
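In contrast with the hardness results above, the one-parameter case is tractable: by the isometry theorem the interleaving distance agrees with the bottleneck distance, and for two single-interval modules it even has a closed form. A small illustration of ours, not taken from the paper:

```python
def interleaving_intervals(i1, i2):
    """Interleaving distance between interval modules [a, b] and [c, d]
    in one parameter: either epsilon-shift the endpoints of one interval
    onto the other, or pick epsilon large enough that both modules
    become trivial (the 'diagonal' option in the bottleneck matching)."""
    (a, b), (c, d) = i1, i2
    match = max(abs(a - c), abs(b - d))   # shift endpoints onto each other
    kill = max((b - a) / 2, (d - c) / 2)  # both intervals die
    return min(match, kill)
```

For instance, [0, 10] and [1, 9] are 1-interleaved via a shift, while the short, far-apart intervals [0, 2] and [10, 12] are also at distance 1, because both become trivial after half their length.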
The Topology ToolKit
This system paper presents the Topology ToolKit (TTK), a software platform
designed for topological data analysis in scientific visualization. TTK
provides a unified, generic, efficient, and robust implementation of key
algorithms for the topological analysis of scalar data, including: critical
points, integral lines, persistence diagrams, persistence curves, merge trees,
contour trees, Morse-Smale complexes, fiber surfaces, continuous scatterplots,
Jacobi sets, Reeb spaces, and more. TTK is easily accessible to end users due
to a tight integration with ParaView. It is also easily accessible to
developers through a variety of bindings (Python, VTK/C++) for fast prototyping
or through direct, dependence-free, C++, to ease integration into pre-existing
complex systems. While developing TTK, we faced several algorithmic and
software engineering challenges, which we document in this paper. In
particular, we present an algorithm for the construction of a discrete gradient
that complies with the critical points extracted in the piecewise-linear setting.
This algorithm guarantees a combinatorial consistency across the topological
abstractions supported by TTK, and importantly, a unified implementation of
topological data simplification for multi-scale exploration and analysis. We
also present a cached triangulation data structure that supports time-efficient
and generic traversals, which self-adjusts its memory usage on demand
for input simplicial meshes and which implicitly emulates a triangulation for
regular grids with no memory overhead. Finally, we describe an original
software architecture, which guarantees memory efficient and direct accesses to
TTK features, while still offering researchers powerful and easy bindings and
extensions. TTK is open source (BSD license) and its code, online
documentation and video tutorials are available on TTK's website.
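The piecewise-linear critical points mentioned above are defined combinatorially, by the connectivity of a vertex's lower and upper links. Below is a small self-contained sketch of this standard PL criterion for a 2-D scalar grid under a Freudenthal triangulation; it illustrates the definition only and is not TTK's implementation:

```python
# cyclic order of the 6 link neighbors of an interior vertex in a
# Freudenthal-triangulated grid (squares split along the (+1,+1) diagonal)
_CYCLE = [(-1, 0), (-1, -1), (0, -1), (1, 0), (1, 1), (0, 1)]

def _n_link_components(center, members):
    """Count connected runs of `members` along the cyclic link order."""
    present = [(center[0] + di, center[1] + dj) in members for di, dj in _CYCLE]
    if all(present):
        return 1
    return sum(1 for k in range(6) if present[k] and not present[k - 1])

def classify(grid):
    """Map each vertex to (#lower-link components, #upper-link components):
    (0, _) is a minimum, (_, 0) a maximum, (1, 1) regular, anything else a
    saddle. Ties are broken by vertex index (simulation of simplicity)."""
    h, w = len(grid), len(grid[0])
    value = lambda p: (grid[p[0]][p[1]], p)   # lexicographic tie-break
    out = {}
    for i in range(h):
        for j in range(w):
            link = {(i + di, j + dj) for di, dj in _CYCLE
                    if 0 <= i + di < h and 0 <= j + dj < w}
            lower = {p for p in link if value(p) < value((i, j))}
            upper = link - lower
            out[(i, j)] = (_n_link_components((i, j), lower),
                           _n_link_components((i, j), upper))
    return out
```

A discrete gradient that "complies with" these critical points, as TTK's construction guarantees, would place its combinatorial critical cells exactly at the vertices this classification flags.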
Task-based Augmented Reeb Graphs with Dynamic ST-Trees
This paper presents, to the best of our knowledge, the first parallel algorithm for the computation of the augmented Reeb graph of piecewise linear scalar data. Such augmented Reeb graphs have a wide range of applications, including contour seeding and feature-based segmentation. Our approach targets shared-memory multi-core workstations. For this, it completely revisits the optimal, but sequential, Reeb graph algorithm, which is capable of handling data in arbitrary dimension and with optimal time complexity. We take advantage of Fibonacci heaps to exploit the ST-Tree data structure through independent local propagations, while maintaining the optimal, linearithmic time complexity of the sequential reference algorithm. These independent propagations can be expressed using OpenMP tasks, hence benefiting in parallel from the dynamic load balancing of the task runtime while enabling us to increase the parallelism degree thanks to a dual sweep. We present performance results on triangulated surfaces and tetrahedral meshes. We provide comparisons to related work and show that our new algorithm results in superior time performance in practice, both in sequential and in parallel. An open-source C++ implementation is provided for reproducibility.
Contours in Visualization
This thesis studies the visualization of set collections that are either rendered via contours or defined by the relations among contours.
In the first part, dynamic Euler diagrams are used to communicate and semi-manually improve the results of clustering methods that allow clusters to overlap arbitrarily. The contours of the Euler diagram are rendered as implicit surfaces, called blobs in computer graphics. The interaction metaphor is the moving of items into or out of these blobs. The utility of the method is demonstrated on data arising from the analysis of gene expressions. The method works well for small datasets of up to one hundred items and few clusters.
In the second part, these limitations are mitigated by employing a GPU-based rendering of Euler diagrams and by mixing textures and colors to resolve overlapping regions better. The GPU-based approach subdivides the screen into triangles on which it performs a contour interpolation, i.e. a fragment shader determines for each pixel which zones of the Euler diagram it belongs to. The rendering speed is thus increased to allow several hundred items. The method is applied to an example comparing different document clustering results.
The contour tree compactly describes scalar field topology. From the viewpoint of graph drawing, it is a tree with attributes at vertices and optionally on edges. Standard tree drawing algorithms emphasize structural properties of the tree and neglect the attributes. Adapting popular graph drawing approaches to the problem of contour tree drawing, it is found that they are unable to convey this information. Five aesthetic criteria for drawing contour trees are proposed, and a novel algorithm for drawing contour trees in the plane that satisfies four of these criteria is presented. The implementation is fast and effective for contour tree sizes usually used in interactive systems and also produces readable pictures for larger trees.
Dynamical models that explain the formation of spatial structures of RNA molecules have reached a complexity that requires novel visualization methods to analyze these models' validity. The fourth part of the thesis focuses on the visualization of so-called folding landscapes of a growing RNA molecule. Folding landscapes describe the energy of a molecule as a function of its spatial configuration; they are huge and high dimensional. Their most salient features are described by their so-called barrier tree -- a contour tree for discrete observation spaces. The changing folding landscapes of a growing RNA chain are visualized as an animation of the corresponding barrier tree sequence. The animation is created as an adaptation of the foresight layout with tolerance algorithm for dynamic graph layout. The adaptation requires changes to the concept of the supergraph and its layout.
The thesis finishes with some thoughts on how these approaches can be combined and how the task that the application should support can inform the choice of visualization modality.
Jacobi Fiber Surfaces for Bivariate Reeb Space Computation
This paper presents an efficient algorithm for the computation of the Reeb space of an input bivariate piecewise linear scalar function f defined on a tetrahedral mesh. By extending and generalizing algorithmic concepts from the univariate case to the bivariate one, we report the first practical, output-sensitive algorithm for the exact computation of such a Reeb space. The algorithm starts by identifying the Jacobi set of f, the bivariate analog of critical points in the univariate case. Next, the Reeb space is computed by segmenting the input mesh along the new notion of Jacobi Fiber Surfaces, the bivariate analog of critical contours in the univariate case. We additionally present a simplification heuristic that enables the progressive coarsening of the Reeb space. Our algorithm is simple to implement and most of its computations can be trivially parallelized. We report performance numbers demonstrating orders of magnitude speedups over previous approaches, enabling for the first time the tractable computation of bivariate Reeb spaces in practice. Moreover, unlike range-based quantization approaches (such as the Joint Contour Net), our algorithm is parameter-free. We demonstrate the utility of our approach by using the Reeb space as a semi-automatic segmentation tool for bivariate data. In particular, we introduce continuous scatterplot peeling, a technique which enables the reduction of clutter in the continuous scatterplot by interactively selecting the features of the Reeb space to project. We provide a VTK-based C++ implementation of our algorithm that can be used for reproduction purposes or for the development of new Reeb space-based visualization techniques.
Statistical analysis of Mapper for stochastic and multivariate filters
Reeb spaces, as well as their discretized versions called Mappers, are common
descriptors used in Topological Data Analysis, with plenty of applications in
various fields of science, such as computational biology and data
visualization, among others. The stability and quantification of the rate of
convergence of the Mapper to the Reeb space has been studied a lot in recent
works [BBMW19, CO17, CMO18, MW16], focusing on the case where a scalar-valued
filter is used for the computation of Mapper. On the other hand, much less is
known in the multivariate case, when the codomain of the filter is
$\mathbb{R}^d$, and in the general case, when it is a general metric space
$(\mathcal{Z}, d_{\mathcal{Z}})$ instead of $\mathbb{R}^d$. The few results that are available in this
setting [DMW17, MW16] can only handle continuous topological spaces and cannot
be used as is for finite metric spaces representing data, such as point clouds
and distance matrices. In this article, we introduce a slight modification of
the usual Mapper construction and we give risk bounds for estimating the Reeb
space using this estimator. Our approach applies in particular to the setting
where the filter function used to compute Mapper is also estimated from data,
such as the eigenfunctions of PCA. Our results are given with respect to the
Gromov-Hausdorff distance, computed with specific filter-based pseudometrics
for Mappers and Reeb spaces defined in [DMW17]. We finally provide applications
of this setting in statistics and machine learning for different kinds of
target filters, as well as numerical experiments that demonstrate the relevance
of our approach.
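The data-driven multivariate filter setting described above can be sketched as follows. This is a minimal illustration with names and choices of our own: the filter is estimated by PCA, and an overlapping hypercube cover of its codomain is pulled back to the point cloud, which is the first step of a multivariate Mapper construction:

```python
import itertools
import numpy as np

def pca_filter(X, k=2):
    """Data-driven filter: project the point cloud onto its top-k
    principal directions (the estimated-filter setting of the paper)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)  # rows of Vt = axes
    return Xc @ Vt[:k].T

def pullback_cover(F, resolution=5, gain=0.3):
    """Pull back an overlapping hypercube cover of the filter codomain:
    each point joins every cover cell whose enlarged box contains its
    filter value. Returns {cell index tuple: list of point indices}."""
    lo, hi = F.min(axis=0), F.max(axis=0)
    step = (hi - lo) / resolution
    cells = {}
    for i, f in enumerate(F):
        base = np.minimum(np.floor((f - lo) / step).astype(int),
                          resolution - 1)
        # only the cell containing f and its immediate neighbors can
        # contain f in their enlarged boxes (gain < 0.5)
        for shift in itertools.product((-1, 0, 1), repeat=F.shape[1]):
            c = base + np.array(shift)
            if np.any(c < 0) or np.any(c >= resolution):
                continue
            a = lo + (c - gain) * step
            b = lo + (c + 1 + gain) * step
            if np.all(f >= a) and np.all(f <= b):
                cells.setdefault(tuple(c), []).append(i)
    return cells
```

Clustering each of the returned preimages and linking clusters that share points would complete the Mapper; the point of the paper is that risk bounds for estimating the Reeb space still hold when the filter itself, as here, is estimated from the data.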