19 research outputs found

    Computational and Theoretical Issues of Multiparameter Persistent Homology for Data Analysis

    The basic goal of topological data analysis is to apply topology-based descriptors to understand and describe the shape of data. In this context, homology is one of the most relevant topological descriptors, well-appreciated for its discrete nature, computability and dimension independence. A further development is provided by persistent homology, which makes it possible to track homological features along a one-parameter increasing sequence of spaces. Multiparameter persistent homology, also called multipersistent homology, is an extension of the theory of persistent homology motivated by the need to analyze data naturally described by several parameters, such as vector-valued functions. Multipersistent homology presents several issues in terms of feasibility of computations over real-sized data and theoretical challenges in the evaluation of possible descriptors. The focus of this thesis is on the interplay between persistent homology theory and discrete Morse theory. Discrete Morse theory provides methods for reducing the computational cost of homology and persistent homology by considering the discrete Morse complex generated by the discrete Morse gradient in place of the original complex. The work of this thesis addresses the problem of computing multipersistent homology, to make such a tool usable in real application domains. This requires both computational optimizations for application to real-world data, and theoretical insights for finding and interpreting suitable descriptors. Our computational contribution consists in proposing a new Morse-inspired and fully discrete preprocessing algorithm. We show the feasibility of our preprocessing over real datasets, and evaluate the impact of the proposed algorithm as a preprocessing for computing multipersistent homology. A theoretical contribution of this thesis consists in proposing a new notion of optimality for such a preprocessing in the multiparameter context. We show that the proposed notion generalizes an already known optimality notion from the one-parameter case. Under this definition, we show that the algorithm we propose as a preprocessing is optimal in low dimensional domains. In the last part of the thesis, we consider preliminary applications of the proposed algorithm in the context of topology-based multivariate visualization by tracking critical features generated by a discrete gradient field compatible with the multiple scalar fields under study. We discuss (dis)similarities of such critical features with the state-of-the-art techniques in topology-based multivariate data visualization.
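
To make the one-parameter case mentioned in the abstract concrete, here is a minimal sketch of 0-dimensional persistent homology for a function sampled along a line. It is not the thesis's preprocessing algorithm; it only illustrates how connected components are tracked along an increasing sequence of sublevel sets. The function name `sublevel_persistence_0d`, the path-graph domain, and the choice to drop zero-persistence pairs are assumptions made for this illustration.

```python
def sublevel_persistence_0d(values):
    """Return sorted (birth, death) pairs for the connected components of the
    sublevel sets of a function sampled on a path graph (vertex i is adjacent
    to i - 1 and i + 1). The oldest component never dies (death = inf)."""
    n = len(values)
    if n == 0:
        return []
    parent = [None] * n  # None means the vertex has not entered the filtration
    birth = {}           # component root -> value at which the component was born

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    # Add vertices in order of increasing function value.
    for i in sorted(range(n), key=lambda k: values[k]):
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the component born later dies at this merge.
                young, old = (ri, rj) if birth[ri] > birth[rj] else (rj, ri)
                if birth[young] < values[i]:  # skip zero-persistence pairs
                    pairs.append((birth[young], values[i]))
                parent[young] = old
    pairs.append((birth[find(0)], float("inf")))
    return sorted(pairs)


diagram = sublevel_persistence_0d([1, 3, 0, 2, 5, 4])
print(diagram)
```

Each pair records the function value at which a component appears and the value at which it merges into an older one. In the multiparameter setting the abstract describes, no such complete list of pairs exists in general, which is part of why suitable descriptors are an open question.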

    Jacobi Fiber Surfaces for Bivariate Reeb Space Computation

    This paper presents an efficient algorithm for the computation of the Reeb space of an input bivariate piecewise linear scalar function f defined on a tetrahedral mesh. By extending and generalizing algorithmic concepts from the univariate case to the bivariate one, we report the first practical, output-sensitive algorithm for the exact computation of such a Reeb space. The algorithm starts by identifying the Jacobi set of f, the bivariate analog of critical points in the univariate case. Next, the Reeb space is computed by segmenting the input mesh along the new notion of Jacobi Fiber Surfaces, the bivariate analog of critical contours in the univariate case. We additionally present a simplification heuristic that enables the progressive coarsening of the Reeb space. Our algorithm is simple to implement and most of its computations can be trivially parallelized. We report performance numbers demonstrating orders of magnitude speedups over previous approaches, enabling for the first time the tractable computation of bivariate Reeb spaces in practice. Moreover, unlike range-based quantization approaches (such as the Joint Contour Net), our algorithm is parameter-free. We demonstrate the utility of our approach by using the Reeb space as a semi-automatic segmentation tool for bivariate data. In particular, we introduce continuous scatterplot peeling, a technique which reduces clutter in the continuous scatterplot by interactively selecting the features of the Reeb space to project. We provide a VTK-based C++ implementation of our algorithm that can be used for reproduction purposes or for the development of new Reeb space based visualization techniques.

    The Topology ToolKit

    This system paper presents the Topology ToolKit (TTK), a software platform designed for topological data analysis in scientific visualization. TTK provides a unified, generic, efficient, and robust implementation of key algorithms for the topological analysis of scalar data, including: critical points, integral lines, persistence diagrams, persistence curves, merge trees, contour trees, Morse-Smale complexes, fiber surfaces, continuous scatterplots, Jacobi sets, Reeb spaces, and more. TTK is easily accessible to end users due to a tight integration with ParaView. It is also easily accessible to developers through a variety of bindings (Python, VTK/C++) for fast prototyping or through direct, dependence-free C++, to ease integration into pre-existing complex systems. While developing TTK, we faced several algorithmic and software engineering challenges, which we document in this paper. In particular, we present an algorithm for the construction of a discrete gradient that complies with the critical points extracted in the piecewise-linear setting. This algorithm guarantees a combinatorial consistency across the topological abstractions supported by TTK, and importantly, a unified implementation of topological data simplification for multi-scale exploration and analysis. We also present a cached triangulation data structure that supports time-efficient and generic traversals, self-adjusts its memory usage on demand for input simplicial meshes, and implicitly emulates a triangulation for regular grids with no memory overhead. Finally, we describe an original software architecture, which guarantees memory-efficient and direct access to TTK features, while still giving researchers powerful and easy bindings and extensions. TTK is open source (BSD license) and its code, online documentation and video tutorials are available on TTK's website.

    A Topological Distance between Multi-fields based on Multi-Dimensional Persistence Diagrams

    The problem of computing topological distance between two scalar fields based on Reeb graphs or contour trees has been studied and applied successfully to various problems in topological shape matching, data analysis, and visualization. However, generalizing such results for computing distance measures between two multi-fields based on their Reeb spaces is still in its infancy. Towards this, in the current paper we propose a technique to compute an effective distance measure between two multi-fields by computing a novel multi-dimensional persistence diagram (MDPD) corresponding to each of the (quantized) Reeb spaces. First, we construct a multi-dimensional Reeb graph (MDRG), which is a hierarchical decomposition of the Reeb space into a collection of Reeb graphs. The MDPD corresponding to each MDRG is then computed based on the persistence diagrams of the component Reeb graphs of the MDRG. Our distance measure extends the Wasserstein distance between two persistence diagrams of Reeb graphs to MDPDs of MDRGs. We prove that the proposed measure is a pseudo-metric and satisfies a stability property. Effectiveness of the proposed distance measure has been demonstrated in (i) shape retrieval contest data (SHREC 2010) and (ii) Pt-CO bond detection data from computational chemistry. Experimental results show that the proposed distance measure based on the Reeb spaces has more discriminating power in clustering the shapes and detecting the formation of a stable Pt-CO bond as compared to the similar measures between Reeb graphs. Comment: Accepted in the IEEE Transactions on Visualization and Computer Graphics
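
For readers unfamiliar with the Wasserstein distance that the abstract says it extends, the following is a minimal sketch of the standard 1-Wasserstein distance between two ordinary persistence diagrams. It is not the paper's MDPD distance. The function name `wasserstein_1`, the L-infinity ground metric, the restriction to finite points, and the brute-force matching (practical only for a handful of points) are all assumptions chosen for this illustration.

```python
from itertools import permutations


def wasserstein_1(diagram_a, diagram_b):
    """1-Wasserstein distance between two small persistence diagrams, given as
    lists of finite (birth, death) pairs. Any point may be matched to the
    diagonal, at a cost equal to its L-infinity distance from the diagonal."""
    n, m = len(diagram_a), len(diagram_b)
    size = n + m  # each side is padded with diagonal slots for the other side

    def to_diagonal(p):
        return (p[1] - p[0]) / 2.0

    def ground(p, q):
        return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

    def cost(i, j):
        # Rows 0..n-1 are points of A, rows n.. are diagonal slots.
        # Columns 0..m-1 are points of B, columns m.. are diagonal slots.
        if i < n and j < m:
            return ground(diagram_a[i], diagram_b[j])
        if i < n:
            return to_diagonal(diagram_a[i])
        if j < m:
            return to_diagonal(diagram_b[j])
        return 0.0  # diagonal matched to diagonal

    # Brute force over all perfect matchings of the padded diagrams.
    return min(
        sum(cost(i, perm[i]) for i in range(size))
        for perm in permutations(range(size))
    )


a = [(0.0, 4.0), (1.0, 2.0)]
b = [(0.0, 3.0)]
print(wasserstein_1(a, b))  # 1.5: (0,4) matches (0,3); (1,2) goes to the diagonal
```

For diagrams of realistic size, the same padded cost matrix would normally be handed to an assignment solver rather than enumerated.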

    Hypersweeps, Convective Clouds and Reeb Spaces

    Isosurfaces are one of the most prominent tools in scientific data visualisation. An isosurface is a surface that defines the boundary of a feature of interest in space for a given threshold. This is integral in analysing data from the physical sciences which observe and simulate three or four dimensional phenomena. However, it is time consuming and impractical to discover surfaces of interest by manually selecting different thresholds. The systematic way to discover significant isosurfaces in data is with a topological data structure called the contour tree. The contour tree encodes the connectivity and shape of each isosurface at all possible thresholds. The first part of this work has been devoted to developing algorithms that use the contour tree to discover significant features in data using high performance computing systems. Those algorithms provided a clear speedup over previous methods and were used to visualise physical plasma simulations. A major limitation of isosurfaces and contour trees is that they are only applicable when a single property is associated with data points. However, scientific data sets often take multiple properties into account. A recent breakthrough generalised isosurfaces to fiber surfaces. Fiber surfaces define the boundary of a feature where the threshold is defined in terms of multiple parameters, instead of just one. In this work we used fiber surfaces together with isosurfaces and the contour tree to create a novel application that helps atmosphere scientists visualise convective cloud formation. Using this application, they were able to, for the first time, visualise the physical properties of certain structures that trigger cloud formation. Contour trees can also be generalised to handle multiple parameters. The natural extension of the contour tree is called the Reeb space and it comes from the pure mathematical field of fiber topology. The Reeb space is not yet fully understood mathematically and algorithms for computing it have significant practical limitations. A key difficulty is that while the contour tree is a traditional one dimensional data structure made up of points and lines between them, the Reeb space is far more complex. The Reeb space is made up of two dimensional sheets, attached to each other in intricate ways. The last part of this work focuses on understanding the structure of Reeb spaces and the rules that are followed when sheets are combined. This theory builds towards developing robust combinatorial algorithms to compute and use Reeb spaces for practical data analysis.
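
The abstract's point that manually picking thresholds is impractical can be illustrated with a small sketch. It is not a contour tree algorithm; it only counts how many connected regions lie at or above a threshold on a 2D grid, which is the kind of connectivity information a contour tree records for all thresholds at once. The function name `count_superlevel_components`, the 4-neighbour connectivity, and the sample grid are assumptions made for this illustration.

```python
from collections import deque


def count_superlevel_components(grid, threshold):
    """Count 4-connected regions of cells whose value is >= threshold."""
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or grid[r][c] < threshold:
                continue
            # Found an unvisited cell above the threshold: flood-fill its region.
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and grid[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


field = [
    [5, 1, 4],
    [1, 1, 1],
    [3, 1, 6],
]
for t in (6, 5, 4, 3, 1):
    print(t, count_superlevel_components(field, t))
```

Re-running the flood fill for every candidate threshold is exactly the repeated work that a contour tree avoids by encoding all births and merges of components in one structure.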

    Multivariate Topology Simplification

    Topological simplification of scalar and vector fields is well-established as an effective method for analysing and visualising complex data sets. For multivariate (alternatively, multi-field) data, topological analysis requires simultaneous advances both mathematically and computationally. We propose a robust multivariate topology simplification method based on “lip”-pruning from the Reeb space. Mathematically, we show that the projection of the Jacobi set of multivariate data into the Reeb space produces a Jacobi structure that separates the Reeb space into simple components. We also show that the dual graph of these components gives rise to a Reeb skeleton that has properties similar to the scalar contour tree and Reeb graph, for topologically simple domains. We then introduce a range measure to give a scaling-invariant total ordering of the components or features that can be used for simplification. Computationally, we show how to compute the Jacobi structure, Reeb skeleton, range and geometric measures in the Joint Contour Net (an approximation of the Reeb space) and that these can be used for visualisation similar to the contour tree or Reeb graph.

    Pattern search for the visualization of scalar, vector, and line fields

    The main topic of this thesis is pattern search in data sets for the purpose of visual data analysis. By giving a reference pattern, pattern search aims to discover similar occurrences in a data set with invariance to translation, rotation and scaling. To address this problem, we developed algorithms dealing with different types of data: scalar fields, vector fields, and line fields. For scalar fields, we use the SIFT algorithm (Scale-Invariant Feature Transform) to find a sparse sampling of prominent features in the data with invariance to translation, rotation, and scaling. Then, the user can define a pattern as a set of SIFT features by e.g. brushing a region of interest. Finally, we locate and rank matching patterns in the entire data set. Due to the sparsity and accuracy of SIFT features, we achieve fast and memory-saving pattern query in large scale scalar fields. For vector fields, we propose a hashing strategy in scale space to accelerate the convolution-based pattern query. We encode the local flow behavior in scale space using a sequence of hierarchical base descriptors, which are pre-computed and hashed into a number of hash tables. This ensures a fast fetching of similar occurrences in the flow and requires only a constant number of table lookups. For line fields, we present a stream line segmentation algorithm to split long stream lines into globally-consistent segments, which provides similar segmentations for similar flow structures. It gives the benefit of isolating a pattern from long and dense stream lines, so that our patterns can be defined sparsely and have a significant extent, i.e., they are integration-based and not local. This allows for a greater flexibility in defining features of interest. For user-defined patterns of curve segments, our algorithm finds similar ones that are invariant to similarity transformations. Additionally, we present a method for shape recovery from multiple views. 
This semi-automatic method fits a template mesh to high-resolution normal data. In contrast to existing 3D reconstruction approaches, we accelerate the data acquisition time by omitting the structured light scanning step of obtaining low frequency 3D information.