    Principal Geodesic Analysis of Merge Trees (and Persistence Diagrams)

    This paper presents a computational framework for the Principal Geodesic Analysis of merge trees (MT-PGA), a novel adaptation of the celebrated Principal Component Analysis (PCA) framework [87] to the Wasserstein metric space of merge trees [92]. We formulate MT-PGA computation as a constrained optimization problem that adjusts a basis of orthogonal geodesic axes while minimizing a fitting energy. We introduce an efficient, iterative algorithm which exploits shared-memory parallelism, as well as an analytic expression of the fitting energy gradient, to ensure fast iterations. Our approach also trivially extends to extremum persistence diagrams. Extensive experiments on public ensembles demonstrate the efficiency of our approach, with MT-PGA computations on the order of minutes for the largest examples. We show the utility of our contributions by extending two typical PCA applications to merge trees. First, we apply MT-PGA to data reduction and reliably compress merge trees by concisely representing them by their first coordinates in the MT-PGA basis. Second, we present a dimensionality reduction framework exploiting the first two directions of the MT-PGA basis to generate two-dimensional layouts of the ensemble. We augment these layouts with persistence correlation views, enabling global and local visual inspections of the feature variability in the ensemble. In both applications, quantitative experiments assess the relevance of our framework. Finally, we provide a lightweight C++ implementation that can be used to reproduce our results.
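The abstract above works in a Wasserstein metric space and notes that the approach extends to persistence diagrams. As a rough illustration of that metric only, here is a minimal sketch of the 2-Wasserstein distance between two tiny persistence diagrams. It is not the paper's implementation. The function names, the Euclidean ground distance, the exponent 2, and the brute-force search over matchings are all assumptions chosen for readability, and the brute-force search is only practical for a handful of points.

```python
from itertools import permutations
from math import sqrt


def diagonal_projection(point):
    """Closest point on the diagonal birth == death (hypothetical helper)."""
    birth, death = point
    mid = (birth + death) / 2.0
    return (mid, mid)


def squared_distance(p, q):
    """Squared Euclidean distance in the birth/death plane (assumed ground metric)."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def wasserstein2(diagram_a, diagram_b):
    """Brute-force 2-Wasserstein distance between two small persistence diagrams.

    Each diagram is a list of (birth, death) pairs. Both diagrams are augmented
    with the diagonal projections of the other's points, so that any point may
    be matched to the diagonal instead of to an off-diagonal point.
    """
    na, nb = len(diagram_a), len(diagram_b)
    aug_a = list(diagram_a) + [diagonal_projection(q) for q in diagram_b]
    aug_b = list(diagram_b) + [diagonal_projection(p) for p in diagram_a]
    n = na + nb
    if n == 0:
        return 0.0

    def cost(i, j):
        # Matching one diagonal point to another is free.
        if i >= na and j >= nb:
            return 0.0
        return squared_distance(aug_a[i], aug_b[j])

    # Exhaustive search over all matchings: fine for a sketch, O(n!) in general.
    best = min(
        sum(cost(i, perm[i]) for i in range(n))
        for perm in permutations(range(n))
    )
    return sqrt(best)
```

For realistic diagram sizes, the exhaustive search would be replaced by an assignment solver; the augmentation with diagonal projections stays the same.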

    Wasserstein Auto-Encoders of Merge Trees (and Persistence Diagrams)

    This paper presents a computational framework for the Wasserstein auto-encoding of merge trees (MT-WAE), a novel extension of the classical auto-encoder neural network architecture to the Wasserstein metric space of merge trees. In contrast to traditional auto-encoders, which operate on vectorized data, our formulation explicitly manipulates merge trees on their associated metric space at each layer of the network, resulting in superior accuracy and interpretability. Our novel neural network approach can be interpreted as a non-linear generalization of previous linear attempts [79] at merge tree encoding. It also trivially extends to persistence diagrams. Extensive experiments on public ensembles demonstrate the efficiency of our algorithms, with MT-WAE computations on the order of minutes on average. We show the utility of our contributions in two applications adapted from previous work on merge tree encoding [79]. First, we apply MT-WAE to merge tree compression, by concisely representing merge trees with their coordinates in the final layer of our auto-encoder. Second, we document an application to dimensionality reduction, by exploiting the latent space of our auto-encoder, for the visual analysis of ensemble data. We illustrate the versatility of our framework by introducing two penalty terms, which help preserve in the latent space both the Wasserstein distances between merge trees and their clusters. In both applications, quantitative experiments assess the relevance of our framework. Finally, we provide a C++ implementation that can be used for reproducibility. Comment: arXiv admin note: text overlap with arXiv:2207.1096

    Comparing Morse Complexes Using Optimal Transport: An Experimental Study

    Morse complexes and Morse-Smale complexes are topological descriptors popular in topology-based visualization. Comparing these complexes plays an important role in their applications in feature correspondences, feature tracking, symmetry detection, and uncertainty visualization. Leveraging recent advances in optimal transport, we apply a class of optimal transport distances to the comparative analysis of Morse complexes. Contrasting with existing comparative measures, such distances are easy and efficient to compute, and naturally provide structural matching between Morse complexes. We perform an experimental study involving scientific simulation datasets and discuss the effectiveness of these distances as comparative measures for Morse complexes. We also provide an initial guideline for choosing the optimal transport distances under various data assumptions. Comment: IEEE Visualization Conference (IEEE VIS) Short Paper, accepted, 2023; supplementary materials: http://www.sci.utah.edu/~beiwang/publications/GWMC_VIS_Short_BeiWang_2023_Supplement.pd

    A Comparative Study of the Perceptual Sensitivity of Topological Visualizations to Feature Variations

    Color maps are a commonly used visualization technique in which data are mapped to optical properties, e.g., color or opacity. Color maps, however, do not explicitly convey structures (e.g., positions and scale of features) within data. Topology-based visualizations reveal and explicitly communicate structures underlying data. Although we have a good understanding of what types of features are captured by topological visualizations, our understanding of how people perceive those features is limited. This paper evaluates the sensitivity of topology-based isocontour, Reeb graph, and persistence diagram visualizations compared to a reference color map visualization for synthetically generated scalar fields on 2-manifold triangular meshes embedded in 3D. In particular, we built and ran a human-subject study that evaluated the perception of data features characterized by Gaussian signals and measured how effectively each visualization technique portrays variations of data features arising from the position and amplitude variation of a mixture of Gaussians. For positional feature variations, the results showed that only the Reeb graph visualization had high sensitivity. For amplitude feature variations, persistence diagrams and color maps demonstrated the highest sensitivity, whereas isocontours showed only weak sensitivity. These results take an important step toward understanding which topology-based tools are best for various data and task scenarios and their effectiveness in conveying topological variations as compared to conventional color mapping.

    Feature-Based Uncertainty Visualization

    While uncertainty in scientific data attracts increasing research interest in the visualization community, two critical issues remain insufficiently studied: (1) visualizing the impact of the uncertainty of a data set on its features and (2) interactively exploring 3D or large 2D data sets with uncertainties. In this study, a suite of feature-based techniques is developed to address these issues. First, a framework of feature-level uncertainty visualization is presented to study the uncertainty of the features in scalar and vector data. The uncertainty in the number and locations of features, such as sinks or sources of vector fields, is referred to as feature-level uncertainty, while the uncertainty in the numerical values of the data is referred to as data-level uncertainty. The features of different ensemble members are identified and correlated. The feature-level uncertainties are expressed as the transitions between corresponding features through new elliptical glyphs. Second, an interactive visualization tool is developed for exploring scalar data with data-level uncertainty and two types of feature-level uncertainty (contour-level and topology-level). To avoid visual cluttering and occlusion, the uncertainty information is attached to a contour tree instead of being integrated with the visualization of the data. An efficient contour tree-based interface is designed to reduce users' workload in viewing and analyzing complicated data with uncertainties and to facilitate a quick and accurate selection of prominent contours. This thesis advances current uncertainty studies with an in-depth investigation of feature-level uncertainties and an exploration of topology tools for effective and interactive uncertainty visualizations. With quantified representation and interactive capability, feature-based visualization helps people gain new insights into the uncertainties of their data, especially the uncertainties of extracted features, which would otherwise remain unknown when only data-level uncertainties are visualized.

    Comparative Uncertainty Visualization for High-Level Analysis of Scalar- and Vector-Valued Ensembles

    With this thesis, I contribute to the research field of uncertainty visualization, considering parameter dependencies in multi-valued fields and the uncertainty of automated data analysis. Like uncertainty visualization in general, both of these fields are becoming more and more important due to increasing computational power, the growing importance and availability of complex models and collected data, and progress in artificial intelligence. I contribute in the following application areas: Uncertain Topology of Scalar Field Ensembles. The generalization of topology-based visualizations to multi-valued data involves many challenges. An example is the comparative visualization of multiple contour trees, complicated by the random nature of prevalent contour tree layout algorithms. I present a novel approach for the comparative visualization of contour trees: the Fuzzy Contour Tree. Uncertain Topological Features in Time-Dependent Scalar Fields. Tracking features in time-dependent scalar fields is an active field of research, where most approaches rely on the comparison of consecutive time steps. I created a more holistic visualization for time-varying scalar field topology by adapting Fuzzy Contour Trees to the time-dependent setting. Uncertain Trajectories in Vector Field Ensembles. Visitation maps are an intuitive and well-known visualization of uncertain trajectories in vector field ensembles. For large ensembles, visitation maps are either not applicable or require extensive computation time. I developed Visitation Graphs, a new representation and data reduction method for vector field ensembles that can be calculated in situ and is an optimal basis for the efficient generation of visitation maps. This is accomplished by shifting computation time to the pre-processing stage. Visually Supported Anomaly Detection in Cyber Security. Numerous cyber attacks and the increasing complexity of networks and their protection necessitate the application of automated data analysis in cyber security. Due to uncertainty in automated anomaly detection, the results need to be communicated to analysts to ensure appropriate reactions. I introduce a visualization system combining device readings and anomaly detection results: the Security in Process System. To further support analysts, I developed an application-agnostic framework that supports the integration of knowledge assistance and applied it to the Security in Process System. I present this Knowledge Rocks Framework, its application, and the results of evaluations for both the original and the knowledge-assisted Security in Process System. For all presented systems, I provide implementation details, illustrations, and applications.
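The abstract above refers to visitation maps for uncertain trajectories. As a rough illustration of that idea, the sketch below computes, for each cell of a regular 2D grid, the fraction of trajectories that visit the cell. It is not the thesis's Visitation Graph method. The function name, the grid parameters, and the choice to count only sampled trajectory points (rather than the segments between them) are assumptions made to keep the example short.

```python
def visitation_map(trajectories, nx, ny, xmin=0.0, xmax=1.0, ymin=0.0, ymax=1.0):
    """Fraction of trajectories that visit each cell of an nx-by-ny grid.

    `trajectories` is a list of trajectories, each a list of (x, y) samples.
    The result is a list of ny rows with nx values in [0, 1]; result[j][i]
    refers to column i (x direction) and row j (y direction).
    Assumption: only the sampled points are binned, so a trajectory that
    crosses a cell between two samples is not counted for that cell.
    """
    counts = [[0] * nx for _ in range(ny)]
    if not trajectories:
        return [[0.0] * nx for _ in range(ny)]

    for trajectory in trajectories:
        visited = set()  # each trajectory counts at most once per cell
        for x, y in trajectory:
            if not (xmin <= x <= xmax and ymin <= y <= ymax):
                continue  # ignore samples outside the domain
            i = min(int((x - xmin) / (xmax - xmin) * nx), nx - 1)
            j = min(int((y - ymin) / (ymax - ymin) * ny), ny - 1)
            visited.add((i, j))
        for i, j in visited:
            counts[j][i] += 1

    total = len(trajectories)
    return [[count / total for count in row] for row in counts]
```

Every trajectory has to be traversed once to fill the grid, which hints at why the direct computation becomes costly for large ensembles.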

    Contrasting Climate Ensembles: A Model-based Visualization Approach for Analyzing Extreme Events

    The use of increasingly sophisticated means to simulate and observe natural phenomena has led to the production of larger and more complex data. As the size and complexity of this data increases, the task of data analysis becomes more challenging. Determining complex relationships among variables requires new algorithm development. Addressing the challenge of handling large data necessitates that algorithm implementations target high performance computing platforms. In this work we present a technique that allows a user to study the interactions among multiple variables in the same spatial extents as the underlying data. The technique is implemented in an existing parallel analysis and visualization framework in order that it be applicable to the largest datasets. The foundation of our approach is to classify data points via inclusion in, or distance to, multivariate representations of relationships among a subset of the variables of a dataset. We abstract the space in which inclusion is calculated and through various space transformations we alleviate the necessity to consider variables' scales and distributions when making comparisons. We apply this approach to the problem of highlighting variations in climate model ensembles.
