    Progressive Wasserstein Barycenters of Persistence Diagrams

    This paper presents an efficient algorithm for the progressive approximation of Wasserstein barycenters of persistence diagrams, with applications to the visual analysis of ensemble data. Given a set of scalar fields, our approach enables the computation of a persistence diagram which is representative of the set, and which visually conveys the number, data ranges and saliences of the main features of interest found in the set. Such representative diagrams are obtained by computing explicitly the discrete Wasserstein barycenter of the set of persistence diagrams, a notoriously computationally intensive task. In particular, we revisit efficient algorithms for Wasserstein distance approximation [12,51] to extend previous work on barycenter estimation [94]. We present a new fast algorithm, which progressively approximates the barycenter by iteratively increasing the computation accuracy as well as the number of persistent features in the output diagram. This progressivity drastically improves convergence in practice and makes it possible to design an interruptible algorithm, capable of respecting computation time constraints. This enables the approximation of Wasserstein barycenters within interactive times. We present an application to ensemble clustering where we revisit the k-means algorithm to exploit our barycenters and compute, within execution time constraints, meaningful clusters of ensemble data along with their barycenter diagram. Extensive experiments on synthetic and real-life data sets report that our algorithm converges to barycenters that are qualitatively meaningful with regard to the applications, and quantitatively comparable to previous techniques, while offering an order of magnitude speedup when run until convergence (without time constraint). Our algorithm can be trivially parallelized to provide additional speedups in practice on standard workstations. [...
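As a rough illustration of the distance that such barycenters minimize, the sketch below computes the exact p-Wasserstein distance between two small persistence diagrams. It is not the progressive algorithm from the abstract: it swaps in an exact assignment solve (SciPy's `linear_sum_assignment`) for the approximation schemes the paper cites, and the function name `wasserstein_distance`, the Euclidean ground metric, and the default `p=2` are assumptions made for this example.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def wasserstein_distance(diag_a, diag_b, p=2):
    """Exact p-Wasserstein distance between two persistence diagrams.

    Each diagram is a sequence of (birth, death) pairs. Unmatched points
    are sent to their orthogonal projection on the diagonal (assumed
    Euclidean ground metric).
    """
    a = np.asarray(diag_a, dtype=float).reshape(-1, 2)
    b = np.asarray(diag_b, dtype=float).reshape(-1, 2)
    n, m = len(a), len(b)
    if n + m == 0:
        return 0.0

    # Distance from each point to the diagonal: |death - birth| / sqrt(2).
    a_to_diag = np.abs(a[:, 1] - a[:, 0]) / np.sqrt(2.0)
    b_to_diag = np.abs(b[:, 1] - b[:, 0]) / np.sqrt(2.0)

    # Augmented (n + m) x (n + m) cost matrix:
    #   top-left     : point of A matched to point of B
    #   top-right    : point of A matched to the diagonal
    #   bottom-left  : point of B matched to the diagonal
    #   bottom-right : diagonal matched to diagonal (zero cost)
    cost = np.zeros((n + m, n + m))
    cost[:n, :m] = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2) ** p
    cost[:n, m:] = (a_to_diag ** p)[:, None]
    cost[n:, :m] = (b_to_diag ** p)[None, :]

    rows, cols = linear_sum_assignment(cost)
    return float(cost[rows, cols].sum() ** (1.0 / p))
```

A barycenter in this setting is a diagram that minimizes the sum of such distances (raised to the power p) to all diagrams of the ensemble. The exact solve above scales cubically with the number of points, which is one reason approximate and progressive schemes are attractive.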

    Feature-Based Uncertainty Visualization

    While uncertainty in scientific data attracts increasing research interest in the visualization community, two critical issues remain insufficiently studied: (1) visualizing the impact of the uncertainty of a data set on its features and (2) interactively exploring 3D or large 2D data sets with uncertainties. In this study, a suite of feature-based techniques is developed to address these issues. First, a framework of feature-level uncertainty visualization is presented to study the uncertainty of the features in scalar and vector data. The uncertainty in the number and locations of features such as sinks or sources of vector fields is referred to as feature-level uncertainty, while the uncertainty in the numerical values of the data is referred to as data-level uncertainty. The features of different ensemble members are identified and correlated. The feature-level uncertainties are expressed as the transitions between corresponding features through new elliptical glyphs. Second, an interactive visualization tool is developed for exploring scalar data with data-level uncertainty and two types of feature-level uncertainty (contour-level and topology-level). To avoid visual cluttering and occlusion, the uncertainty information is attached to a contour tree instead of being integrated with the visualization of the data. An efficient contour tree-based interface is designed to reduce users’ workload in viewing and analyzing complicated data with uncertainties and to facilitate a quick and accurate selection of prominent contours. This thesis advances the current uncertainty studies with an in-depth investigation of the feature-level uncertainties and an exploration of topology tools for effective and interactive uncertainty visualizations. 
With quantified representation and interactive capability, feature-based visualization helps people gain new insights into the uncertainties of their data, especially the uncertainties of extracted features, which would otherwise remain unknown when only data-level uncertainties are visualized.

    Ovis: A framework for visual analysis of ocean forecast ensembles

    We present a novel integrated visualization system that enables interactive visual analysis of ensemble simulations of the sea surface height that is used in ocean forecasting. The position of eddies can be derived directly from the sea surface height, and our visualization approach enables their interactive exploration and analysis. The behavior of eddies is important in different application settings, of which we present two in this paper. First, we show an application for interactive planning of placement as well as operation of off-shore structures using real-world ensemble simulation data of the Gulf of Mexico. Off-shore structures, such as those used for oil exploration, are vulnerable to hazards caused by eddies, and the oil and gas industry relies on ocean forecasts for efficient operations. We enable analysis of the spatial domain, as well as the temporal evolution, for planning the placement and operation of structures. Eddies are also important for marine life. They transport water over large distances and with it also heat and other physical properties as well as biological organisms. In the second application, we show the usefulness of our tool, which could be used for planning the paths of autonomous underwater vehicles, so-called gliders, for marine scientists to study simulation data of the largely unexplored Red Sea.

    Principal Geodesic Analysis of Merge Trees (and Persistence Diagrams)

    This paper presents a computational framework for the Principal Geodesic Analysis of merge trees (MT-PGA), a novel adaptation of the celebrated Principal Component Analysis (PCA) framework [87] to the Wasserstein metric space of merge trees [92]. We formulate MT-PGA computation as a constrained optimization problem, aiming at adjusting a basis of orthogonal geodesic axes, while minimizing a fitting energy. We introduce an efficient, iterative algorithm which exploits shared-memory parallelism, as well as an analytic expression of the fitting energy gradient, to ensure fast iterations. Our approach also trivially extends to extremum persistence diagrams. Extensive experiments on public ensembles demonstrate the efficiency of our approach, with MT-PGA computations on the order of minutes for the largest examples. We show the utility of our contributions by extending to merge trees two typical PCA applications. First, we apply MT-PGA to data reduction and reliably compress merge trees by concisely representing them by their first coordinates in the MT-PGA basis. Second, we present a dimensionality reduction framework exploiting the first two directions of the MT-PGA basis to generate two-dimensional layouts of the ensemble. We augment these layouts with persistence correlation views, enabling global and local visual inspections of the feature variability in the ensemble. In both applications, quantitative experiments assess the relevance of our framework. Finally, we provide a lightweight C++ implementation that can be used to reproduce our results.

    Visual Analysis of Variability and Features of Climate Simulation Ensembles

    This PhD thesis is concerned with the visual analysis of time-dependent scalar field ensembles as they occur in climate simulations. Modern climate projections consist of multiple simulation runs (ensemble members) that vary in parameter settings and/or initial values, which leads to variations in the resulting simulation data. The goal of ensemble simulations is to sample the space of possible futures under the given climate model and provide quantitative information about uncertainty in the results. The analysis of such data is challenging because, apart from the spatiotemporal data, variability also has to be analyzed and communicated. This thesis presents novel techniques to analyze climate simulation ensembles visually. A central question is how the data can be aggregated with minimal information loss. To address this question, a key technique applied in several places in this work is clustering. The first part of the thesis addresses the challenge of finding clusters in the ensemble simulation data. Various distance metrics lend themselves to the comparison of scalar fields; these are explored theoretically and practically. A visual analytics interface allows the user to interactively explore and compare multiple parameter settings for the clustering and investigate the resulting clusters, i.e. prototypical climate phenomena. A central contribution here is the development of design principles for analyzing variability in decadal climate simulations, which has led to a visualization system centered around the new Clustering Timeline. This is a variant of a Sankey diagram that utilizes clustering results to communicate climatic states over time coupled with ensemble member agreement. 
It can reveal several interesting properties of the dataset, such as: into how many inherently similar groups the ensemble can be divided at any given time, whether the ensemble diverges in general, and whether there are different phases over time, periodicity, or outliers. The Clustering Timeline is also used to compare multiple climate simulation models and assess their performance. The Hierarchical Clustering Timeline is an advanced version of the above. It introduces the concept of a cluster hierarchy that may group the whole dataset, down to the individual static scalar fields, into clusters of various sizes and densities, recording the nesting relationship between them. One more contribution of this work in terms of visualization research is an investigation of how to practically utilize a hierarchical clustering of time-dependent scalar fields to analyze the data. To this end, a system of different views is proposed which are linked through various interaction possibilities. The main advantage of the system is that a dataset can now be inspected at an arbitrary level of detail without having to recompute a clustering with different parameters. Interesting branches of the simulation can be expanded to reveal smaller differences in critical clusters or folded to show only a coarse representation of the less interesting parts of the dataset. The last building block of the suite of visual analysis methods developed for this thesis aims at a robust, (largely) automatic detection and tracking of certain features in a scalar field ensemble. Techniques are presented that I found can identify and track super- and sub-levelsets. From these sets, I derive “centers of action”, which mark the location of extremal climate phenomena that govern the weather (e.g. Icelandic Low and Azores High). 
The thesis also presents visual and quantitative techniques to evaluate the temporal change of the positions of these centers; such a displacement would be likely to manifest in changes in weather. In a preliminary analysis with my collaborators, we indeed observed changes in the loci of the centers of action in a simulation with increased greenhouse gas concentration as compared to pre-industrial concentration levels.

    Gaussian Processes for Uncertainty Visualization

    Data is virtually always uncertain in one way or another. Yet, uncertainty information is not routinely included in visualizations and, outside of simple 1D diagrams, there is no established way to do it. One big issue is to find a method that shows the uncertainty without completely cluttering the display. A second important question that needs to be answered is how uncertainty and interpolation interact. Interpolated values are inherently uncertain, because they are heuristically estimated values – not measurements. But how much more uncertain are they? How can this effect be modeled? In this thesis, we introduce Gaussian processes, a statistical framework that allows for the smooth interpolation of data with heteroscedastic uncertainty through regression. Its theoretical background makes it a convincing method to analyze uncertain data and create a model of the underlying phenomenon and, most importantly, the uncertainty at and in-between the data points. For this reason, it is already popular in the GIS community, where it is known as Kriging, but it has applications in machine learning too. In contrast to traditional interpolation methods, Gaussian processes do not merely create a surface that runs through the data points, but respect the uncertainty in them. This way, noise, errors or outliers in the data do not disturb the model inappropriately. Most importantly, the model shows the variance in the interpolated values, which can be higher but also lower than that of its neighboring data points, providing us with a lot more insight into the quality of our data and how it influences our uncertainty. This enables us to use uncertainty information in algorithms that need to interpolate between data points, which includes almost all visualization algorithms.
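The interplay of interpolation and per-point uncertainty described in this abstract can be sketched with a small Gaussian process regression in NumPy. This is a minimal sketch, not the thesis's implementation: the squared-exponential kernel, the zero-mean prior, the hyperparameter defaults, and the names `rbf_kernel` and `gp_posterior` are all assumptions chosen for illustration. Heteroscedastic uncertainty enters as a separate noise variance for each data point.

```python
import numpy as np


def rbf_kernel(xa, xb, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between two 1D point sets."""
    diff = xa[:, None] - xb[None, :]
    return signal_var * np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_posterior(x_train, y_train, noise_var, x_test,
                 length_scale=1.0, signal_var=1.0):
    """Posterior mean and variance of the latent function at x_test.

    noise_var holds one variance per training point, so uncertain
    measurements pull the model less than precise ones.
    """
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    noise_var = np.asarray(noise_var, dtype=float)
    x_test = np.asarray(x_test, dtype=float)

    # Training covariance with per-point noise on the diagonal.
    k_train = rbf_kernel(x_train, x_train, length_scale, signal_var)
    k_train = k_train + np.diag(noise_var)
    k_cross = rbf_kernel(x_train, x_test, length_scale, signal_var)

    # Cholesky-based solve for numerical stability.
    chol = np.linalg.cholesky(k_train)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_cross.T @ alpha

    # Prior variance minus the part explained by the observations.
    v = np.linalg.solve(chol, k_cross)
    var = signal_var - np.sum(v ** 2, axis=0)
    return mean, var


# Middle measurement is far noisier than its two neighbors.
mean, var = gp_posterior(
    x_train=[0.0, 1.0, 2.0],
    y_train=[0.0, 1.0, 0.0],
    noise_var=[0.01, 1.0, 0.01],
    x_test=[0.0, 1.0, 5.0],
)
```

In this toy setup the predicted variance at the noisy middle point is larger than at its precise neighbors, yet smaller than that point's own measurement variance, because the neighbors constrain the model there. Far from all data, the variance returns to the prior value.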