2,289 research outputs found

    Doctor of Philosophy

    Correlation is a powerful relationship measure used in many fields to estimate trends and make forecasts. When the data are complex, large, and high dimensional, correlation identification is challenging. Several visualization methods have been proposed to solve these problems, but they all have limitations in accuracy, speed, or scalability. In this dissertation, we propose a methodology that provides new visual designs that show details when possible and aggregates when necessary, along with robust interactive mechanisms that together enable quick identification and investigation of meaningful relationships in large and high-dimensional data. We propose four techniques using this methodology. Depending on data size and dimensionality, the most appropriate visualization technique can be provided to optimize the analysis performance. First, to improve correlation identification tasks between two dimensions, we propose a new correlation task-specific visualization method called correlation coordinate plot (CCP). CCP transforms data into a powerful coordinate system for estimating the direction and strength of correlations among dimensions. Next, we propose three visualization designs to optimize correlation identification tasks in large and multidimensional data. The first is snowflake visualization (Snowflake), a focus+context layout for exploring all pairwise correlations. The next proposed design is a new interactive design for representing and exploring data relationships in parallel coordinate plots (PCPs) for large data, called data scalable parallel coordinate plots (DSPCP). Finally, we propose a novel technique for storing and accessing the multiway dependencies through visualization (MultiDepViz). We evaluate these approaches through various use cases, compare them to prior work, and conduct user studies to demonstrate how our proposed approaches help users explore correlation in large data efficiently. 
Our results confirmed that the CCP/Snowflake, DSPCP, and MultiDepViz methods outperform some current visualization techniques, such as scatterplots (SCPs), PCPs, SCP matrix, Corrgram, Angular Histogram, and UntangleMap, in both accuracy and timing. Finally, these approaches are applied in real-world applications such as a debugging tool, large-scale code performance data, and large-scale climate data.

    Angular-based Edge Bundled Parallel Coordinates Plot for the Visual Analysis of Large Ensemble Simulation Data

    With the continuous increase in the computational power and resources of modern high-performance computing (HPC) systems, large-scale ensemble simulations have become widely used in various fields of science and engineering, especially in meteorological and climate science. It is widely known that the simulation outputs are large, time-varying, multivariate, and multivalued datasets, which pose a particular challenge for visualization and analysis tasks. In this work, we focus on the widely used Parallel Coordinates Plot (PCP) to analyze the interrelations between different parameters, such as variables, among the ensemble members. However, PCP may suffer from visual clutter and poor drawing performance as the size of the data to be analyzed, that is, the number of polylines, increases. To overcome this problem, we present an extension to the PCP that adds Bézier curves connecting angular distribution plots, which represent the mean and variance of the inclination of the line segments between parallel axes. The proposed Angular-based Parallel Coordinates Plot (APCP) is capable of presenting a simplified overview of the entire ensemble data set while maintaining the correlation information between the adjacent variables. To verify its effectiveness, we developed a visual analytics prototype system and evaluated it using meteorological ensemble simulation output from the supercomputer Fugaku.
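
    The angular summary mentioned in this abstract (mean and variance of the inclination of the line segments between two adjacent axes) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the min-max normalization of each axis, and the unit spacing between axes (`axis_gap`) are assumptions made only for illustration.

```python
import math
from statistics import fmean, pvariance


def normalize(values):
    """Min-max scale one axis to [0, 1]; a constant axis maps to 0.5 (assumed convention)."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [0.5 if span == 0 else (v - lo) / span for v in values]


def segment_angles(left, right, axis_gap=1.0):
    """Inclination in radians of each polyline segment between two adjacent axes."""
    if len(left) != len(right):
        raise ValueError("both axes need one value per ensemble member")
    # axis_gap > 0 keeps every angle inside (-pi/2, pi/2), so no circular wrap-around occurs.
    return [math.atan2(r - l, axis_gap) for l, r in zip(left, right)]


def angular_summary(left, right, axis_gap=1.0):
    """Return (mean, population variance) of the segment inclinations."""
    angles = segment_angles(normalize(left), normalize(right), axis_gap)
    return fmean(angles), pvariance(angles)
```

    Under these assumptions, a perfectly positive relation between two variables yields parallel segments (variance near zero), while a negative relation yields crossing segments and a larger angular spread, which is the kind of correlation cue the abstract says is preserved.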

    Visual Analysis of Variability and Features of Climate Simulation Ensembles

    This PhD thesis is concerned with the visual analysis of time-dependent scalar field ensembles as they occur in climate simulations. Modern climate projections consist of multiple simulation runs (ensemble members) that vary in parameter settings and/or initial values, which leads to variations in the resulting simulation data. The goal of ensemble simulations is to sample the space of possible futures under the given climate model and provide quantitative information about uncertainty in the results. The analysis of such data is challenging because, apart from the spatiotemporal data, variability also has to be analyzed and communicated. This thesis presents novel techniques to analyze climate simulation ensembles visually. A central question is how the data can be aggregated with minimal information loss. To address this question, a key technique applied in several places in this work is clustering. The first part of the thesis addresses the challenge of finding clusters in the ensemble simulation data. Various distance metrics lend themselves to the comparison of scalar fields; these are explored theoretically and practically. A visual analytics interface allows the user to interactively explore and compare multiple parameter settings for the clustering and investigate the resulting clusters, i.e. prototypical climate phenomena. A central contribution here is the development of design principles for analyzing variability in decadal climate simulations, which has led to a visualization system centered around the new Clustering Timeline. This is a variant of a Sankey diagram that utilizes clustering results to communicate climatic states over time coupled with ensemble member agreement. 
It can reveal several interesting properties of the dataset, such as how many inherently similar groups the ensemble can be divided into at any given time, whether the ensemble diverges in general, whether there are different phases over time, and whether there is periodicity or there are outliers. The Clustering Timeline is also used to compare multiple climate simulation models and assess their performance. The Hierarchical Clustering Timeline is an advanced version of the above. It introduces the concept of a cluster hierarchy that may group the whole dataset, down to the individual static scalar fields, into clusters of various sizes and densities, recording the nesting relationship between them. A further contribution of this work to visualization research is an investigation of how to practically utilize a hierarchical clustering of time-dependent scalar fields to analyze the data. To this end, a system of different views is proposed which are linked through various interaction possibilities. The main advantage of the system is that a dataset can now be inspected at an arbitrary level of detail without having to recompute a clustering with different parameters. Interesting branches of the simulation can be expanded to reveal smaller differences in critical clusters or folded to show only a coarse representation of the less interesting parts of the dataset. The last building block of the suite of visual analysis methods developed for this thesis aims at a robust, (largely) automatic detection and tracking of certain features in a scalar field ensemble. Techniques are presented that can identify and track super- and sub-levelsets. From these sets I derive “centers of action”, which mark the location of extremal climate phenomena that govern the weather (e.g. Icelandic Low and Azores High). 
The thesis also presents visual and quantitative techniques to evaluate the temporal change of the positions of these centers; such a displacement would be likely to manifest in changes in weather. In a preliminary analysis with my collaborators, we indeed observed changes in the loci of the centers of action in a simulation with increased greenhouse gas concentration as compared to pre-industrial concentration levels.

    A Fast and Scalable System to Visualize Contour Gradient from Spatio-temporal Data

    Changes in geological processes that span years may often go unnoticed due to their inherent noise and variability. Natural phenomena such as riverbank erosion, and climate change in general, are invisible to humans unless appropriate measures are taken to analyze the underlying data. Visualization helps the geological sciences generate scientific insights into such long-term geological events. Commonly used approaches such as side-by-side contour plots and spaghetti plots do not provide a clear idea of the historical spatial trends. To overcome this challenge, we propose an image-gradient based approach called ContourDiff. ContourDiff overlays gradient vectors over contour plots to analyze the trends of change across spatial regions and the temporal domain. Our approach first aggregates, for each location, its value differences from the neighboring points over the temporal domain, and then creates a vector field representing the prominent changes. Finally, it overlays the vectors (differential trends) along the contour paths, revealing the differential trends that the contour lines (isolines) experienced over time. We designed an interface where users can interact with the generated visualization to reveal changes and trends in geospatial data. We evaluated our system using real-life datasets consisting of millions of data points, where the visualizations were generated in less than a minute in a single-threaded execution. We show the potential of the system in detecting subtle changes from almost identical images, and describe implementation challenges, speed-up techniques, and scope for improvements. Our experimental results reveal that ContourDiff can reliably visualize the differential trends and provide a new way to explore the change pattern in spatiotemporal data. 
The expert evaluation of our system using real-life WRF (Weather Research and Forecasting) model output reveals the potential of our technique to generate useful insights on the spatio-temporal trends of geospatial variables.
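
    The aggregation step described in this abstract (per-location value differences from neighboring points, accumulated over time into a vector field) can be sketched roughly as follows. This is one loose reading of the abstract, not the ContourDiff implementation: the function name, the four-neighbor stencil, the clamped borders, and the plain summation over frames are all assumptions.

```python
def aggregated_gradient(frames):
    """Sum neighbor differences over time for every grid cell.

    frames: list of equally shaped 2D grids (lists of rows), one per time step.
    Returns a grid of (gx, gy) tuples, where gx accumulates right-minus-left and
    gy accumulates down-minus-up differences (hypothetical convention).
    """
    if not frames:
        raise ValueError("at least one frame is required")
    rows, cols = len(frames[0]), len(frames[0][0])
    field = [[(0.0, 0.0)] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            gx = gy = 0.0
            for grid in frames:
                # Clamp indices at the borders so edge cells reuse their own value.
                left = grid[r][max(c - 1, 0)]
                right = grid[r][min(c + 1, cols - 1)]
                up = grid[max(r - 1, 0)][c]
                down = grid[min(r + 1, rows - 1)][c]
                gx += right - left
                gy += down - up
            field[r][c] = (gx, gy)
    return field
```

    In such a sketch, the resulting vectors would then be sampled along the contour paths and drawn on top of the contour plot; that rendering step is omitted here.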

    NNVA: Neural Network Assisted Visual Analysis of Yeast Cell Polarization Simulation

    Complex computational models are often designed to simulate real-world physical phenomena in many scientific disciplines. However, these simulation models tend to be computationally very expensive and involve a large number of simulation input parameters, which need to be analyzed and properly calibrated before the models can be applied in real scientific studies. We propose a visual analysis system to facilitate interactive exploratory analysis of the high-dimensional input parameter space of a complex yeast cell polarization simulation. The proposed system can assist the computational biologists who designed the simulation model to visually calibrate the input parameters by modifying the parameter values and immediately visualizing the predicted simulation outcome, without needing to run the original expensive simulation for every instance. Our proposed visual analysis system is driven by a trained neural network-based surrogate model as the backend analysis framework. Surrogate models are widely used in the field of simulation sciences to efficiently analyze computationally expensive simulation models. In this work, we demonstrate the advantage of using neural networks as surrogate models for visual analysis by incorporating some of the recent advances in the field of uncertainty quantification, interpretability, and explainability of neural network-based models. We utilize the trained network to perform interactive parameter sensitivity analysis of the original simulation at multiple levels of detail, as well as to recommend optimal parameter configurations using the activation maximization framework of neural networks. We also facilitate detailed analysis of the trained network to extract useful insights about the simulation model that the network learned during the training process. Comment: Published at IEEE Transactions on Visualization and Computer Graphics

    Climate Services Toolbox (CSTools) v4.0: from climate forecasts to climate forecast information

    Despite the wealth of existing climate forecast data, only a small part is effectively exploited for sectoral applications. A major cause of this is the lack of integrated tools that allow the translation of data into useful and skillful climate information. This barrier is addressed through the development of an R package. Climate Services Toolbox (CSTools) is an easy-to-use toolbox designed and built to assess and improve the quality of climate forecasts for seasonal to multi-annual scales. The package contains process-based, state-of-the-art methods for forecast calibration, bias correction, statistical and stochastic downscaling, optimal forecast combination, and multivariate verification, as well as basic and advanced tools to obtain tailored products. Due to the modular design of the toolbox in individual functions, users can develop their own post-processing chain of functions, as shown in the use cases presented in this paper, including the analysis of an extreme wind speed event, the generation of seasonal forecasts of snow depth based on the SNOWPACK model, and the post-processing of temperature and precipitation data to be used as input in impact models. This research has been supported by the Horizon 2020 (S2S4E; grant no. 776787), EUCP (grant no. 776613), ERA4CS (grant no. 690462), and the Ministerio de Ciencia e Innovación (grant no. FPI PRE2019-088646). Peer Reviewed. "Article signed by 19 authors: Núria Pérez-Zanón, Louis-Philippe Caron, Silvia Terzago, Bert Van Schaeybroeck, Llorenç Lledó, Nicolau Manubens, Emmanuel Roulin, M. Carmen Alvarez-Castro, Lauriane Batté, Pierre-Antoine Bretonnière, Susana Corti, Carlos Delgado-Torres, Marta Domínguez, Federico Fabiano, Ignazio Giuntoli, Jost von Hardenberg, Eroteida Sánchez-García, Verónica Torralba, and Deborah Verfaillie" Postprint (published version)

    voxel2vec: A Natural Language Processing Approach to Learning Distributed Representations for Scientific Data

    Relationships in scientific data, such as the numerical and spatial distribution relations of features in univariate data, the relations of scalar-value combinations in multivariate data, and the association of volumes in time-varying and ensemble data, are intricate and complex. This paper presents voxel2vec, a novel unsupervised representation learning model, which is used to learn distributed representations of scalar values/scalar-value combinations in a low-dimensional vector space. Its basic assumption is that if two scalar values/scalar-value combinations have similar contexts, they usually have high similarity in terms of features. By representing scalar values/scalar-value combinations as symbols, voxel2vec learns the similarity between them in the context of spatial distribution and then allows us to explore the overall association between volumes by transfer prediction. We demonstrate the usefulness and effectiveness of voxel2vec by comparing it with the isosurface similarity map of univariate data and applying the learned distributed representations to feature classification for multivariate data and to association analysis for time-varying and ensemble data. Comment: Accepted by IEEE Transactions on Visualization and Computer Graphics (TVCG)

    Investigations of Environmental Effects on Freeway Acoustics

    The role of environmental factors that influence atmospheric propagation of sound originating from freeway noise sources is studied with a combination of field experiments and numerical simulations. Acoustic propagation models are developed and adapted for a refractive index that depends on meteorological conditions. A high-resolution multi-nested environmental forecasting model forced by coarse global analysis is applied to predict real meteorological profiles at fine scales. These profiles are then used as input for the acoustic models. Numerical methods for producing higher resolution acoustic refractive index fields are proposed. These include spatial and temporal nested meteorological simulations with vertical grid refinement. It is shown that vertical nesting can improve the prediction of finer structures in near-ground temperature and velocity profiles, such as morning temperature inversions and low level jet-like features. Accurate representation of these features is shown to be important for modeling sound refraction phenomena and for enabling accurate noise assessment. Comparisons are made using the acoustic model for predictions with profiles derived from meteorological simulations and from field experiment observations in Phoenix, Arizona. The challenges faced in simulating accurate meteorological profiles at high resolution for sound propagation applications are highlighted and areas for possible improvement are discussed. A detailed evaluation of the environmental forecast is conducted by investigating the Surface Energy Balance (SEB) obtained from observations made with an eddy-covariance flux tower compared with SEB from simulations using several physical parameterizations of urban effects and planetary boundary layer schemes. Diurnal variations in SEB constituent fluxes are examined in relation to surface layer stability and modeled diagnostic variables. 
Improvement is found when adapting parameterizations for Phoenix, with reduced errors in the SEB components. Finer model resolution (to 333 m) is seen to have an insignificant ($<1\sigma$) influence on the mean absolute percent difference of 30-minute diurnal mean SEB terms. A new method of representing inhomogeneous urban development density derived from observations of impervious surfaces with sub-grid scale resolution is then proposed for mesoscale applications. This method was implemented and evaluated within the environmental modeling framework. Finally, a new semi-implicit scheme based on Leapfrog and a fourth-order implicit time-filter is developed. Dissertation/Thesis. Doctoral Dissertation, Mechanical Engineering, 201