3 research outputs found

    Effective Visualization Approaches For Ultra-High Dimensional Datasets

    Multivariate informational data, which are abstract as well as complex, are becoming increasingly common in scientific, medical, social, business, and other domains. Displaying and analyzing large amounts of multivariate data with more than three variables of different types is challenging. Visualizations of such data suffer from heavy clutter when the numbers of dimensions (variables) and observations become too large. We propose multiple approaches to effectively visualize large datasets with an ultra-high number of dimensions by generalizing two standard multivariate visualization methods, namely the star plot and the parallel coordinates plot. We refine three variants of the star plot (the overlapped star plot, the shifted origin plot, and the multilevel star plot) by embedding distribution plots, displaying the dataset in groups, and supporting adjustable positioning of the star axes. We introduce a bifocal parallel coordinates plot (BPCP) based on the focus + context approach. BPCP splits the overall rendering area vertically into focus and context regions. The focus area maps a few selected dimensions of interest at sufficiently wide spacing. The remaining dimensions are represented compactly in the context area to retain useful information and preserve data continuity. The focus display can be further enriched with options such as axis overlays, scatterplots, and nested PCPs. To accommodate an arbitrarily large number of dimensions, the context display supports a multi-level stacked view. Finally, we present two innovative ways of enhancing parallel coordinates axes to better understand all variables and their interrelationships in high-dimensional datasets. Histograms and circle/ellipse plots based on uniform and non-uniform frequency/density mappings are adopted to visualize the distributions of numerical and categorical data values.
Color-mapped axis stripes are designed into the parallel coordinates layout so that correlations can be seen in the same display plot irrespective of axis locations. These colors are also propagated to histograms as stacked bars and to categorical values as pie charts to further facilitate data exploration. Using datasets of 25 to 130 variables of different data types, we demonstrate the effectiveness of the proposed multivariate visualization enhancements.
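The focus + context split described in this abstract can be illustrated with a small axis-placement sketch. This is not the authors' implementation: the function name `bifocal_axis_positions`, the `focus_fraction` parameter, and the choice to put the focus region on the left are all assumptions made for illustration.

```python
def bifocal_axis_positions(n_dims, focus, width=1000.0, focus_fraction=0.7):
    """Return {dimension index: x position} for a focus + context layout.

    Assumption (not from the abstract): the focus region is the left
    `focus_fraction` of the plot and the context region is the rest.
    """
    focus = list(focus)
    focus_set = set(focus)
    context = [d for d in range(n_dims) if d not in focus_set]
    split = width * focus_fraction

    def spread(dims, lo, hi):
        # Place each axis at the centre of an equal-width slot in [lo, hi].
        if not dims:
            return {}
        slot = (hi - lo) / len(dims)
        return {d: lo + slot * (i + 0.5) for i, d in enumerate(dims)}

    positions = spread(focus, 0.0, split)          # widely spaced axes
    positions.update(spread(context, split, width))  # compact axes
    return positions


if __name__ == "__main__":
    pos = bifocal_axis_positions(10, focus=[2, 5], focus_fraction=0.8)
    for dim in sorted(pos, key=pos.get):
        print(dim, round(pos[dim], 1))
```

With two focus dimensions out of ten, the two selected axes share most of the width while the other eight are packed into the remaining strip, which is the basic trade-off the focus + context approach makes.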

    Explorative coastal oceanographic visual analytics : oceans of data

    The widely acknowledged challenge to data analysis and understanding, resulting from the exponential increase in the volume of data generated by increasingly complex modelling and sampling systems, is a problem experienced by many researchers, including ocean scientists. The thesis explores a visualization and visual analytics solution for predictive studies of modelled coastal shelf and estuarine hydrodynamics, undertaken to understand sea level rise as a contribution to wider climate change studies, and to underpin coastal zone planning, flood prevention and extreme event management. These studies are complex and require numerous simulations of estuarine hydrodynamics, generating extremely large datasets of multi-field data. This type of data is acknowledged as difficult to visualize and analyse, as its numerous attributes present significant computational challenges and ideally require a wide range of approaches to provide the necessary insight. These challenges are not easily overcome with the visualization and analysis methodologies currently employed by coastal shelf hydrodynamic researchers, who use several software systems to generate graphs, each taking considerable time to operate; as a result, it is difficult to explore different scenarios and to explore the data interactively and visually. The thesis therefore develops novel visualization and visual analytics techniques to help researchers overcome the limitations of existing methods (for example, in understanding key tidal components), analyse data in a timely manner, and explore different scenarios. There were a number of challenges, including the size of the data, which results in lengthy computing times, and overplotting, where many data values are plotted on a single pixel.
The thesis presents: (1) a new visualization framework (VINCA) that uses caching and hierarchical aggregation techniques to make the data more interactive, together with explorative, coordinated multiple views that enable scientists to explore the data. (2) A novel estuarine transect profiler and flux tool, which provides instantaneous flux calculations across an estuary. Measures of flux are of great significance in oceanographic studies, yet are notoriously difficult and time-consuming to calculate with the commonly used tools. This derived data is added back into the database for further investigation and analysis. (3) New views, including a novel, dynamic, spatially aggregated Parallel Coordinate Plot (Sa-PCP), developed to provide different perspectives on the spatial, time-dependent data, along with methodologies for producing high-quality (journal-ready) output from the visualization tool. Finally, (4) the dissertation explores the use of hierarchical data structures and caching techniques to enable fast analysis on a desktop computer and to overcome the overplotting challenge for this data.
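As a rough illustration of how aggregation can counter overplotting, the sketch below bins points into per-pixel counts and then merges pixels into a coarser level. It is a generic binning example under stated assumptions, not the VINCA framework: the function names `bin_to_pixels` and `coarsen`, and the 2 x 2 merge used to form the hierarchy, are hypothetical.

```python
from collections import Counter


def bin_to_pixels(points, x_range, y_range, width, height):
    """Count how many (x, y) points fall on each pixel of a width x height grid."""
    (x0, x1), (y0, y1) = x_range, y_range
    counts = Counter()
    for x, y in points:
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            continue  # skip points outside the plotted extent
        # Clamp so that values exactly on the upper bound land in the last pixel.
        col = min(int((x - x0) / (x1 - x0) * width), width - 1)
        row = min(int((y - y0) / (y1 - y0) * height), height - 1)
        counts[(col, row)] += 1
    return counts


def coarsen(counts):
    """Build the next, coarser level by merging each 2 x 2 block of pixels."""
    coarse = Counter()
    for (col, row), n in counts.items():
        coarse[(col // 2, row // 2)] += n
    return coarse


if __name__ == "__main__":
    pts = [(0.10, 0.10), (0.12, 0.11), (0.90, 0.90), (1.00, 1.00)]
    fine = bin_to_pixels(pts, (0.0, 1.0), (0.0, 1.0), width=4, height=4)
    print(dict(fine))
    print(dict(coarsen(fine)))
```

Drawing one mark per occupied pixel, coloured by its count, keeps dense regions readable, and the coarser levels can be cached and reused when the view is zoomed out.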