
    Hybrid visualizations for data exploration

    Information Visualization (Infovis) graphically encodes information to help a user explore a data set visually and interactively. This graphical encoding can take the form of widespread visualizations such as bar charts and scatterplots. Multiple visualizations can share the same functional space to form complete tools for visual exploration or for communicating information. There are multiple ways of combining these visualizations. The resulting complex assemblies are sometimes called hybrid visualizations: a hybrid visualization is the result of assembling multiple simpler visualizations. For example, NodeTrix (Henry et al., 2007a) is composed of a node-link diagram and an adjacency matrix, and MatLink (Henry and Fekete, 2007a) adds arc links to an adjacency matrix. This integration of multiple visualizations can be a way to combine their advantages into a coherent structure. The integration can be achieved, for example, through color coding, through explicit linking (such as with arrows), or through interaction (such as when different visualizations respond to the manipulation of others). Recent literature contains several examples of new hybrid visualizations, most often to deal with complex datasets where the user can benefit from multiple, complementary visual encodings of the same data. However, to date, there is almost no theory or framework to help researchers understand and characterize existing hybrids or design new ones. This thesis advances the state of the art in hybrid visualizations in two ways: first, by developing a framework that defines and characterizes hybrid visualizations to help better identify, describe and design them, and second, by demonstrating a variety of novel hybrids. The hybrid visualizations we explored cover a wide range of possibilities.
Two of the most general and widely used data types in Infovis, multidimensional multivariate data and graph (i.e., network) data, are each the subject of a chapter in the thesis, with novel hybrid visualization techniques presented for each. A wide range of possibilities for integration is also presented using a pipeline model. After some preliminary material, chapter 2 of the thesis presents a conceptual framework that defines and characterizes hybrid visualizations. This framework was itself derived from experience designing the hybrid visualizations presented in the subsequent chapters. A hybrid visualization is described as a graphical encoding using other visualizations as building blocks. We present a pipeline to illustrate the assembly of a visualization: basic shapes or glyphs are generated, placed on a layout, embellished with other graphical elements, passed through view transform operators, and assembled in the same space. This pipeline can describe simple charts as well as more complex assemblies and new hybrids. Chapter 3 presents ConnectedCharts, an example of a hybrid assembled at the assembly level of the pipeline, made of multiple multidimensional and multivariate charts explicitly connected by lines or curves showing the relationship between their elements. A user interface enables the interactive assembly of ConnectedCharts, including a wide range of previously published hybrid visualizations, as well as novel hybrid arrangements. ConnectedCharts serve as an illustration of the conceptual framework in chapter 2, by exploring possible connections between different graphics depending on the relationship of their encoded data types. Chapter 4 presents another user interface, this time for graph exploration, that incorporates several highly integrated hybrid visualizations.
A Parallel Scatter Plot Matrix (P-SPLOM) is presented that constitutes a fusion of a Scatter Plot Matrix (SPLOM) and a Parallel Coordinates Plot (PCP). A radial menu called the FlowVizMenu enables the modification of a visualization integrated at the center of the menu. This menu is also used to select the dimensions for configuring a third hybrid based on an Attribute-Driven Layout (ADL) that combines a node-link diagram and a scatterplot. The characterization of hybrid visualizations offered by the conceptual framework, as well as the illustration of the framework by innovative hybrid visualizations, are the main contributions of this thesis to the Infovis community.
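
The five pipeline stages named in this abstract (glyph generation, layout, embellishment, view transform, assembly) can be pictured with a small sketch. This is only an illustration under stated assumptions, not the thesis's implementation: the `Mark` class, every function name, and the idea of linking marks by list position are hypothetical choices made here to show how two simple charts might be assembled into a connected hybrid.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


@dataclass
class Mark:
    """One graphical element: a glyph, its position, and any embellishments."""
    shape: str
    x: float = 0.0
    y: float = 0.0
    decorations: List[str] = field(default_factory=list)


def generate_glyphs(n: int, shape: str = "point") -> List[Mark]:
    """Stage 1: create one basic glyph per data item."""
    return [Mark(shape) for _ in range(n)]


def apply_layout(marks: List[Mark], positions: Sequence[Point]) -> List[Mark]:
    """Stage 2: place each glyph (here, a scatterplot-style layout)."""
    for mark, (x, y) in zip(marks, positions):
        mark.x, mark.y = x, y
    return marks


def embellish(marks: List[Mark], label: str) -> List[Mark]:
    """Stage 3: attach extra graphical elements (here, just a text label)."""
    for mark in marks:
        mark.decorations.append(label)
    return marks


def view_transform(marks: List[Mark], dx: float, dy: float, scale: float = 1.0) -> List[Mark]:
    """Stage 4: scale and translate a chart into its region of the shared space."""
    return [Mark(m.shape, m.x * scale + dx, m.y * scale + dy, list(m.decorations))
            for m in marks]


def assemble(view_a: List[Mark], view_b: List[Mark]):
    """Stage 5: share one space and link marks that encode the same data item.

    Assumption: the i-th mark of each chart represents the same item.
    """
    links = [((a.x, a.y), (b.x, b.y)) for a, b in zip(view_a, view_b)]
    return view_a + view_b, links


def build_chart(positions: Sequence[Point], label: str,
                dx: float, dy: float, scale: float = 1.0) -> List[Mark]:
    """Run stages 1 to 4 for a single simple chart."""
    marks = generate_glyphs(len(positions))
    marks = apply_layout(marks, positions)
    marks = embellish(marks, label)
    return view_transform(marks, dx, dy, scale)


# Two charts over the same two items, placed side by side and connected.
items_xy = [(0.0, 0.0), (1.0, 2.0)]
items_zw = [(3.0, 1.0), (2.0, 4.0)]
left = build_chart(items_xy, "x vs y", dx=0.0, dy=0.0)
right = build_chart(items_zw, "z vs w", dx=10.0, dy=0.0, scale=0.5)
hybrid_marks, hybrid_links = assemble(left, right)
```

In this sketch a simple chart is just the first four stages, and the hybrid only appears at the assembly stage, which is where the abstract places ConnectedCharts.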

    Explanatory visualization of multidimensional projections


    Visual Integration of Data and Model Space in Ensemble Learning

    Ensembles of classifier models typically deliver superior performance and can outperform single classifier models for a given dataset and classification task. However, the gain in performance comes with a loss of comprehensibility, making it challenging to understand how each model affects the classification outputs and where the errors come from. We propose a tight visual integration of the data and the model space for exploring and combining classifier models. We introduce a workflow that builds upon the visual integration and enables the effective exploration of classification outputs and models. We then present a use case in which we start with an ensemble automatically selected by a standard ensemble selection algorithm, and show how we can manipulate models and alternative combinations. Comment: 8 pages, 7 pictures

    Self-supervised Dimensionality Reduction with Neural Networks and Pseudo-labeling

    Dimensionality reduction (DR) is used to explore high-dimensional data in many applications. Deep learning techniques such as autoencoders have been used to provide fast, simple to use, and high-quality DR. However, such methods yield worse visual cluster separation than popular methods such as t-SNE and UMAP. We propose a deep learning DR method called Self-Supervised Network Projection (SSNP) which does DR based on pseudo-labels obtained from clustering. We show that SSNP produces better cluster separation than autoencoders, has out-of-sample, inverse mapping, and clustering capabilities, and is very fast and easy to use.
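
The pseudo-labeling idea mentioned in this abstract can be sketched in a few lines. The abstract does not say which clustering algorithm SSNP uses, so plain k-means with a deterministic start is an assumption made here, and the function name `pseudo_labels` is hypothetical. The neural network that would then be trained on these labels is left out.

```python
import math
from typing import List, Sequence, Tuple


def pseudo_labels(points: Sequence[Tuple[float, ...]], k: int,
                  iterations: int = 20) -> List[int]:
    """Assign a cluster index to every point with a basic k-means loop.

    Assumption: the first k points serve as initial centers, which keeps
    the sketch deterministic. Real pipelines usually use a better start.
    """
    centers = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest center by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return labels


# Two well-separated groups; the labels can stand in for class labels
# when no ground truth is available.
data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels = pseudo_labels(data, k=2)
```

The resulting labels play the role that ground-truth classes would play in a supervised setting, which is what makes the approach self-supervised.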

    Doctor of Philosophy

    With the ever-increasing amount of available computing resources and sensing devices, a wide variety of high-dimensional datasets are being produced in numerous fields. The complexity and increasing popularity of these data have led to new challenges and opportunities in visualization. Since most display devices are limited to communication through two-dimensional (2D) images, many visualization methods rely on 2D projections to express high-dimensional information. Such a reduction of dimension leads to an explosion in the number of 2D representations required to visualize high-dimensional spaces, each giving a glimpse of the high-dimensional information. As a result, one of the most important challenges in visualizing high-dimensional datasets is the automatic filtration and summarization of the large exploration space consisting of all 2D projections. In this dissertation, a new type of algorithm is introduced that reduces the exploration space by identifying a small set of projections that capture the intrinsic structure of high-dimensional data. In addition, a general framework for summarizing the structure of quality measures in the space of all linear 2D projections is presented. However, identifying the representative or informative projections is only part of the challenge. Due to the high-dimensional nature of these datasets, obtaining insights and arriving at conclusions based solely on 2D representations is limited and prone to error. How to interpret the inaccuracies and resolve the ambiguity in the 2D projections is the other half of the puzzle. This dissertation introduces projection distortion error measures and interactive manipulation schemes that allow the understanding of high-dimensional structures via data manipulation in 2D projections.
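
To make the notion of projection distortion concrete, here is a minimal sketch of one possible per-point error measure. The abstract does not define the dissertation's measures, so this is a generic stand-in and an assumption: for each point, the mean absolute difference between its distances to the other points before and after an axis-aligned linear projection. The names `project` and `per_point_distortion` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def project(points: Sequence[Tuple[float, ...]],
            axes: Tuple[int, int] = (0, 1)) -> List[Tuple[float, float]]:
    """Axis-aligned linear 2D projection: keep only the two chosen coordinates."""
    return [(p[axes[0]], p[axes[1]]) for p in points]


def per_point_distortion(high: Sequence[Tuple[float, ...]],
                         low: Sequence[Tuple[float, float]]) -> List[float]:
    """Mean absolute change in distance from each point to all the others.

    Requires at least two points. A large value flags a point whose 2D
    neighborhood misrepresents its high-dimensional neighborhood.
    """
    n = len(high)
    scores = []
    for i in range(n):
        diffs = [abs(math.dist(high[i], high[j]) - math.dist(low[i], low[j]))
                 for j in range(n) if j != i]
        scores.append(sum(diffs) / len(diffs))
    return scores


# The third point differs from the first only along the dropped axis,
# so the projection places it exactly on top of the first point.
data = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 5.0)]
flat = project(data, axes=(0, 1))
scores = per_point_distortion(data, flat)
worst = max(range(len(scores)), key=scores.__getitem__)
```

Here the third point receives the highest score, which is the kind of ambiguity a viewer cannot see in the 2D picture alone.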