14 research outputs found

    An Information-Theoretic Framework for Evaluating Edge Bundling Visualization

    Edge bundling is a promising graph visualization approach that simplifies the visual result of a graph drawing. Plenty of edge bundling methods have been developed to generate diverse graph layouts. However, it is difficult to defend an edge bundling method and its resulting layout against other edge bundling methods, because a clear theoretical evaluation framework is absent from the literature. In this paper, we propose an information-theoretic framework to evaluate the visual results of edge bundling techniques. First, we illustrate the advantage of edge bundling visualizations for large graphs and pinpoint the ambiguity in the resulting drawings. Second, we use information theory to define and quantify the amount of information that an edge bundling visualization delivers about the underlying network. Third, we propose a new algorithm that evaluates edge bundling layouts using the mutual information between a raw network dataset and its edge bundling visualization. Comparison examples between different edge bundling techniques, based on the proposed framework, are presented.
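As a rough illustration of the quantity this abstract refers to, the sketch below computes mutual information from paired observations. It is not the algorithm proposed in the paper: the function name `mutual_information` and the toy model (X is an edge of the raw network, Y is the bundle that edge is drawn in) are assumptions made only for this example.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Return I(X;Y) in bits, estimated from a list of (x, y) observations."""
    n = len(pairs)
    joint = Counter(pairs)               # counts of each (x, y) pair
    px = Counter(x for x, _ in pairs)    # marginal counts of x
    py = Counter(y for _, y in pairs)    # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) / (p(x) p(y)) simplifies to c * n / (count_x * count_y)
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


# Hypothetical toy data: four edges, each mapped to the bundle it is drawn in.
unbundled = [(0, "a"), (1, "b"), (2, "c"), (3, "d")]    # every edge drawn separately
two_bundles = [(0, "a"), (1, "a"), (2, "b"), (3, "b")]  # edges merged pairwise
one_bundle = [(0, "a"), (1, "a"), (2, "a"), (3, "a")]   # all edges merged

print(mutual_information(unbundled))    # 2.0
print(mutual_information(two_bundles))  # 1.0
print(mutual_information(one_bundle))   # 0.0
```

In this toy model, heavier bundling lowers the mutual information: the drawing is less cluttered, but it identifies individual edges less precisely, which is the kind of ambiguity the abstract mentions.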


    A tremendous increase in the scale of graphs has been witnessed in a wide range of fields, which demands efficient and effective visualization techniques to help users better understand large graphs. Conventional node-link diagrams are often used to visualize graphs, but excessive edge crossings can easily cause severe visual clutter in the node-link diagram of a large graph. Edge bundling can effectively remedy visual clutter and reveal high-level graph structures. Although significant effort has been devoted to developing edge bundling, three challenging problems remain. First, edge bundling techniques are often computationally expensive and are not easy to deploy for web-based applications. State-of-the-art edge bundling methods often require special system support and techniques such as high-end GPU acceleration for large graphs, which makes these methods less portable, especially for ubiquitous mobile devices. Second, the quality of edge bundling results is rarely assessed quantitatively in the literature. Currently, comparisons of edge bundling methods mainly focus on computational performance and perceptual results. Third, although the family of edge bundling techniques offers a rich set of bundling layouts, there is no generic method for generating different styles of edge bundling. In this research, I aim to address these problems and have made the following contributions. First, I provide an efficient framework to deploy edge bundling on web-based platforms by exploiting standard graphics hardware functions and libraries. My framework can generate high-quality edge bundling results on web-based platforms and achieves a 50X speedup over the previous state-of-the-art edge bundling method on a graph with half a million edges. Second, I propose a new approach based on moving least squares to lower the algorithmic complexity of edge bundling. 
In addition, my approach can generate better bundling results than other methods according to a quality metric. Third, I provide a metric grounded in information theory to evaluate edge bundling methods. With this information-theoretic metric, domain users can choose appropriate edge bundling methods with proper parameters for their applications. Last but not least, I present a deep learning framework for edge bundling visualizations. Through a training process that learns the results of a specific edge bundling method, my deep learning framework can infer the final layout of that method. It is a generic framework that can generate the corresponding results of different edge bundling methods. Adviser: Hongfeng Y

    Explanatory visualization of multidimensional projections


    Visualizing multidimensional data similarities: Improvements and applications

    Multidimensional data is increasingly prominent and important in many application domains. Such data typically consists of a large set of elements, each of which is described by several measurements (dimensions). When designing techniques and tools to process this data, a key component is gaining insight into its structure and patterns, which can be described by the notion of similarity between elements. Among these techniques, multidimensional projections and similarity trees can effectively capture similarity patterns and handle a large number of data elements and dimensions. However, understanding and interpreting these patterns in terms of the original data dimensions is still hard. This thesis addresses the development of visual explanatory techniques for the easy interpretation of similarity patterns present in multidimensional projections and similarity trees, through several contributions. First, we propose methods that make the computation of similarity trees efficient for large datasets and that enhance their visual representation to allow the exploration of more data on a limited screen. Second, we propose methods for the visual explanation of multidimensional projections in terms of groups of similar elements. These groups are automatically annotated to describe which dimensions are most important in defining their notion of group similarity. We then show how these explanatory mechanisms can be adapted to handle both static and time-dependent data. Our proposed techniques are designed to be easy to use, work nearly automatically, and are demonstrated on a variety of large real-world datasets obtained from image collections, text archives, scientific measurements, and software engineering.

    Visualisation Support for Biological Bayesian Network Inference

    Extracting valuable information from the visualisation of biological data and turning it into a network model is the main challenge addressed in this thesis. Biological networks are mathematical models that describe biological entities as nodes and their relationships as edges. Because they describe patterns of relationships, networks can show how a biological system works as a whole. However, network inference is a challenging optimisation problem that cannot be solved computationally in polynomial time. Therefore, computational biologists (i.e. modellers) combine clustering and heuristic search algorithms with their tacit knowledge to infer networks. Visualisation can play an important role in supporting them in their network inference workflow. The main research question is: “How can visualisation support modellers in their workflow to infer networks from biological data?” To answer this question, it was necessary to collaborate with computational biologists to understand the challenges in their workflow and form research questions. Following the nested model methodology helped to characterise the domain problem, abstract data and tasks, design effective visualisation components and implement efficient algorithms. Those steps correspond to the four levels of the nested model for collaborating with domain experts to design effective visualisations. We found that visualisation can support modellers in three steps of their workflow: (a) to select variables, (b) to infer a consensus network and (c) to incorporate information about its dynamics. To select variables (a), modellers first apply a hierarchical clustering algorithm which produces a dendrogram (i.e. a tree structure). Then they select a similarity threshold (height) to cut the tree so that branches correspond to clusters. However, applying a single-height similarity threshold is not effective for clustering heterogeneous multidimensional data because clusters may exist at different heights. 
The research question is: Q1 “How to provide visual support for the effective hierarchical clustering of many multidimensional variables?” To answer this question, MLCut, a novel visualisation tool, was developed to enable the application of multiple similarity thresholds. Users can interact with a representation of the dendrogram, which is coordinated with a view of the original multidimensional data, select branches of the tree at different heights and explore different clustering scenarios. Using MLCut in two case studies has shown that this method provides transparency in the clustering process and enables the effective allocation of variables into clusters. Selected variables and clusters constitute nodes in the inferred network. In the second step (b), modellers apply heuristic search algorithms which sample a solution space consisting of all possible networks. The result of each execution of the algorithm is a collection of high-scoring Bayesian networks. The task is to guide the heuristic search and help construct a consensus network. However, this is challenging because many network results contain different scores produced by different executions of the algorithm. The research question is: Q2 “How to support the visual analysis of heuristic search results, to infer representative models for biological systems?” BayesPiles, a novel interactive visual analytics tool, was developed and evaluated in three case studies to help modellers explore, combine and compare results, understand the structure of the solution space and construct a consensus network. As part of the third step (c), when the biological data contain measurements over time, heuristics can also infer information about the dynamics of the interactions, encoded as different types of edges in the inferred networks. However, representing such multivariate networks is a challenging visualisation problem. 
The research question is: Q3 “How to effectively represent information related to the dynamics of biological systems, encoded in the edges of inferred networks?” To help modellers explore their results and to answer Q3, a human-centred crowdsourcing experiment took place to evaluate the effectiveness of four visual encodings for multiple edge types in matrices. The design of the tested encodings combines three visual variables: position, orientation, and colour. The study showed that orientation outperforms position and that colour is helpful in most tasks. The results informed an extension to the design of BayesPiles, which modellers evaluated while exploring dynamic Bayesian networks. The feedback of most participants confirmed the results of the crowdsourcing experiment. This thesis focuses on the investigation, design, and application of visualisation approaches for gaining insights from biological data to infer network models. It shows how visualisation can help modellers in their workflow to select variables, to construct representative network models and to explore their different types of interactions, contributing to a better understanding of how biological processes within living organisms work.
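To make step (a) of the abstract above concrete, here is a minimal sketch of why a single-height cut can miss clusters that exist at different heights. It is not MLCut itself: the `Node` class, the `cut` function and the toy dendrogram are hypothetical and written only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """A dendrogram node; leaves carry a label and have height 0."""
    height: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    label: Optional[str] = None


def leaves(node: Node) -> List[str]:
    """Collect the leaf labels under a node, left to right."""
    if node.label is not None:
        return [node.label]
    return leaves(node.left) + leaves(node.right)


def cut(node: Node, threshold: float) -> List[List[str]]:
    """Single-height cut: each maximal subtree merged at or below
    the threshold becomes one cluster."""
    if node.label is not None or node.height <= threshold:
        return [leaves(node)]
    return cut(node.left, threshold) + cut(node.right, threshold)


def leaf(name: str) -> Node:
    return Node(height=0.0, label=name)


# Toy dendrogram (assumed data): tight clusters on the left, a loose one on the right.
ab = Node(0.5, leaf("a"), leaf("b"))
cd = Node(0.6, leaf("c"), leaf("d"))
ef = Node(2.5, leaf("e"), leaf("f"))
tree = Node(5.0, Node(2.0, ab, cd), ef)

print(cut(tree, 1.0))  # [['a', 'b'], ['c', 'd'], ['e'], ['f']]  (splits e and f)
print(cut(tree, 3.0))  # [['a', 'b', 'c', 'd'], ['e', 'f']]      (merges ab with cd)

# Applying a different threshold to each branch recovers all three clusters.
print(cut(tree.left, 1.0) + cut(tree.right, 3.0))  # [['a', 'b'], ['c', 'd'], ['e', 'f']]
```

No single threshold yields the three clusters {a, b}, {c, d} and {e, f} for this tree, whereas choosing thresholds per branch does, which is the situation the abstract describes as motivating multiple similarity thresholds.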

    Multidimensional projections for the visual exploration of multimedia data

    Multidimensional data analysis is of considerable importance when dealing with large and complex datasets. Among the options for analyzing this kind of data, visualization techniques can help the user find and understand patterns and trends and establish new goals. This thesis presents several visualization methods for interactively exploring multidimensional datasets, aimed at both specialized and casual users, that make use of static and dynamic representations created by multidimensional projections.

    Explanatory visualization of multidimensional projections

    Insight into large data collections (nowadays known as ‘big data’) can be gained by depicting them visually and then exploring these visualizations interactively. However, both the number of data points or measurements and the number of dimensions describing each measurement can be very large, as in a table with many rows and columns. Visualizing such so-called high-dimensional datasets is very challenging. One way to do this is to create a low-dimensional (two- or three-dimensional) depiction, in which one then searches for interesting data patterns instead of searching for them in the original high-dimensional data. Techniques that support this scenario, the so-called projections, have several advantages: they are visually scalable, they work robustly with noisy data, and they are fast. Yet the use of projections is severely limited by the fact that they are difficult to interpret. We approach this problem by developing several techniques that facilitate interpretation, such as displaying projection errors and explaining projections in terms of the original high dimensions. Our techniques are easy to learn, fast to compute, and easy to add to any data exploration scenario that uses any projection. We demonstrate our solutions with various applications and with data from measurements, scientific simulations, software engineering, and networks.