Parallel Hierarchies: Interactive Visualization of Multidimensional Hierarchical Aggregates

Abstract

Exploring multi-dimensional hierarchical data is a long-standing problem present in a wide range of fields such as bioinformatics, software systems, social sciences and business intelligence. While each hierarchical dimension within these data structures can be explored in isolation, critical information lies in the relationships between dimensions. Existing approaches can either simultaneously visualize multiple non-hierarchical dimensions, or only one or two hierarchical dimensions. Yet, the challenge of visualizing multi-dimensional hierarchical data remains open. To address this problem, we developed a novel data visualization approach -- Parallel Hierarchies -- that we demonstrate on a real-life SAP SE product called SAP Product Lifecycle Costing. The starting point of the research is a thorough customer-driven requirement engineering phase including an iterative design process. To avoid restricting ourselves to a domain-specific solution, we abstract the data and tasks gathered from users, and demonstrate the approach generality by applying Parallel Hierarchies to datasets from bioinformatics and social sciences. Moreover, we report on a qualitative user study conducted in an industrial scenario with 15 experts from 9 different companies. As a result of this co-innovation experience, several SAP customers requested a product feature out of our solution. Moreover, Parallel Hierarchies integration as a standard diagram type into SAP Analytics Cloud platform is in progress. This thesis further introduces different uncertainty representation methods applicable to Parallel Hierarchies and in general to flow diagrams. We also present a visual comparison taxonomy for time-series of hierarchically structured data with one or multiple dimensions. Moreover, we propose several visual solutions for comparing hierarchies employing flow diagrams. Finally, after presenting two application examples of Parallel Hierarchies on industrial datasets, we detail two validation methods to examine the effectiveness of the visualization solution. Particularly, we introduce a novel design validation table to assess the perceptual aspects of eight different visualization solutions including Parallel Hierarchies.:1 Introduction 1.1 Motivation and Problem Statement 1.2 Research Goals 1.3 Outline and Contributions 2 Foundations of Visualization 2.1 Information Visualization 2.1.1 Terms and Definition 2.1.2 What: Data Structures 2.1.3 Why: Visualization Tasks 2.1.4 How: Visualization Techniques 2.1.5 How: Interaction Techniques 2.2 Visual Perception 2.2.1 Visual Variables 2.2.2 Attributes of Preattentive and Attentive Processing 2.2.3 Gestalt Principles 2.3 Flow Diagrams 2.3.1 Classifications of Flow Diagrams 2.3.2 Main Visual Features 2.4 Summary 3 Related Work 3.1 Cross-tabulating Hierarchical Categories 3.1.1 Visualizing Categorical Aggregates of Item Sets 3.1.2 Hierarchical Visualization of Categorical Aggregates 3.1.3 Visualizing Item Sets and Their Hierarchical Properties 3.1.4 Hierarchical Visualization of Categorical Set Aggregates 3.2 Uncertainty Visualization 3.2.1 Uncertainty Taxonomies 3.2.2 Uncertainty in Flow Diagrams 3.3 Time-Series Data Visualization 3.3.1 Time & Data 3.3.2 User Tasks 3.3.3 Visual Representation 3.4 Summary ii Contents 4 Requirement Engineering Phase 4.1 Introduction 4.2 Environment 4.2.1 The Product 4.2.2 The Customers and Development Methodology 4.2.3 Lessons Learned 4.3 Visualization Requirements for Product Costing 4.3.1 Current Visualization Practice 4.3.2 Visualization Tasks 4.3.3 Data Structure and Size 4.3.4 Early Visualization Prototypes 4.3.5 Challenges and Lessons Learned 4.4 Data and Task Abstraction 4.4.1 Data Abstraction 4.4.2 Task Abstraction 4.5 Summary and Outlook 5 Parallel Hierarchies 5.1 Introduction 5.2 The Parallel Hierarchies Technique 5.2.1 The Individual Axis: Showing Hierarchical Categories 5.2.2 Two Interlinked Axes: Showing Pairwise Frequencies 5.2.3 Multiple Linked Axes: Propagating Frequencies 5.2.4 Fine-tuning Parallel Hierarchies through Reordering 5.3 Design Choices 5.4 Applying Parallel Hierarchies 5.4.1 US Census Data 5.4.2 Yeast Gene Ontology Annotations 5.5 Evaluation 5.5.1 Setup of the Evaluation 5.5.2 Procedure of the Evaluation 5.5.3 Results from the Evaluation 5.5.4 Validity of the Evaluation 5.6 Summary and Outlook 6 Visualizing Uncertainty in Flow Diagrams 6.1 Introduction 6.2 Uncertainty in Product Costing 6.2.1 Background 6.2.2 Main Causes of Bad Quality in Costing Data 6.3 Visualization Concepts 6.4 Uncertainty Visualization using Ribbons 6.4.1 Selected Visualization Techniques 6.4.2 Study Design and Procedure 6.4.3 Results 6.4.4 Discussion 6.5 Revised Visualization Approach using Ribbons 6.5.1 Application to Sankey Diagram 6.5.2 Application to Parallel Sets 6.5.3 Application to Parallel Hierarchies 6.6 Uncertainty Visualization using Nodes 6.6.1 Visual Design of Nodes 6.6.2 Expert Evaluation 6.7 Summary and Outlook 7 Visual Comparison Task 7.1 Introduction 7.2 Comparing Two One-dimensional Time Steps 7.2.1 Problem Statement 7.2.2 Visualization Design 7.3 Comparing Two N-dimensional Time Steps 7.4 Comparing Several One-dimensional Time Steps 7.5 Summary and Outlook 8 Parallel Hierarchies in Practice 8.1 Application to Plausibility Check Task 8.1.1 Plausibility Check Process 8.1.2 Visual Exploration of Machine Learning Results 8.2 Integration into SAP Analytics Cloud 8.2.1 SAP Analytics Cloud 8.2.2 Ocean to Table Project 8.3 Summary and Outlook 9 Validation 9.1 Introduction 9.2 Nested Model Validation Approach 9.3 Perceptual Validation of Visualization Techniques 9.3.1 Design Validation Table 9.3.2 Discussion 9.4 Summary and Outlook 10 Conclusion and Outlook 10.1 Summary of Findings 10.2 Discussion 10.3 Outlook A Questionnaires of the Evaluation B Survey of the Quality of Product Costing Data C Questionnaire of Current Practice Bibliograph

    Similar works