Search CORE

23,038 research outputs found

대용량 데이터 탐색을 위한 점진적 시각화 시스템 설계

Author: 조재민
Publication venue: 서울대학교 대학원
Publication date: 01/02/2020
Field of study

학위논문(박사)--서울대학교 대학원 :공과대학 컴퓨터공학부,2020. 2. 서진욱.Understanding data through interactive visualization, also known as visual analytics, is a common and necessary practice in modern data science. However, as data sizes have increased at unprecedented rates, the computation latency of visualization systems becomes a significant hurdle to visual analytics. The goal of this dissertation is to design a series of systems for progressive visual analytics (PVA)—a visual analytics paradigm that can provide intermediate results during computation and allow visual exploration of these results—to address the scalability hurdle. To support the interactive exploration of data with billions of records, we first introduce SwiftTuna, an interactive visualization system with scalable visualization and computation components. Our performance benchmark demonstrates that it can handle data with four billion records, giving responsive feedback every few seconds without precomputation. Second, we present PANENE, a progressive algorithm for the Approximate k-Nearest Neighbor (AKNN) problem. PANENE brings useful machine learning methods into visual analytics, which has been challenging due to their long initial latency resulting from AKNN computation. In particular, we accelerate t-Distributed Stochastic Neighbor Embedding (t-SNE), a popular non-linear dimensionality reduction technique, which enables the responsive visualization of data with a few hundred columns. Each of these two contributions aims to address the scalability issues stemming from a large number of rows or columns in data, respectively. Third, from the users' perspective, we focus on improving the trustworthiness of intermediate knowledge gained from uncertain results in PVA. We propose a novel PVA concept, Progressive Visual Analytics with Safeguards, and introduce PVA-Guards, safeguards people can leave on uncertain intermediate knowledge that needs to be verified. We also present a proof-of-concept system, ProReveal, designed and developed to integrate seven safeguards into progressive data exploration. Our user study demonstrates that people not only successfully created PVA-Guards on ProReveal but also voluntarily used PVA-Guards to manage the uncertainty of their knowledge. Finally, summarizing the three studies, we discuss design challenges for progressive systems as well as future research agendas for PVA.현대 데이터 사이언스에서 인터랙티브한 시각화를 통해 데이터를 이해하는 것은 필수적인 분석 방법 중 하나이다. 그러나, 최근 데이터의 크기가 폭발적으로 증가하면서 데이터 크기로 인해 발생하는 지연 시간이 인터랙티브한 시각적 분석에 큰 걸림돌이 되었다. 본 연구에서는 이러한 확장성 문제를 해결하기 위해 점진적 시각적 분석(Progressive Visual Analytics)을 지원하는 일련의 시스템을 디자인하고 개발한다. 이러한 점진적 시각적 분석 시스템은 데이터 처리가 완전히 끝나지 않더라도 중간 분석 결과를 사용자에게 제공함으로써 데이터의 크기로 인해 발생하는 지연 시간 문제를 완화할 수 있다. 첫째로, 수십억 건의 행을 가지는 데이터를 시각적으로 탐색할 수 있는 SwiftTuna 시스템을 제안한다. 데이터 처리 및 시각적 표현의 확장성을 목표로 개발된 이 시스템은, 약 40억 건의 행을 가진 데이터에 대한 시각화를 전처리 없이 수 초마다 업데이트할 수 있는 것으로 나타났다. 둘째로, 근사적 k-최근접점(Approximate k-Nearest Neighbor) 문제를 점진적으로 계산하는 PANENE 알고리즘을 제안한다. 근사적 k-최근접점 문제는 여러 기계 학습 기법에서 쓰임에도 불구하고 초기 계산 시간이 길어서 인터랙티브한 시스템에 적용하기 힘든 한계가 있었다. PANENE 알고리즘은 이러한 긴 초기 계산 시간을 획기적으로 개선하여 다양한 기계 학습 기법을 시각적 분석에 활용할 수 있도록 한다. 특히, 유용한 비선형적 차원 감소 기법인 t-분포 확률적 임베딩(t-Distributed Stochastic Neighbor Embedding)을 가속하여 수백 개의 차원을 가지는 데이터를 빠른 시간 내에 사영할 수 있다. 위의 두 시스템과 알고리즘이 데이터의 행 또는 열의 개수로 인한 확장성 문제를 해결하고자 했다면, 세 번째 시스템에서는 점진적 시각적 분석의 신뢰도 문제를 개선하고자 한다. 점진적 시각적 분석에서 사용자에게 주어지는 중간 계산 결과는 최종 결과의 근사치이므로 불확실성이 존재한다. 본 연구에서는 세이프가드를 이용한 점진적 시각적 분석(Progressive Visual Analytics with Safeguards)이라는 새로운 개념을 제안한다. 이 개념은 사용자가 점진적 탐색에서 마주하는 불확실한 중간 지식에 세이프가드를 남길 수 있도록 하여 탐색에서 얻은 지식의 정확도를 추후 검증할 수 있도록 한다. 또한, 이러한 개념을 실제로 구현하여 탑재한 ProReveal 시스템을 소개한다. ProReveal를 이용한 사용자 실험에서 사용자들은 세이프가드를 성공적으로 만들 수 있었을 뿐만 아니라, 중간 지식의 불확실성을 다루기 위해 세이프가드를 자발적으로 이용한다는 것을 알 수 있었다. 마지막으로, 위 세 가지 연구의 결과를 종합하여 점진적 시각적 분석 시스템을 구현할 때의 디자인적 난제와 향후 연구 방향을 모색한다.CHAPTER1. Introduction 2 1.1 Background and Motivation 2 1.2 Thesis Statement and Research Questions 5 1.3 Thesis Contributions 5 1.3.1 Responsive and Incremental Visual Exploration of Large-scale Multidimensional Data 6 1.3.2 ProgressiveComputation of Approximate k-Nearest Neighbors and Responsive t-SNE 7 1.3.3 Progressive Visual Analytics with Safeguards 8 1.4 Structure of Dissertation 9 CHAPTER2. Related Work 11 2.1 Progressive Visual Analytics 11 2.1.1 Definitions 11 2.1.2 System Latency and Human Factors 13 2.1.3 Users, Tasks, and Models 15 2.1.4 Techniques, Algorithms, and Systems. 17 2.1.5 Uncertainty Visualization 19 2.2 Approaches for Scalable Visualization Systems 20 2.3 The k-Nearest Neighbor (KNN) Problem 22 2.4 t-Distributed Stochastic Neighbor Embedding 26 CHAPTER3. SwiTuna: Responsive and Incremental Visual Exploration of Large-scale Multidimensional Data 28 3.1 The SwiTuna Design 31 3.1.1 Design Considerations 32 3.1.2 System Overview 33 3.1.3 Scalable Visualization Components 36 3.1.4 Visualization Cards 40 3.1.5 User Interface and Interaction 42 3.2 Responsive Querying 44 3.2.1 Querying Pipeline 44 3.2.2 Prompt Responses 47 3.2.3 Incremental Processing 47 3.3 Evaluation: Performance Benchmark 49 3.3.1 Study Design 49 3.3.2 Results and Discussion 52 3.4 Implementation 56 3.5 Summary 56 CHAPTER4. PANENE:AProgressive Algorithm for IndexingandQuerying Approximate k-Nearest Neighbors 58 4.1 Approximate k-Nearest Neighbor 61 4.1.1 A Sequential Algorithm 62 4.1.2 An Online Algorithm 63 4.1.3 A Progressive Algorithm 66 4.1.4 Filtered AKNN Search 71 4.2 k-Nearest Neighbor Lookup Table 72 4.3 Benchmark. 78 4.3.1 Online and Progressive k-d Trees 78 4.3.2 k-Nearest Neighbor Lookup Tables 83 4.4 Applications 85 4.4.1 Progressive Regression and Density Estimation 85 4.4.2 Responsive t-SNE 87 4.5 Implementation 92 4.6 Discussion 92 4.7 Summary 93 CHAPTER5. ProReveal: Progressive Visual Analytics with Safeguards 95 5.1 Progressive Visual Analytics with Safeguards 98 5.1.1 Definition 98 5.1.2 Examples 101 5.1.3 Design Considerations 103 5.2 ProReveal 105 5.3 Evaluation 121 5.4 Discussion 127 5.5 Summary 130 CHAPTER6. Discussion 132 6.1 Lessons Learned 132 6.2 Limitations 135 CHAPTER7. Conclusion 137 7.1 Thesis Contributions Revisited 137 7.2 Future Research Agenda 139 7.3 Final Remarks 141 Abstract (Korean) 155 Acknowledgments (Korean) 157Docto

SNU Open Repository and Archive

Visual analytics for supply network management: system design and evaluation

Author: Basole Rahul C.
Bellamy Marcus A.
Park Hyunwoo
Publication venue: 'Elsevier BV'
Publication date: 01/11/2016
Field of study

We propose a visual analytic system to augment and enhance decision-making processes of supply chain managers. Several design requirements drive the development of our integrated architecture and lead to three primary capabilities of our system prototype. First, a visual analytic system must integrate various relevant views and perspectives that highlight different structural aspects of a supply network. Second, the system must deliver required information on-demand and update the visual representation via user-initiated interactions. Third, the system must provide both descriptive and predictive analytic functions for managers to gain contingency intelligence. Based on these capabilities we implement an interactive web-based visual analytic system. Our system enables managers to interactively apply visual encodings based on different node and edge attributes to facilitate mental map matching between abstract attributes and visual elements. Grounded in cognitive fit theory, we demonstrate that an interactive visual system that dynamically adjusts visual representations to the decision environment can significantly enhance decision-making processes in a supply network setting. We conduct multi-stage evaluation sessions with prototypical users that collectively confirm the value of our system. Our results indicate a positive reaction to our system. We conclude with implications and future research opportunities.The authors would like to thank the participants of the 2015 Businessvis Workshop at IEEE VIS, Prof. Benoit Montreuil, and Dr. Driss Hakimi for their valuable feedback on an earlier version of the software; Prof. Manpreet Hora for assisting with and Georgia Tech graduate students for participating in the evaluation sessions; and the two anonymous reviewers for their detailed comments and suggestions. The study was in part supported by the Tennenbaum Institute at Georgia Tech Award # K9305. (K9305 - Tennenbaum Institute at Georgia Tech Award)Accepted manuscrip

Boston University Institutional Repository (OpenBU)

Using visual analytics to develop situation awareness in astrophysics

Author: Aldering Gregory S.
Aragon Cecilia R.
Poon Sarah S.
Quimby Robert
Thomas Rollin C.
Publication venue: Palgrave Macmillan
Publication date: 01/01/2009
Field of study

We present a novel collaborative visual analytics application for cognitively overloaded users in the astrophysics domain. The system was developed for scientists who need to analyze heterogeneous, complex data under time pressure, and make predictions and time-critical decisions rapidly and correctly under a constant influx of changing data. The Sunfall Data Taking system utilizes several novel visualization and analysis techniques to enable a team of geographically distributed domain specialists to effectively and remotely maneuver a custom-built instrument under challenging operational conditions. Sunfall Data Taking has been in production use for 2 years by a major international astrophysics collaboration (the largest data volume supernova search currently in operation), and has substantially improved the operational efficiency of its users. We describe the system design process by an interdisciplinary team, the system architecture and the results of an informal usability evaluation of the production system by domain experts in the context of Endsley's three levels of situation awareness

Caltech Authors

DPVis: Visual Analytics with Hidden Markov Models for Disease Progression Pathways

Author: Anand Vibha
Frohnert Brigitte I
Ghosh Soumya
Kwon Bum Chul
Lundgren Markus
Ng Kenney
Severson Kristen A
Sun Zhaonan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 07/04/2020
Field of study

Clinical researchers use disease progression models to understand patient status and characterize progression patterns from longitudinal health records. One approach for disease progression modeling is to describe patient status using a small number of states that represent distinctive distributions over a set of observed measures. Hidden Markov models (HMMs) and its variants are a class of models that both discover these states and make inferences of health states for patients. Despite the advantages of using the algorithms for discovering interesting patterns, it still remains challenging for medical experts to interpret model outputs, understand complex modeling parameters, and clinically make sense of the patterns. To tackle these problems, we conducted a design study with clinical scientists, statisticians, and visualization experts, with the goal to investigate disease progression pathways of chronic diseases, namely type 1 diabetes (T1D), Huntington's disease, Parkinson's disease, and chronic obstructive pulmonary disease (COPD). As a result, we introduce DPVis which seamlessly integrates model parameters and outcomes of HMMs into interpretable and interactive visualizations. In this study, we demonstrate that DPVis is successful in evaluating disease progression models, visually summarizing disease states, interactively exploring disease progression patterns, and building, analyzing, and comparing clinically relevant patient subgroups.Comment: to appear at IEEE Transactions on Visualization and Computer Graphic

arXiv.org e-Print Archive

Lund University Publications

A Review and Characterization of Progressive Visual Analytics

Author: Angelini Marco
Santucci Giuseppe
Schulz Hans-Jörg
Schumann Heidrun
Publication venue: 'MDPI AG'
Publication date: 01/01/2018
Field of study

Progressive Visual Analytics (PVA) has gained increasing attention over the past years. It brings the user into the loop during otherwise long-running and non-transparent computations by producing intermediate partial results. These partial results can be shown to the user for early and continuous interaction with the emerging end result even while it is still being computed. Yet as clear-cut as this fundamental idea seems, the existing body of literature puts forth various interpretations and instantiations that have created a research domain of competing terms, various definitions, as well as long lists of practical requirements and design guidelines spread across different scientific communities. This makes it more and more difficult to get a succinct understanding of PVA’s principal concepts, let alone an overview of this increasingly diverging field. The review and discussion of PVA presented in this paper address these issues and provide (1) a literature collection on this topic, (2) a conceptual characterization of PVA, as well as (3) a consolidated set of practical recommendations for implementing and using PVA-based visual analytics solutions

Directory of Open Access Journals

Archivio della ricerca- Università di Roma La Sapienza

Recommended from our members

Towards a Theory of Analytical Behaviour: A Model of Decision-Making in Visual Analytics

Author: Booth P.
Galanis S.
Gibbins N.
Publication venue: University of Hawaii at Manoa
Publication date: 01/01/2019
Field of study

This paper introduces a descriptive model of the human-computer processes that lead to decision-making in visual analytics. A survey of nine models from the visual analytics and HCI literature are presented to account for different perspectives such as sense-making, reasoning, and low-level human-computer interactions. The survey examines the people and computers (entities) presented in the models, the divisions of labour between entities (both physical and role-based), the behaviour of both people and machines as constrained by their roles and agency, and finally the elements and processes which define the flow of data both within and between entities. The survey informs the identification of four observations that characterise analytical behaviour - defined as decision-making facilitated by visual analytics: bilateral discourse, divisions of labour, mixed-synchronicity information flows, and bounded behaviour. Based on these principles, a descriptive model is presented as a contribution towards a theory of analytical behaviour. The future intention is to apply prospect theory, a economic model of decision-making under uncertainty, to the study of analytical behaviour. It is our assertion that to apply prospect theory first requires a descriptive model of the processes that facilitate decision-making in visual analytics. We conclude it necessary to measure the perception of risk in future work in order to apply prospect theory to the study of analytical behaviour using our proposed model

City Research Online

Crossref

ScholarSpace at University of Hawai'i at Manoa

AIS Electronic Library (AISeL)