What Exactly is an Insight? A Literature Review
Insights are often considered the ideal outcome of visual analysis sessions.
However, there is no single definition of what an insight is. Some scholars
define insights as correlations, while others define them as hypotheses or aha
moments. This lack of a clear definition can make it difficult to build
visualization tools that effectively support insight discovery. In this paper,
we contribute a comprehensive literature review that maps the landscape of
existing insight definitions. We summarize key themes regarding how insight is
defined, with the goal of helping readers identify which definitions of insight
align closely with their research and tool development goals. Based on our
review, we also suggest interesting research directions, such as synthesizing a
unified formalism for insight and connecting theories of insight to other
critical concepts in visualization research.
Comment: Technical report. arXiv admin note: text overlap with arXiv:2206.0476
Dynamic Prefetching of Data Tiles for Interactive Visualization
In this paper, we present ForeCache, a general-purpose tool for exploratory browsing of large datasets. ForeCache utilizes a client-server architecture, where the user interacts with a lightweight client-side interface to browse datasets, and the data to be browsed is retrieved from a DBMS running on a back-end server. We assume a detail-on-demand browsing paradigm, and optimize the back-end support for this paradigm by inserting a separate middleware layer in front of the DBMS. To improve response times, the middleware layer fetches data ahead of the user as she explores a dataset. We consider two different mechanisms for prefetching: (a) learning what to fetch from the user's recent movements, and (b) using data characteristics (e.g., histograms) to find data similar to what the user has viewed in the past. We incorporate these mechanisms into a single prediction engine that adjusts its prediction strategies over time, based on changes in the user's behavior. We evaluated our prediction engine with a user study, and found that our dynamic prefetching strategy provides: (1) significant improvements in overall latency when compared with non-prefetching systems (430% improvement); and (2) substantial improvements in both prediction accuracy (25% improvement) and latency (88% improvement) relative to existing prefetching techniques.
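The two prefetching mechanisms described above can be sketched in miniature. The following is a minimal illustrative sketch, not ForeCache's actual implementation: all class and method names are invented here. One predictor extrapolates the user's recent panning direction, and a combining engine shifts weight toward whichever predictor has recently been correct, mirroring the adaptive prediction engine the abstract describes.

```python
from collections import deque

class MomentumPredictor:
    """Assumes the user keeps panning in their recent direction."""
    def __init__(self, history=3):
        self.moves = deque(maxlen=history)

    def observe(self, prev_tile, cur_tile):
        # Record the (dx, dy) step between consecutive tile requests.
        self.moves.append((cur_tile[0] - prev_tile[0],
                           cur_tile[1] - prev_tile[1]))

    def predict(self, tile):
        if not self.moves:
            return []
        dx = round(sum(m[0] for m in self.moves) / len(self.moves))
        dy = round(sum(m[1] for m in self.moves) / len(self.moves))
        return [(tile[0] + dx, tile[1] + dy)]

class AdaptivePrefetcher:
    """Blends several predictors; weight shifts toward recent winners."""
    def __init__(self, predictors):
        self.predictors = list(predictors)
        self.weights = [1.0] * len(self.predictors)
        self._last = [[] for _ in self.predictors]

    def candidates(self, tile, k=2):
        # Each predictor votes for tiles, weighted by its recent accuracy.
        scored = {}
        for i, p in enumerate(self.predictors):
            self._last[i] = p.predict(tile)
            for t in self._last[i]:
                scored[t] = scored.get(t, 0.0) + self.weights[i]
        return sorted(scored, key=scored.get, reverse=True)[:k]

    def feedback(self, actual_tile):
        # Reward predictors whose last guesses included the real request.
        for i, preds in enumerate(self._last):
            delta = 0.1 if actual_tile in preds else -0.05
            self.weights[i] = max(self.weights[i] + delta, 0.1)
```

A data-characteristics predictor (mechanism (b) in the abstract) would plug into the same `AdaptivePrefetcher` interface, ranking neighboring tiles by histogram similarity to recently viewed ones.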
Interactive visualization of big data leveraging databases for scalable computation
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2013. Cataloged from PDF version of thesis. Includes bibliographical references (pages 55-57).
Modern database management systems (DBMS) have been designed to efficiently store, manage and perform computations on massive amounts of data. In contrast, many existing visualization systems do not scale seamlessly from small data sets to enormous ones. We have designed a three-tiered visualization system called ScalaR to deal with this issue. ScalaR dynamically performs resolution reduction when the expected result of a DBMS query is too large to be effectively rendered on existing screen real estate. Instead of running the original query, ScalaR inserts aggregation, sampling or filtering operations to reduce the size of the result. This thesis presents the design and implementation of ScalaR, and shows results for two example applications, visualizing earthquake records and satellite imagery data, stored in SciDB as the back-end DBMS.
by Leilani Marie Battle. S.M.
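The resolution-reduction idea above, rewriting a query rather than running it verbatim when the result would overwhelm the screen, can be sketched as follows. This is an illustrative sketch under assumed names (`reduce_resolution` and its parameters are not from ScalaR, and real SciDB queries use its own array language rather than this generic SQL):

```python
def reduce_resolution(query, estimated_rows, max_rows, strategy="sample"):
    """Rewrite `query` so its estimated result fits within `max_rows`."""
    if estimated_rows <= max_rows:
        return query  # small enough to render as-is
    if strategy == "sample":
        # Wrap the original query and sample rows down to the budget.
        fraction = max_rows / estimated_rows
        return (f"SELECT * FROM ({query}) AS q "
                f"WHERE random() < {fraction:.6f}")
    if strategy == "limit":
        # Crude truncation: keep only the first max_rows rows.
        return f"SELECT * FROM ({query}) AS q LIMIT {max_rows}"
    raise ValueError(f"unknown strategy: {strategy}")
```

An aggregation strategy (e.g., binning points into a grid matched to the available pixels) would follow the same pattern: estimate the result size first, and only rewrite when the estimate exceeds the rendering budget.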
How Do Data Science Workers Communicate Intermediate Results?
Data science workers increasingly collaborate on large-scale projects before
communicating insights to a broader audience in the form of visualization.
While prior work has modeled how data science teams, oftentimes with distinct
roles and work processes, communicate knowledge to outside stakeholders, we
have little knowledge of how data science workers communicate intermediate
results before delivering the final products. In this work, we contribute a nuanced
description of the intermediate communication process within data science
teams. By analyzing interview data with 8 self-identified data science workers,
we characterized the data science intermediate communication process with four
factors, including the types of audience, communication goals, shared
artifacts, and mode of communication. We also identified overarching challenges in the current communication process and discussed design implications that might inform better tools for facilitating intermediate communication within data science teams.
Comment: This paper was accepted for presentation as part of the eighth Symposium on Visualization in Data Science (VDS) at ACM KDD 2022 as well as IEEE VIS 2022. http://www.visualdatascience.org/2022/index.htm
Toward a Scalable Census of Dashboard Designs in the Wild: A Case Study with Tableau Public
Dashboards remain ubiquitous artifacts for presenting or reasoning with data
across different domains. Yet, there has been little work that provides a
quantifiable, systematic, and descriptive overview of dashboard designs at
scale. We propose a schematic representation of dashboard designs as node-link
graphs to better understand their spatial and interactive structures. We apply
our approach to a dataset of 25,620 dashboards curated from Tableau Public to
provide a descriptive overview of the core building blocks of dashboards in the
wild and derive common dashboard design patterns. To guide future research, we
make our dashboard corpus publicly available and discuss its application toward
the development of dashboard design tools.Comment: *J. Purich and A. Srinivasan contributed equally to the wor
A provenance task abstraction framework
Visual analytics tools integrate provenance recording to externalize analytic processes or user insights. Provenance can be captured on varying levels of detail, and in turn activities can be characterized from different granularities. However, current approaches do not support inferring activities that can only be characterized across multiple levels of provenance. We propose a task abstraction framework that consists of a three-stage approach, composed of (1) initializing a provenance task hierarchy, (2) parsing the provenance hierarchy by using an abstraction mapping mechanism, and (3) leveraging the task hierarchy in an analytical tool. Furthermore, we identify implications to accommodate iterative refinement, context, variability, and uncertainty during all stages of the framework. A use case exemplifies our abstraction framework, demonstrating how context can influence the provenance hierarchy to support analysis. The paper concludes with an agenda, raising and discussing challenges that need to be considered for successfully implementing such a framework.
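The second stage above, parsing low-level provenance with an abstraction mapping, can be illustrated with a small sketch. This is an assumed toy implementation, not the paper's mechanism: patterns of consecutive low-level events are collapsed into higher-level task labels, and unmatched events pass through unchanged.

```python
def abstract_tasks(events, mapping):
    """Collapse runs of low-level provenance events into abstract tasks.

    `events`  -- ordered list of low-level event names
    `mapping` -- dict from a tuple of event names to an abstract task label
    """
    tasks, i = [], 0
    while i < len(events):
        matched = False
        for pattern, label in mapping.items():
            n = len(pattern)
            if tuple(events[i:i + n]) == pattern:
                tasks.append(label)  # abstraction applies: emit the task
                i += n
                matched = True
                break
        if not matched:
            tasks.append(events[i])  # no mapping applies; keep event as-is
            i += 1
    return tasks
```

Applying such a mapping repeatedly, with each pass producing the input for a coarser mapping, yields the kind of multi-level task hierarchy the framework's first stage initializes.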
A novel approach to task abstraction to make better sense of provenance data
Working Group Report from Dagstuhl Seminar 18462: Provenance and Logging for Sense Making, Dagstuhl Reports, Volume 8, Issue 1.