155 research outputs found
rTisane: Externalizing conceptual models for data analysis increases engagement with domain knowledge and improves statistical model quality
Statistical models should accurately reflect analysts' domain knowledge about
variables and their relationships. While recent tools let analysts express
these assumptions and use them to derive a statistical model, it
remains unclear what analysts want to express and how externalization impacts
statistical model quality. This paper addresses these gaps. We first conduct an
exploratory study of analysts using a domain-specific language (DSL) to express
conceptual models. We observe a preference for detailing how variables relate
and a desire to allow, and then later resolve, ambiguity in their conceptual
models. We leverage these findings to develop rTisane, a DSL for expressing
conceptual models augmented with an interactive disambiguation process. In a
controlled evaluation, we find that rTisane's DSL helps analysts engage more
deeply with and accurately externalize their assumptions. rTisane also leads to
statistical models that match analysts' assumptions, maintain analysis intent,
and better fit the data.
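The core idea of compiling an externalized conceptual model into a statistical model can be illustrated with a toy sketch. The variable names and the `to_formula` helper are hypothetical; rTisane itself is an R DSL with a much richer disambiguation process.

```python
# Hypothetical conceptual model: declared causal relationships
# between variables, each a (cause, effect) pair.
causes = [("tutoring", "test_score"), ("sleep", "test_score")]

def to_formula(outcome, relationships):
    """Compile declared relationships into an R-style model formula."""
    predictors = [c for c, e in relationships if e == outcome]
    return f"{outcome} ~ " + " + ".join(predictors)

formula = to_formula("test_score", causes)
# formula == "test_score ~ tutoring + sleep"
```

The point of the externalization step is that the analyst states relationships once, in domain terms, and the model specification is derived from them rather than written by hand.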
Orion: A system for modeling, transformation and visualization of multidimensional heterogeneous networks
The study of complex activities such as scientific production and software development often requires modeling connections among heterogeneous entities including people, institutions and artifacts. Despite numerous advances in algorithms and visualization techniques for understanding such social networks, the process of constructing network models and performing exploratory analysis remains difficult and time-consuming. In this paper we present Orion, a system for interactive modeling, transformation and visualization of network data. Orion's interface enables the rapid manipulation of large graphs, including the specification of complex linking relationships, using simple drag-and-drop operations with desired node types. Orion maps these user interactions to statements in a declarative workflow language that incorporates both relational operators (e.g., selection, aggregation and joins) and network analytics (e.g., centrality measures). We demonstrate how these features enable analysts to flexibly construct and compare networks in domains such as online health communities, academic collaboration and distributed software development.
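The relational view of network construction that Orion exposes, joining heterogeneous records into links and aggregating duplicates into edge weights, can be sketched in a few lines. The authorship table is made up for illustration; Orion's actual workflow language and UI are far more general.

```python
from collections import Counter
from itertools import combinations

# Hypothetical (person, paper) authorship table: a two-mode,
# heterogeneous network of people and artifacts.
authorship = [
    ("ana", "p1"), ("bo", "p1"),
    ("ana", "p2"), ("cy", "p2"),
    ("bo", "p3"), ("cy", "p3"), ("ana", "p3"),
]

# Relational "join" on paper: group co-authors of each paper.
by_paper = {}
for person, paper in authorship:
    by_paper.setdefault(paper, []).append(person)

# Project to a person-person network; "aggregate" repeated
# pairs into edge weights.
edges = Counter()
for people in by_paper.values():
    for a, b in combinations(sorted(people), 2):
        edges[(a, b)] += 1

# A simple network analytic: weighted degree centrality.
degree = Counter()
for (a, b), w in edges.items():
    degree[a] += w
    degree[b] += w
```

The same join-then-aggregate pattern yields many derived networks (co-authorship, shared-institution, person-artifact) from one underlying table, which is what makes comparing alternative network models cheap.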
EVM: Incorporating Model Checking into Exploratory Visual Analysis
Visual analytics (VA) tools support data exploration by helping analysts
quickly and iteratively generate views of data which reveal interesting
patterns. However, these tools seldom enable explicit checks of the resulting
interpretations of data -- e.g., whether patterns can be accounted for by a
model that implies a particular structure in the relationships between
variables. We present EVM, a data exploration tool that enables users to
express and check provisional interpretations of data in the form of
statistical models. EVM integrates support for visualization-based model checks
by rendering distributions of model predictions alongside user-generated views
of data. In a user study with data scientists practicing in the private and
public sector, we evaluate how model checks influence analysts' thinking during
data exploration. Our analysis characterizes how participants use model checks
to scrutinize expectations about the data generating process and surfaces
further opportunities to scaffold model exploration in VA tools.
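A minimal version of the visualization-based model check described above is to fit a provisional model, simulate predictive datasets from it, and compare a summary statistic of the simulations against the observed data. This sketch uses a toy linear model and NumPy; it illustrates the idea only and is not EVM's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "observed" dataset: y depends linearly on x with noise.
x = rng.uniform(0, 10, size=200)
y = 2.0 * x + rng.normal(0, 1.5, size=200)

# Provisional interpretation of the pattern: a linear model y ~ x.
slope, intercept = np.polyfit(x, y, deg=1)
resid_sd = np.std(y - (slope * x + intercept))

# Model check: simulate predictive datasets under the fitted model.
n_draws = 500
sims = slope * x + intercept + rng.normal(0, resid_sd, size=(n_draws, x.size))

# Compare a summary statistic (here, the maximum) of the simulations
# against the observed data; plotting sims alongside y would give the
# visual version of this check.
observed_stat = np.max(y)
sim_stats = sims.max(axis=1)
p_value = np.mean(sim_stats >= observed_stat)
```

If `p_value` is extreme (near 0 or 1), the provisional model fails to account for the observed pattern, which is exactly the kind of discrepancy a rendered model check is meant to surface during exploration.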
ScatterShot: Interactive In-context Example Curation for Text Transformation
The in-context learning capabilities of LLMs like GPT-3 allow annotators to
customize an LLM to their specific tasks with a small number of examples.
However, users tend to include only the most obvious patterns when crafting
examples, resulting in underspecified in-context functions that fall short on
unseen cases. Further, it is hard to know when "enough" examples have been
included even for known patterns. In this work, we present ScatterShot, an
interactive system for building high-quality demonstration sets for in-context
learning. ScatterShot iteratively slices unlabeled data into task-specific
patterns, samples informative inputs from underexplored or not-yet-saturated
slices in an active learning manner, and helps users label more efficiently
with the help of an LLM and the current example set. In simulation studies on
two text perturbation scenarios, ScatterShot sampling improves the resulting
few-shot functions by 4-5 percentage points over random sampling, with less
variance as more examples are added. In a user study, ScatterShot greatly helps
users in covering different patterns in the input space and labeling in-context
examples more efficiently, resulting in better in-context learning and less
user effort.
Comment: IUI 2023: 28th International Conference on Intelligent User Interfaces
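The slice-then-sample loop described above can be illustrated with a toy heuristic: prefer unlabeled examples from slices with the fewest labeled items. The pool, the pre-assigned slice labels, and the `sample_next` policy are all hypothetical; ScatterShot infers slices automatically and uses a more sophisticated sampler and an LLM labeling aid.

```python
from collections import defaultdict

# Hypothetical unlabeled pool for a text-transformation task,
# pre-tagged with a coarse "slice" (pattern) label.
pool = [
    {"text": "resched mtg to fri", "slice": "abbreviation"},
    {"text": "call mom 5pm", "slice": "time"},
    {"text": "email TPS rpt", "slice": "abbreviation"},
    {"text": "dentist 9am tue", "slice": "time"},
    {"text": "buy milk", "slice": "plain"},
]

def sample_next(pool, labeled, k=2):
    """Pick the k candidates from the least-saturated slices:
    an underexplored-slice heuristic, not ScatterShot's exact policy."""
    counts = defaultdict(int)
    for ex in labeled:
        counts[ex["slice"]] += 1
    candidates = [ex for ex in pool if ex not in labeled]
    candidates.sort(key=lambda ex: counts[ex["slice"]])
    return candidates[:k]

labeled = [pool[1]]  # one "time" example already labeled
picks = sample_next(pool, labeled)
# Both picks come from slices other than the saturated "time" slice.
```

Sampling from under-covered slices is what counters the tendency, noted in the abstract, for users to include only the most obvious patterns in their demonstration set.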
How Do Data Analysts Respond to AI Assistance? A Wizard-of-Oz Study
Data analysis is challenging as analysts must navigate nuanced decisions that
may yield divergent conclusions. AI assistants have the potential to support
analysts in planning their analyses, enabling more robust decision making.
Though AI-based assistants that target code execution (e.g., Github Copilot)
have received significant attention, limited research addresses assistance for
both analysis execution and planning. In this work, we characterize helpful
planning suggestions and their impacts on analysts' workflows. We first review
the analysis planning literature and crowd-sourced analysis studies to
categorize suggestion content. We then conduct a Wizard-of-Oz study (n=13) to
observe analysts' preferences and reactions to planning assistance in a
realistic scenario. Our findings highlight subtleties in contextual factors
that impact suggestion helpfulness, emphasizing design implications for
supporting different abstractions of assistance, forms of initiative, increased
engagement, and alignment of goals between analysts and assistants.
Comment: Accepted to CHI 202