Conversational Challenges in AI-Powered Data Science: Obstacles, Needs, and Design Opportunities
Large Language Models (LLMs) are being increasingly employed in data science
for tasks like data preprocessing and analytics. However, data scientists
encounter substantial obstacles when conversing with LLM-powered chatbots and
acting on their suggestions and answers. We conducted a mixed-methods study,
including contextual observations, semi-structured interviews (n=14), and a
survey (n=114), to identify these challenges. Our findings highlight key issues
faced by data scientists, including contextual data retrieval, formulating
prompts for complex tasks, adapting generated code to local environments, and
refining prompts iteratively. Based on these insights, we propose actionable
design recommendations, such as data brushing to support context selection and
inquisitive feedback loops to improve communication with AI-based assistants
in data-science tools.
Comment: 24 pages, 8 figures
Bridging the Gulf of Envisioning: Cognitive Design Challenges in LLM Interfaces
Large language models (LLMs) exhibit dynamic capabilities and appear to
comprehend complex and ambiguous natural language prompts. However, calibrating
LLM interactions is challenging for interface designers and end-users alike. A
central issue is our limited grasp of how human cognitive processes begin with
a goal and form intentions for executing actions, a blindspot even in
established interaction models such as Norman's gulfs of execution and
evaluation. To address this gap, we theorize how end-users 'envision'
translating their goals into clear intentions and craft prompts to obtain the
desired LLM response. We define a process of Envisioning by highlighting three
misalignments: (1) knowing whether LLMs can accomplish the task, (2) how to
instruct the LLM to do the task, and (3) how to evaluate the success of the
LLM's output in meeting the goal. Finally, we make recommendations to narrow
the envisioning gulf in human-LLM interactions.
ABScribe: Rapid Exploration of Multiple Writing Variations in Human-AI Co-Writing Tasks using Large Language Models
Exploring alternative ideas by rewriting text is integral to the writing
process. State-of-the-art large language models (LLMs) can simplify writing
variation generation. However, current interfaces pose challenges for
simultaneous consideration of multiple variations: creating new versions
without overwriting text can be difficult, and pasting them sequentially can
clutter documents, increasing workload and disrupting writers' flow. To tackle
this, we present ABScribe, an interface that supports rapid, yet visually
structured, exploration of writing variations in human-AI co-writing tasks.
With ABScribe, users can swiftly produce multiple variations using LLM prompts,
which are auto-converted into reusable buttons. Variations are stored
adjacently within text segments for rapid in-place comparisons using mouse-over
interactions on a context toolbar. Our user study with 12 writers shows that
ABScribe significantly reduces task workload (d = 1.20, p < 0.001), enhances
user perceptions of the revision process (d = 2.41, p < 0.001) compared to a
popular baseline workflow, and provides insights into how writers explore
variations using LLMs.
Reputation Agent: Prompting Fair Reviews in Gig Markets
Our study presents a new tool, Reputation Agent, to promote fairer reviews
from requesters (employers or customers) on gig markets. Unfair reviews,
created when requesters consider factors outside of a worker's control, are
known to plague gig workers and can result in lost job opportunities and even
termination from the marketplace. Our tool leverages machine learning to
implement an intelligent interface that: (1) uses deep learning to
automatically detect when an individual has included unfair factors in her
review (factors outside the worker's control per the policies of the market);
and (2) prompts the individual to reconsider her review if she has incorporated
unfair factors. To study the effectiveness of Reputation Agent, we conducted a
controlled experiment over different gig markets. Our experiment illustrates
that across markets, Reputation Agent, in contrast with traditional approaches,
motivates requesters to review gig workers' performance more fairly. We discuss
how tools that bring more transparency to employers about the policies of a gig
market can help build empathy, resulting in reasoned discussions around
potential injustices toward workers generated by these interfaces. Our vision
is that with tools that promote truth and transparency we can bring fairer
treatment to gig workers.
Comment: 12 pages, 5 figures, The Web Conference 2020, ACM WWW 2020
Sensecape: Enabling Multilevel Exploration and Sensemaking with Large Language Models
People are increasingly turning to large language models (LLMs) for complex
information tasks like academic research or planning a move to another city.
However, while these tasks often require working in a nonlinear manner - e.g.,
arranging information spatially to organize and make sense of it - current
interfaces for interacting with LLMs are generally linear, designed to support
conversational interaction. To address this limitation and explore how we can
support LLM-powered exploration and sensemaking, we developed Sensecape, an
interactive system designed to support complex information tasks with an LLM by
enabling users to (1) manage the complexity of information through multilevel
abstraction and (2) seamlessly switch between foraging and sensemaking. Our
within-subject user study reveals that Sensecape empowers users to explore more
topics and structure their knowledge hierarchically. We contribute implications
for LLM-based workflows and interfaces for information tasks.
Heuristik 4.0 - Heuristics for Evaluating Digitized Work in Industry 4.0 and AI-Based Systems from a Sociotechnical Perspective
Digitization increases the agility of manufacturing companies and enables them to adapt quickly to internal and external needs. Software updates can push modifications quickly and at any scale, e.g. regarding decision-inferring algorithms, the behavior of autonomous subsystems, or the system's user interface. Established process models for continuous development, as they are found in other agile domains, feature iterative cycles in which the planning, implementation, and evaluation of change measures are central steps. We propose eight heuristics to support sufficiently fast evaluation of the complex socio-technical settings found in digitized industrial work. This study presents these heuristics, gives guidance on how to use them, and explains their methodological background.
The effects of Ajax web technologies on user expectations: a workflow approach
This paper aims to define users' information expectations as web technologies continue to improve in loading time and uninterrupted interface interactivity. Do web technologies like Ajax - or, more abstractly, a quicker fulfilling of user needs - change these needs, or do they merely fulfill preexisting expectations? Users navigated through a mock e-commerce site where each page that loads has a 50% chance of implementing Ajax technology, from shopping-cart functions to expanding product categories. Users were observed through eye tracking and measurement of their pulse and respiratory effort. Questionnaires were administered before and after these tasks to assess their thoughts about the study. Qualitative and quantitative observation found that users almost unanimously favored the Ajax functions over the non-Ajax ones. Users emphasized the usability concerns of switching to Ajax, especially concerning feedback.
Five Lenses on Team Tutor Challenges: A Multidisciplinary Approach
This chapter describes five disciplinary domains of research, or lenses, that contribute to the design of a team tutor. We focus on four significant challenges in developing Intelligent Team Tutoring Systems (ITTSs), and explore how the five lenses can offer guidance for these challenges. The four challenges arise in the design of team member interactions, performance metrics and skill development, feedback, and tutor authoring. The five lenses or research domains that we apply to these four challenges are Tutor Engineering, Learning Sciences, Science of Teams, Data Analyst, and Human–Computer Interaction. This matrix of applications from each perspective offers a framework to guide designers in creating ITTSs.