
    Conversational Challenges in AI-Powered Data Science: Obstacles, Needs, and Design Opportunities

    Full text link
    Large Language Models (LLMs) are being increasingly employed in data science for tasks like data preprocessing and analytics. However, data scientists encounter substantial obstacles when conversing with LLM-powered chatbots and acting on their suggestions and answers. We conducted a mixed-methods study, including contextual observations, semi-structured interviews (n=14), and a survey (n=114), to identify these challenges. Our findings highlight key issues faced by data scientists, including contextual data retrieval, formulating prompts for complex tasks, adapting generated code to local environments, and refining prompts iteratively. Based on these insights, we propose actionable design recommendations, such as data brushing to support context selection, and inquisitive feedback loops to improve communication with AI-based assistants in data-science tools. Comment: 24 pages, 8 figures.

    Bridging the Gulf of Envisioning: Cognitive Design Challenges in LLM Interfaces

    Full text link
    Large language models (LLMs) exhibit dynamic capabilities and appear to comprehend complex and ambiguous natural language prompts. However, calibrating LLM interactions is challenging for interface designers and end-users alike. A central issue is our limited grasp of how human cognitive processes begin with a goal and form intentions for executing actions, a blind spot even in established interaction models such as Norman's gulfs of execution and evaluation. To address this gap, we theorize how end-users 'envision' translating their goals into clear intentions and craft prompts to obtain the desired LLM response. We define a process of envisioning by highlighting three misalignments: (1) knowing whether LLMs can accomplish the task, (2) knowing how to instruct the LLM to do the task, and (3) knowing how to evaluate whether the LLM's output meets the goal. Finally, we make recommendations to narrow the envisioning gulf in human-LLM interactions.

    ABScribe: Rapid Exploration of Multiple Writing Variations in Human-AI Co-Writing Tasks using Large Language Models

    Full text link
    Exploring alternative ideas by rewriting text is integral to the writing process. State-of-the-art large language models (LLMs) can simplify writing variation generation. However, current interfaces pose challenges for the simultaneous consideration of multiple variations: creating new versions without overwriting text can be difficult, and pasting them sequentially can clutter documents, increasing workload and disrupting writers' flow. To tackle this, we present ABScribe, an interface that supports rapid, yet visually structured, exploration of writing variations in human-AI co-writing tasks. With ABScribe, users can swiftly produce multiple variations using LLM prompts, which are auto-converted into reusable buttons. Variations are stored adjacently within text segments for rapid in-place comparisons using mouse-over interactions on a context toolbar. Our user study with 12 writers shows that ABScribe significantly reduces task workload (d = 1.20, p < 0.001) and enhances user perceptions of the revision process (d = 2.41, p < 0.001) compared to a popular baseline workflow, and provides insights into how writers explore variations using LLMs.

    Reputation Agent: Prompting Fair Reviews in Gig Markets

    Full text link
    Our study presents a new tool, Reputation Agent, to promote fairer reviews from requesters (employers or customers) on gig markets. Unfair reviews, created when requesters consider factors outside of a worker's control, are known to plague gig workers and can result in lost job opportunities and even termination from the marketplace. Our tool leverages machine learning to implement an intelligent interface that: (1) uses deep learning to automatically detect when an individual has included unfair factors in her review (factors outside the worker's control per the policies of the market); and (2) prompts the individual to reconsider her review if she has incorporated unfair factors. To study the effectiveness of Reputation Agent, we conducted a controlled experiment across different gig markets. Our experiment illustrates that, across markets, Reputation Agent motivates requesters to review gig workers' performance more fairly than traditional approaches do. We discuss how tools that give employers more transparency about the policies of a gig market can help build empathy, resulting in reasoned discussions around potential injustices towards workers generated by these interfaces. Our vision is that with tools that promote truth and transparency we can bring fairer treatment to gig workers. Comment: 12 pages, 5 figures, The Web Conference 2020, ACM WWW 2020.

    Sensecape: Enabling Multilevel Exploration and Sensemaking with Large Language Models

    Full text link
    People are increasingly turning to large language models (LLMs) for complex information tasks like academic research or planning a move to another city. However, while such tasks often require working in a nonlinear manner (e.g., arranging information spatially to organize and make sense of it), current interfaces for interacting with LLMs are generally linear in order to support conversational interaction. To address this limitation and explore how we can support LLM-powered exploration and sensemaking, we developed Sensecape, an interactive system designed to support complex information tasks with an LLM by enabling users to (1) manage the complexity of information through multilevel abstraction and (2) seamlessly switch between foraging and sensemaking. Our within-subjects user study reveals that Sensecape empowers users to explore more topics and structure their knowledge hierarchically. We contribute implications for LLM-based workflows and interfaces for information tasks.

    Heuristics 4.0 - Heuristics for Evaluating Digitized Work in Industry 4.0 and AI-Based Systems from a Sociotechnical Perspective

    Get PDF
    Digitization increases the agility of manufacturing companies and enables them to adapt quickly to internal and external needs. Software updates can push modifications quickly and at any scale, e.g., with regard to decision-inferring algorithms, the behavior of autonomous subsystems, or the system's user interface. Established processes in other agile domains feature iterative development cycles that include the planning, implementation, and evaluation of change. We propose eight heuristics to support a sufficiently fast evaluation of the complex socio-technical settings found in digitized manufacturing work. This study presents these heuristics, gives guidance on how to use them, and explains their methodological background.

    The effects of Ajax web technologies on user expectations: a workflow approach

    Get PDF
    This paper aims to define users' information expectations as web technologies continue to improve in loading time and uninterrupted interface interactivity. Do web technologies like Ajax (or, more abstractly, a quicker fulfillment of user needs) change these needs, or do they merely fulfill preexisting expectations? Users navigated through a mock e-commerce site where each page that loaded had a 50% chance of implementing Ajax technology, from shopping-cart functions to expanding product categories. Users were observed through eye tracking and measurement of their pulse and respiratory effort. Questionnaires were administered before and after these tasks to assess their thoughts about the study. Qualitative and quantitative observation found that users almost unanimously favored the Ajax functions over the non-Ajax ones. Users also raised usability concerns about switching to Ajax, especially concerning feedback.

    Five Lenses on Team Tutor Challenges: A Multidisciplinary Approach

    Get PDF
    This chapter describes five disciplinary domains of research, or lenses, that contribute to the design of a team tutor. We focus on four significant challenges in developing Intelligent Team Tutoring Systems (ITTSs) and explore how the five lenses can offer guidance for these challenges. The four challenges arise in the design of team member interactions, performance metrics and skill development, feedback, and tutor authoring. The five lenses, or research domains, that we apply to these four challenges are Tutor Engineering, Learning Sciences, Science of Teams, Data Analyst, and Human–Computer Interaction. This matrix of applications from each perspective offers a framework to guide designers in creating ITTSs.