
    Supporting Annotators with Affordances for Efficiently Labeling Conversational Data

    Machine learning-based systems would not be as ubiquitous as they are today without substantial amounts of correctly labeled ground-truth data. Unfortunately, crowdsourced labeling is time consuming and expensive. To address the concerns of effort and tedium, we designed CAL, a novel interface to aid in data labeling. We made several key design decisions for CAL, which include preventing inapt labels from being selected, guiding users in selecting an appropriate label when they need assistance, incorporating labeling documentation into the interface, and providing an efficient means to view previous labels. We built a production-quality implementation of CAL and report a user-study evaluation that compares CAL to a standard spreadsheet. Key findings of our study include that users of CAL reported lower cognitive load without an increase in task time, rated CAL as easier to use, and preferred CAL over the spreadsheet.

    Conversational Challenges in AI-Powered Data Science: Obstacles, Needs, and Design Opportunities

    Large Language Models (LLMs) are being increasingly employed in data science for tasks like data preprocessing and analytics. However, data scientists encounter substantial obstacles when conversing with LLM-powered chatbots and acting on their suggestions and answers. We conducted a mixed-methods study, including contextual observations, semi-structured interviews (n=14), and a survey (n=114), to identify these challenges. Our findings highlight key issues faced by data scientists, including contextual data retrieval, formulating prompts for complex tasks, adapting generated code to local environments, and refining prompts iteratively. Based on these insights, we propose actionable design recommendations, such as data brushing to support context selection, and inquisitive feedback loops to improve communications with AI-based assistants in data-science tools.
    Comment: 24 pages, 8 figures.

    CodeAid: Evaluating a Classroom Deployment of an LLM-based Programming Assistant that Balances Student and Educator Needs

    Timely, personalized feedback is essential for students learning programming. LLM-powered tools like ChatGPT offer instant support, but they reveal direct answers with code, which may hinder deep conceptual engagement. We developed CodeAid, an LLM-powered programming assistant that delivers helpful, technically correct responses without revealing code solutions. CodeAid answers conceptual questions, generates pseudo-code with line-by-line explanations, and annotates students' incorrect code with fix suggestions. We deployed CodeAid in a programming class of 700 students for a 12-week semester. A thematic analysis of 8,000 usages of CodeAid was performed, further enriched by weekly surveys and 22 student interviews. We then interviewed eight programming educators to gain further insights. Our findings reveal four design considerations for future educational AI assistants: D1) exploiting AI's unique benefits; D2) simplifying query formulation while promoting cognitive engagement; D3) avoiding direct responses while encouraging motivated learning; and D4) maintaining transparency and control for students to assess and steer AI responses.
    Comment: CHI 2024 paper; 17 pages, 8 figures, 2 tables, along with a 2-page appendix.

    Semantically Aligned Question and Code Generation for Automated Insight Generation

    Automated insight generation is a common tactic for helping knowledge workers, such as data scientists, to quickly understand the potential value of new and unfamiliar data. Unfortunately, large language models can produce automated insights accompanied by code that does not correctly correspond (or align) to the insight. In this paper, we leverage the semantic knowledge of large language models to generate targeted and insightful questions about data and the corresponding code to answer those questions. Then, through an empirical study on data from Open-WikiTable, we show that embeddings can be effectively used to filter out semantically unaligned question-code pairs. Additionally, we found that generating questions and code together yields more diverse questions.
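    The abstract does not state how the embedding-based filtering is implemented, so the following is only a minimal sketch of the general idea: embed each question and its generated code, and keep a pair only if their cosine similarity clears a threshold. The encoder, threshold, and helper names are assumptions for illustration, not the paper's actual setup.

```python
# Illustrative sketch of embedding-based alignment filtering.
# The model name and threshold are assumptions, not taken from the paper.
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed off-the-shelf encoder

def alignment_score(question: str, code: str) -> float:
    """Cosine similarity between the question embedding and the code embedding."""
    q_vec, c_vec = model.encode([question, code])
    return float(np.dot(q_vec, c_vec) / (np.linalg.norm(q_vec) * np.linalg.norm(c_vec)))

def filter_aligned(pairs, threshold=0.5):
    """Keep only (question, code) pairs whose similarity reaches the threshold."""
    return [(q, c) for q, c in pairs if alignment_score(q, c) >= threshold]
```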

    To Fix or to Learn? How Production Bias Affects Developers’ Information Foraging during Debugging

    Developers performing maintenance activities must balance their efforts to learn the code vs. their efforts to actually change it. This balancing act is consistent with the “production bias” that, according to Carroll’s minimalist learning theory, generally affects software users during everyday tasks. This suggests that developers’ focus on efficiency should have marked effects on how they forage for the information they think they need to fix bugs. To investigate how developers balance fixing versus learning during debugging, we conducted the first empirical investigation of the interplay between production bias and information foraging. Our theory-based study involved 11 participants: half tasked with fixing a bug, and half tasked with learning enough to help someone else fix it. Despite the subtle difference between their tasks, participants foraged remarkably differently: they made foraging decisions from different types of “patches,” worked with different types of information, and succeeded with different foraging tactics.

    Interface Fluctuations on a Hierarchical Lattice

    We consider interface fluctuations on a two-dimensional layered lattice where the couplings follow a hierarchical sequence. This problem is equivalent to the diffusion process of a quantum particle in the presence of a one-dimensional hierarchical potential. According to a modified Harris criterion this type of perturbation is relevant and one expects anomalous fluctuating behavior. By transfer-matrix techniques and by an exact renormalization group transformation we have obtained analytical results for the interface fluctuation exponents, which are discontinuous at the homogeneous lattice limit.
    Comment: 14 pages, plain TeX, one figure available upon request; Phys. Rev. E (in print).
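    For readers unfamiliar with the terminology, the "interface fluctuation exponent" conventionally describes how the interface width scales with system size. The relation below is only the standard definition of that exponent, not a result quoted from the paper.

```latex
% Standard definition of the interface fluctuation (wandering) exponent \zeta:
% the typical transverse excursion of the interface grows with its length L as
w(L) \;\equiv\; \sqrt{\big\langle\, [\,h(x) - \langle h \rangle\,]^2 \,\big\rangle} \;\sim\; L^{\zeta},
\qquad \zeta = \tfrac{1}{2} \ \text{for the homogeneous two-dimensional lattice.}
```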

    A Fine-grained Data Set and Analysis of Tangling in Bug Fixing Commits

    Context: Tangled commits are changes to software that address multiple concerns at once. For researchers interested in bugs, tangled commits mean that they actually study not only bugs, but also other concerns irrelevant for the study of bugs. Objective: We want to improve our understanding of the prevalence of tangling and the types of changes that are tangled within bug fixing commits. Methods: We use a crowdsourcing approach for manual labeling to validate which changes contribute to bug fixes for each line in bug fixing commits. Each line is labeled by four participants. If at least three participants agree on the same label, we have consensus. Results: We estimate that between 17% and 32% of all changes in bug fixing commits modify the source code to fix the underlying problem. However, when we only consider changes to the production code files this ratio increases to 66% to 87%. We find that about 11% of lines are hard to label, leading to active disagreements between participants. Due to confirmed tangling and the uncertainty in our data, we estimate that 3% to 47% of data is noisy without manual untangling, depending on the use case. Conclusion: Tangled commits have a high prevalence in bug fixes and can lead to a large amount of noise in the data. Prior research indicates that this noise may alter results. As researchers, we should be skeptical and assume that unvalidated data is likely very noisy, until proven otherwise.
    Comment: Accepted at Empirical Software Engineering.
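    The consensus rule stated in the abstract (four labels per line, at least three must agree) is simple enough to sketch. The function below is an illustrative reconstruction of that rule, not the authors' actual tooling, and the label names in the example are hypothetical.

```python
from collections import Counter

def consensus_label(labels):
    """Return the consensus label for one changed line, or None when no label
    is chosen by at least three of the four participants."""
    assert len(labels) == 4, "each line is labeled by four participants"
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= 3 else None

# Example usage with hypothetical label names:
print(consensus_label(["bugfix", "bugfix", "refactoring", "bugfix"]))       # -> "bugfix"
print(consensus_label(["bugfix", "test", "refactoring", "whitespace"]))     # -> None (active disagreement)
```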

    Yestercode: Improving code-change support in visual dataflow programming environments

    In this paper, we present the Yestercode tool for supporting code changes in visual dataflow programming environments. In a formative investigation of LabVIEW programmers, we found that making code changes posed a significant challenge. To address this issue, we designed Yestercode to enable the efficient recording, retrieval, and juxtaposition of visual dataflow code while making code changes. To evaluate Yestercode, we implemented our design as a prototype extension to the LabVIEW programming environment, and ran a user study involving 14 professional LabVIEW programmers that compared Yestercode-extended LabVIEW to the standard LabVIEW IDE. Our results showed that Yestercode users introduced fewer bugs during tasks, completed tasks in about the same time, and experienced lower cognitive loads on tasks. Moreover, participants generally reported that Yestercode was easy to use and that it helped make change tasks easier.

    The patchworks code editor

    Increasingly, people are faced with navigating large information spaces, and making such navigation efficient is of paramount concern. In this paper, we focus on the problems programmers face in navigating large code bases, and propose a novel code editor, Patchworks, that addresses those problems. In particular, Patchworks leverages two new interface idioms, the patch grid and the ribbon, to help programmers navigate more quickly, make fewer navigation errors, and spend less time arranging their code. To validate Patchworks, we conducted a user study that compared Patchworks to two existing code editors: the traditional file-based editor Eclipse and the newer canvas-based editor Code Bubbles. Our results showed (1) that programmers using Patchworks were able to navigate significantly faster than with Eclipse (and comparably with Code Bubbles), (2) that programmers using Patchworks made significantly fewer navigation errors than with Code Bubbles or Eclipse, and (3) that programmers using Patchworks spent significantly less time arranging their code than with Code Bubbles (and comparably with Eclipse).