Search CORE

2,001 research outputs found

Efficient Classification of Student Help Requests in Programming Courses Using Large Language Models

Author: Denny Paul
Liffiton Mark
Savelka Jaromir
Sheese Brad
Publication venue
Publication date: 30/10/2023
Field of study

The accurate classification of student help requests with respect to the type of help being sought can enable the tailoring of effective responses. Automatically classifying such requests is non-trivial, but large language models (LLMs) appear to offer an accessible, cost-effective solution. This study evaluates the performance of the GPT-3.5 and GPT-4 models for classifying help requests from students in an introductory programming class. In zero-shot trials, GPT-3.5 and GPT-4 exhibited comparable performance on most categories, while GPT-4 outperformed GPT-3.5 in classifying sub-categories for requests related to debugging. Fine-tuning the GPT-3.5 model improved its performance to such an extent that it approximated the accuracy and consistency across categories observed between two human raters. Overall, this study demonstrates the feasibility of using LLMs to enhance educational systems through the automated classification of student needs

arXiv.org e-Print Archive

Communicative Agents for Software Development

Author: Chen Weize
Cong Xin
Liu Zhiyuan
Qian Chen
Su Yusheng
Sun Maosong
Xu Juyuan
Yang Cheng
Publication venue
Publication date: 18/07/2023
Field of study

Software engineering is a domain characterized by intricate decision-making processes, often relying on nuanced intuition and consultation. Recent advancements in deep learning have started to revolutionize software engineering practices through elaborate designs implemented at various stages of software development. In this paper, we present an innovative paradigm that leverages large language models (LLMs) throughout the entire software development process, streamlining and unifying key processes through natural language communication, thereby eliminating the need for specialized models at each phase. At the core of this paradigm lies ChatDev, a virtual chat-powered software development company that mirrors the established waterfall model, meticulously dividing the development process into four distinct chronological stages: designing, coding, testing, and documenting. Each stage engages a team of agents, such as programmers, code reviewers, and test engineers, fostering collaborative dialogue and facilitating a seamless workflow. The chat chain acts as a facilitator, breaking down each stage into atomic subtasks. This enables dual roles, allowing for proposing and validating solutions through context-aware communication, leading to efficient resolution of specific subtasks. The instrumental analysis of ChatDev highlights its remarkable efficacy in software generation, enabling the completion of the entire software development process in under seven minutes at a cost of less than one dollar. It not only identifies and alleviates potential vulnerabilities but also rectifies potential hallucinations while maintaining commendable efficiency and cost-effectiveness. The potential of ChatDev unveils fresh possibilities for integrating LLMs into the realm of software development.Comment: 25 pages, 9 figures, 2 table

arXiv.org e-Print Archive

Calendar.help: Designing a Workflow-Based Scheduling Agent with Humans in the Loop

Author: Cranshaw Justin
Elwany Emad
Kocielnik Rafal
Monroy-Hernández Andrés
Newman Todd
Soni Sandeep
Teevan Jaime
Yu Bowen
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 24/03/2017
Field of study

Although information workers may complain about meetings, they are an essential part of their work life. Consequently, busy people spend a significant amount of time scheduling meetings. We present Calendar.help, a system that provides fast, efficient scheduling through structured workflows. Users interact with the system via email, delegating their scheduling needs to the system as if it were a human personal assistant. Common scheduling scenarios are broken down using well-defined workflows and completed as a series of microtasks that are automated when possible and executed by a human otherwise. Unusual scenarios fall back to a trained human assistant who executes them as unstructured macrotasks. We describe the iterative approach we used to develop Calendar.help, and share the lessons learned from scheduling thousands of meetings during a year of real-world deployments. Our findings provide insight into how complex information tasks can be broken down into repeatable components that can be executed efficiently to improve productivity.Comment: 10 page

arXiv.org e-Print Archive

Crossref

AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework

Author: Bansal Gagan
Jiang Li
Li Beibin
Wang Chi
Wu Qingyun
Wu Yiran
Zhang Jieyu
Zhang Shaokun
Zhang Xiaoyun
Zhu Erkang
Publication venue
Publication date: 16/08/2023
Field of study

This technical report presents AutoGen, a new framework that enables development of LLM applications using multiple agents that can converse with each other to solve tasks. AutoGen agents are customizable, conversable, and seamlessly allow human participation. They can operate in various modes that employ combinations of LLMs, human inputs, and tools. AutoGen's design offers multiple advantages: a) it gracefully navigates the strong but imperfect generation and reasoning abilities of these LLMs; b) it leverages human understanding and intelligence, while providing valuable automation through conversations between agents; c) it simplifies and unifies the implementation of complex LLM workflows as automated agent chats. We provide many diverse examples of how developers can easily use AutoGen to effectively solve tasks or build applications, ranging from coding, mathematics, operations research, entertainment, online decision-making, question answering, etc.Comment: 28 page

arXiv.org e-Print Archive

Using ChatGPT and other LLMs in Professional Environments

Author: Alaswad S.
Kalganova T.
S. Awad W.
Publication venue: Arab Journals Platform
Publication date: 15/09/2023
Field of study

Large language models like ChatGPT, Google’s Bard, and Microsoft’s new Bing, to name a few, are developing rapidly in recent years, becoming very popular in different environments, and supporting a wide range of tasks. A deep look into their outcomes reveals several limitations and challenges that can be further improved. The main challenge of these models is the possibility of generating biased or inaccurate results, since these models rely on large amounts of data with no access to unpublic information. Moreover, these language models need to be properly monitored and trained to prevent generating inappropriate or offensive content and to ensure that they are used ethically and safely. This study investigates the use of ChatGPT and other large language models such as Blender, and BERT in professional environments. It has been found that none of the large language models, including ChatGPT, have been used in unstructured dialogues. Moreover, involving the models in professional environments requires extensive training and monitoring by domain professionals or fine-tuning through API

Arab Journals Platform

Recommended from our members

Next generation software environments : principles, problems, and research directions

Author: Baker Deborah A.
Belz Frank C.
Boehm Barry W.
Clarke Lori A.
Fisher David A.
Osterweil Leon
Selby Richard W.
Taylor Richard N.
Wileden Jack C.
Wolf Alexander L.
Young Michal
Publication venue: eScholarship, University of California
Publication date: 15/07/1987
Field of study

The past decade has seen a burgeoning of research and development in software environments. Conferences have been devoted to the topic of practical environments, journal papers produced, and commercial systems sold. Given all the activity, one might expect a great deal of consensus on issues, approaches, and techniques. This is not the case, however. Indeed, the term "environment" is still used in a variety of conflicting ways. Nevertheless substantial progress has been made and we are at least nearing consensus on many critical issues.The purpose of this paper is to characterize environments, describe several important principles that have emerged in the last decade or so, note current open problems, and describe some approaches to these problems, with particular emphasis on the activities of one large-scale research program, the Arcadia project. Consideration is also given to two related topics: empirical evaluation and technology transition. That is, how can environments and their constituents be evaluated, and how can new developments be moved effectively into the production sector

eScholarship - University of California