2,001 research outputs found
Efficient Classification of Student Help Requests in Programming Courses Using Large Language Models
The accurate classification of student help requests with respect to the type
of help being sought can enable the tailoring of effective responses.
Automatically classifying such requests is non-trivial, but large language
models (LLMs) appear to offer an accessible, cost-effective solution. This
study evaluates the performance of the GPT-3.5 and GPT-4 models for classifying
help requests from students in an introductory programming class. In zero-shot
trials, GPT-3.5 and GPT-4 exhibited comparable performance on most categories,
while GPT-4 outperformed GPT-3.5 in classifying sub-categories for requests
related to debugging. Fine-tuning the GPT-3.5 model improved its performance to
such an extent that it approximated the accuracy and consistency across
categories observed between two human raters. Overall, this study demonstrates
the feasibility of using LLMs to enhance educational systems through the
automated classification of student needs
Communicative Agents for Software Development
Software engineering is a domain characterized by intricate decision-making
processes, often relying on nuanced intuition and consultation. Recent
advancements in deep learning have started to revolutionize software
engineering practices through elaborate designs implemented at various stages
of software development. In this paper, we present an innovative paradigm that
leverages large language models (LLMs) throughout the entire software
development process, streamlining and unifying key processes through natural
language communication, thereby eliminating the need for specialized models at
each phase. At the core of this paradigm lies ChatDev, a virtual chat-powered
software development company that mirrors the established waterfall model,
meticulously dividing the development process into four distinct chronological
stages: designing, coding, testing, and documenting. Each stage engages a team
of agents, such as programmers, code reviewers, and test engineers, fostering
collaborative dialogue and facilitating a seamless workflow. The chat chain
acts as a facilitator, breaking down each stage into atomic subtasks. This
enables dual roles, allowing for proposing and validating solutions through
context-aware communication, leading to efficient resolution of specific
subtasks. The instrumental analysis of ChatDev highlights its remarkable
efficacy in software generation, enabling the completion of the entire software
development process in under seven minutes at a cost of less than one dollar.
It not only identifies and alleviates potential vulnerabilities but also
rectifies potential hallucinations while maintaining commendable efficiency and
cost-effectiveness. The potential of ChatDev unveils fresh possibilities for
integrating LLMs into the realm of software development.Comment: 25 pages, 9 figures, 2 table
Calendar.help: Designing a Workflow-Based Scheduling Agent with Humans in the Loop
Although information workers may complain about meetings, they are an
essential part of their work life. Consequently, busy people spend a
significant amount of time scheduling meetings. We present Calendar.help, a
system that provides fast, efficient scheduling through structured workflows.
Users interact with the system via email, delegating their scheduling needs to
the system as if it were a human personal assistant. Common scheduling
scenarios are broken down using well-defined workflows and completed as a
series of microtasks that are automated when possible and executed by a human
otherwise. Unusual scenarios fall back to a trained human assistant who
executes them as unstructured macrotasks. We describe the iterative approach we
used to develop Calendar.help, and share the lessons learned from scheduling
thousands of meetings during a year of real-world deployments. Our findings
provide insight into how complex information tasks can be broken down into
repeatable components that can be executed efficiently to improve productivity.Comment: 10 page
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework
This technical report presents AutoGen, a new framework that enables
development of LLM applications using multiple agents that can converse with
each other to solve tasks. AutoGen agents are customizable, conversable, and
seamlessly allow human participation. They can operate in various modes that
employ combinations of LLMs, human inputs, and tools. AutoGen's design offers
multiple advantages: a) it gracefully navigates the strong but imperfect
generation and reasoning abilities of these LLMs; b) it leverages human
understanding and intelligence, while providing valuable automation through
conversations between agents; c) it simplifies and unifies the implementation
of complex LLM workflows as automated agent chats. We provide many diverse
examples of how developers can easily use AutoGen to effectively solve tasks or
build applications, ranging from coding, mathematics, operations research,
entertainment, online decision-making, question answering, etc.Comment: 28 page
Using ChatGPT and other LLMs in Professional Environments
Large language models like ChatGPT, Google’s Bard, and Microsoft’s new Bing, to name a few, are developing rapidly in recent years, becoming very popular in different environments, and supporting a wide range of tasks. A deep look into their outcomes reveals several limitations and challenges that can be further improved. The main challenge of these models is the possibility of generating biased or inaccurate results, since these models rely on large amounts of data with no access to unpublic information. Moreover, these language models need to be properly monitored and trained to prevent generating inappropriate or offensive content and to ensure that they are used ethically and safely. This study investigates the use of ChatGPT and other large language models such as Blender, and BERT in professional environments. It has been found that none of the large language models, including ChatGPT, have been used in unstructured dialogues. Moreover, involving the models in professional environments requires extensive training and monitoring by domain professionals or fine-tuning through API
Recommended from our members
Next generation software environments : principles, problems, and research directions
The past decade has seen a burgeoning of research and development in software environments. Conferences have been devoted to the topic of practical environments, journal papers produced, and commercial systems sold. Given all the activity, one might expect a great deal of consensus on issues, approaches, and techniques. This is not the case, however. Indeed, the term "environment" is still used in a variety of conflicting ways. Nevertheless substantial progress has been made and we are at least nearing consensus on many critical issues.The purpose of this paper is to characterize environments, describe several important principles that have emerged in the last decade or so, note current open problems, and describe some approaches to these problems, with particular emphasis on the activities of one large-scale research program, the Arcadia project. Consideration is also given to two related topics: empirical evaluation and technology transition. That is, how can environments and their constituents be evaluated, and how can new developments be moved effectively into the production sector
- …