"Always Nice and Confident, Sometimes wrong": Developer's Experiences Engaging Generative AI Chatbots Versus Human-Powered Q&A Platforms
Software engineers have historically relied on human-powered Q&A platforms,
like Stack Overflow (SO), as coding aids. With the rise of generative AI,
developers have adopted AI chatbots, such as ChatGPT, in their software
development process. Recognizing the potential parallels between human-powered
Q&A platforms and AI-powered question-based chatbots, we investigate and
compare how developers integrate this assistance into their real-world coding
experiences by conducting thematic analysis of Reddit posts. Through a
comparative study of SO and ChatGPT, we identified each platform's strengths,
use cases, and barriers. Our findings suggest that ChatGPT offers fast, clear,
comprehensive responses and fosters a more respectful environment than SO.
However, concerns about ChatGPT's reliability stem from its overly confident
tone and the absence of validation mechanisms like SO's voting system. Based on
these findings, we recommend leveraging each platform's unique features to
improve developer experiences in the future.
Automatic Prediction of Rejected Edits in Stack Overflow
The content quality of shared knowledge in Stack Overflow (SO) is crucial in
supporting software developers with their programming problems. Thus, SO allows
its users to suggest edits to improve the quality of a post (i.e., question and
answer). However, existing research shows that many suggested edits in SO are
rejected due to undesired contents/formats or violating edit guidelines. Such a
scenario frustrates or demotivates users who would like to conduct good-quality
edits. Therefore, our research focuses on assisting SO users by offering them
suggestions on how to improve their editing of posts. First, we manually
investigate 764 (382 questions + 382 answers) rejected edits by rollbacks and
produce a catalog of 19 rejection reasons. Second, we extract 15 text- and
user-based features to capture those rejection reasons. Third, we develop four
machine learning models using those features. Our best-performing model can
predict rejected edits with 69.1% precision, 71.2% recall, 70.1% F1-score, and
69.8% overall accuracy. Fourth, we introduce an online tool named EditEx that
works with the SO edit system. EditEx can assist users while editing posts by
suggesting the potential causes of rejections. We recruit 20 participants to
assess the effectiveness of EditEx. Half of the participants (i.e., treatment
group) use EditEx and another half (i.e., control group) use the SO standard
edit system to edit posts. According to our experiment, EditEx helps the SO
standard edit system prevent 49% of rejected edits, including the commonly
rejected ones. Moreover, it can prevent 12% of rejections even in free-form
regular edits. The treatment group finds the potential rejection reasons
identified by EditEx influential. Furthermore, the median workload of
suggesting edits using EditEx is half that of the SO edit system.
Comment: Accepted for publication in the Empirical Software Engineering (EMSE)
journal
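The feature-extraction step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's actual method: the feature names, the heuristics behind them, and the example edit are all assumptions standing in for the paper's 15 text- and user-based features.

```python
# Hypothetical sketch of text-based feature extraction for a suggested
# SO edit. Feature names and heuristics are illustrative assumptions.
def edit_features(original: str, edited: str) -> dict:
    orig_words = original.split()
    new_words = edited.split()
    return {
        # How much text the edit adds or removes.
        "length_delta": len(new_words) - len(orig_words),
        # Share of lines that look like code (four-space indent), a
        # rough proxy for formatting-only changes.
        "code_line_ratio": sum(
            1 for ln in edited.splitlines() if ln.startswith("    ")
        ) / max(1, len(edited.splitlines())),
        # Whether the edit introduces a link, a change reviewers
        # often scrutinize.
        "adds_link": ("http" in edited) and ("http" not in original),
    }

feats = edit_features(
    "How do I sort a list?",
    "How do I sort a list in Python? See http://example.com",
)
print(feats)
# e.g. {'length_delta': 4, 'code_line_ratio': 0.0, 'adds_link': True}
```

Feature vectors of this kind would then be fed to a standard classifier (the paper trains four machine learning models) to predict whether an edit is likely to be rejected.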
A Behavior-Driven Recommendation System for Stack Overflow Posts
Developers are often tasked with maintaining complex systems. Regardless of prior experience, there will inevitably be times in which they must interact with parts of the system with which they are unfamiliar. In such cases, recommendation systems may serve as a valuable tool to assist the developer in implementing a solution. Many recommendation systems in software engineering utilize the Stack Overflow knowledge base as the basis of forming their recommendations. Traditionally, these systems have relied on the developer to explicitly invoke them, typically in the form of specifying a query. However, there may be cases in which the developer is in need of a recommendation but unaware that their need exists. A new class of recommendation systems, deemed Behavior-Driven Recommendation Systems for Software Engineering, seeks to address this issue by relying on developer behavior to determine when a recommendation is needed and, once such a determination is made, formulate a search query based on the software engineering task context. This thesis presents one such system, StackInTheFlow, a plug-in that integrates into the IntelliJ family of Java IDEs. StackInTheFlow allows the user to interact with it as a traditional recommendation system, manually specifying queries and browsing returned Stack Overflow posts. However, it also provides facilities for detecting when the developer is in need of a recommendation, defined as when the developer has encountered an error message or when a difficulty detection model based on indicators of developer progress fires. Once such a determination has been made, a query formulation model, constructed from a periodic data dump of Stack Overflow posts, will automatically form a query from the software engineering task context extracted from source code currently open within the IDE.
StackInTheFlow also provides mechanisms to personalize, over time, the results displayed toward a specific set of Stack Overflow tags based on the results previously selected by the user. The effectiveness of these mechanisms is examined, and results based on the collection of anonymous user logs and a small-scale study are presented. Based on these evaluations, some of the queries issued by the tool were found to be effective; however, limitations remain in extracting the appropriate context of the software engineering task.
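The query-formulation step, forming a search query from the source code open in the IDE, can be illustrated with a minimal sketch. This is an assumption-laden stand-in, not StackInTheFlow's actual model: the identifier-splitting heuristic, the frequency ranking, and the example snippet are all hypothetical.

```python
# Illustrative sketch (not StackInTheFlow's actual algorithm): form a
# search query from the identifiers in the code the developer has open.
import re
from collections import Counter

def formulate_query(source: str, top_k: int = 4) -> str:
    # Collect identifier-like tokens from the source text.
    idents = re.findall(r"[A-Za-z_]\w*", source)
    terms = []
    for ident in idents:
        # Split camelCase and snake_case identifiers into words.
        parts = re.split(r"_|(?<=[a-z])(?=[A-Z])", ident)
        terms += [p.lower() for p in parts if len(p) > 2]
    # The most frequent terms approximate the task context.
    return " ".join(t for t, _ in Counter(terms).most_common(top_k))

snippet = """
    BufferedReader reader = new BufferedReader(new FileReader(path));
    String line = reader.readLine();
"""
print(formulate_query(snippet))
```

A real system would additionally weight terms against a Stack Overflow corpus (as the thesis's query formulation model does with its periodic data dump) and filter language keywords such as `new`.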