3,030 research outputs found

    Learning Web Development using GitHub Copilot in and outside Academia: a Blessing or a Curse?

    Get PDF
    This article investigates the usage of GitHub Copilot, an artificial intelligence-powered coding assistant owned by Microsoft and GitHub, in the process of learning and teaching web development in both formal academic and informal settings. We dive into the idea behind utilizing GitHub Copilot and highlight its most common and relevant use cases for learning web development. Drawing from existing scientific literature and online statements from software professionals, we present an overview of the current situation with artificial intelligence-assisted programming tools such as GitHub Copilot and examine their impact, and apparent irrelevance, on web development education, especially in the early learning stages. Professionals both in and outside academia agree that the usage of artificial intelligence pair programming tools such as GitHub Copilot is neither recommended nor essential when learning or teaching web development.

    Evaluating the Code Quality of AI-Assisted Code Generation Tools: An Empirical Study on GitHub Copilot, Amazon CodeWhisperer, and ChatGPT

    Full text link
    Context: AI-assisted code generation tools have become increasingly prevalent in software engineering, offering the ability to generate code from natural language prompts or partial code inputs. Notable examples of these tools include GitHub Copilot, Amazon CodeWhisperer, and OpenAI's ChatGPT. Objective: This study aims to compare the performance of these prominent code generation tools in terms of code quality metrics, such as Code Validity, Code Correctness, Code Security, Code Reliability, and Code Maintainability, to identify their strengths and shortcomings. Method: We assess the code generation capabilities of GitHub Copilot, Amazon CodeWhisperer, and ChatGPT using the HumanEval benchmark dataset. The generated code is then evaluated based on the proposed code quality metrics. Results: Our analysis reveals that the latest versions of ChatGPT, GitHub Copilot, and Amazon CodeWhisperer generate correct code 65.2%, 46.3%, and 31.1% of the time, respectively. Compared with their earlier versions, GitHub Copilot improved by 18% and Amazon CodeWhisperer by 7%. The average technical debt, considering code smells, was found to be 8.9 minutes for ChatGPT, 9.1 minutes for GitHub Copilot, and 5.6 minutes for Amazon CodeWhisperer. Conclusions: This study highlights the strengths and weaknesses of some of the most popular code generation tools, providing valuable insights for practitioners. By comparing these generators, our results may assist practitioners in selecting the optimal tool for specific tasks, enhancing their decision-making process.
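    The abstract does not include the evaluation harness itself, but the functional-correctness check it describes can be pictured with a minimal sketch: concatenate each HumanEval prompt with a tool's completion, append the task's unit tests, and count how many programs pass. The file path, completions mapping, and timeout below are illustrative assumptions, not details from the study.

    import json
    import subprocess
    import sys
    import tempfile

    def passes(prompt: str, completion: str, test: str, entry_point: str) -> bool:
        # Build a standalone program: prompt + completion, then the task's tests.
        # HumanEval's test field defines check(candidate); call it on the entry point.
        program = f"{prompt}{completion}\n{test}\ncheck({entry_point})\n"
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(program)
            path = f.name
        try:
            # A non-zero exit (assertion failure, exception) means "incorrect".
            result = subprocess.run([sys.executable, path],
                                    capture_output=True, timeout=10)
            return result.returncode == 0
        except subprocess.TimeoutExpired:
            return False

    def correctness_rate(tasks_path: str, completions: dict) -> float:
        # tasks_path: JSONL with HumanEval fields task_id/prompt/test/entry_point;
        # completions: hypothetical mapping task_id -> generated function body.
        tasks = [json.loads(line) for line in open(tasks_path)]
        evaluated = [t for t in tasks if t["task_id"] in completions]
        solved = sum(
            passes(t["prompt"], completions[t["task_id"]],
                   t["test"], t["entry_point"])
            for t in evaluated
        )
        return solved / len(evaluated)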

    The Impact of AI Tool on Engineering at ANZ Bank: An Empirical Study on GitHub Copilot within Corporate Environment

    Full text link
    The increasing popularity of AI, particularly Large Language Models (LLMs), has significantly impacted various domains, including Software Engineering. This study explores the integration of AI tools in software engineering practices within a large organization. We focus on ANZ Bank, which employs over 5000 engineers covering all aspects of the software development life cycle. This paper details an experiment conducted using GitHub Copilot, a notable AI tool, within a controlled environment to evaluate its effectiveness in real-world engineering tasks. Additionally, this paper shares initial findings on the productivity improvements observed after GitHub Copilot was adopted on a large scale, with about 1000 engineers using it. ANZ Bank's six-week experiment with GitHub Copilot included two weeks of preparation and four weeks of active testing. The study evaluated participant sentiment and the tool's impact on productivity, code quality, and security. Initially, participants used GitHub Copilot for proposed use-cases, with their feedback gathered through regular surveys. In the second phase, they were divided into Control and Copilot groups, each tackling the same Python challenges, and their experiences were again surveyed. Results showed a notable boost in productivity and code quality with GitHub Copilot, though its impact on code security remained inconclusive. Participant responses were overall positive, confirming GitHub Copilot's effectiveness in large-scale software engineering environments. Early data from 1000 engineers also indicated a significant increase in productivity and job satisfaction.
    Comment: 16 pages, 4 figures. In proceedings of the 10th International Conference on Software Engineering (SEC 2024).
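    The Control-versus-Copilot phase implies a straightforward between-group comparison. The sketch below shows one common shape such an analysis could take; the abstract does not say which test, if any, the authors used, and the completion times here are invented placeholders purely for illustration.

    # Hypothetical between-group comparison for a Control vs. Copilot split.
    from scipy.stats import mannwhitneyu

    control_minutes = [42, 55, 48, 61, 50, 47, 58]   # placeholder values
    copilot_minutes = [35, 40, 33, 45, 38, 36, 41]   # placeholder values

    # Non-parametric test: no normality assumption on task completion times.
    stat, p_value = mannwhitneyu(control_minutes, copilot_minutes,
                                 alternative="two-sided")
    print(f"U = {stat}, p = {p_value:.4f}")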

    On the Concerns of Developers When Using GitHub Copilot

    Full text link
    With the recent advancement of Artificial Intelligence (AI) and the emergence of Large Language Models (LLMs), AI-based code generation tools have achieved significant progress and become a practical solution for software development. GitHub Copilot, often referred to as an AI pair programmer, utilizes machine learning models trained on a large corpus of code snippets to generate code suggestions or auto-complete code using natural language processing. Despite its popularity, there is little empirical evidence on the actual experiences of software developers who work with Copilot. To fill this gap, we conducted an empirical study to understand the issues and challenges that developers face when using Copilot in practice, as well as their underlying causes and potential solutions. We collected data from 476 GitHub issues, 706 GitHub discussions, and 184 Stack Overflow posts, and identified the issues, the causes that trigger them, and the solutions that resolve them when using Copilot. Our results reveal that (1) Usage Issue and Compatibility Issue are the most common problems faced by Copilot users, (2) Copilot Internal Issue, Network Connection Issue, and Editor/IDE Compatibility Issue are identified as the most frequent causes, and (3) Bug Fixed by Copilot, Modify Configuration/Setting, and Use Suitable Version are the predominant solutions. Based on the results, we delve into the main challenges users encounter when implementing Copilot in practical development, the possible impact of Copilot on the coding process, aspects in which Copilot can be further enhanced, and potential new features desired by Copilot users.
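    The kind of issue collection the authors describe can be approximated with GitHub's public search API. In the sketch below, the query string, paging depth, and unauthenticated access are illustrative choices, not the study's actual pipeline.

    import requests

    def fetch_copilot_issues(pages: int = 3) -> list[dict]:
        # Search public GitHub issues mentioning Copilot in the title.
        issues = []
        for page in range(1, pages + 1):
            resp = requests.get(
                "https://api.github.com/search/issues",
                params={"q": "copilot in:title is:issue",
                        "per_page": 100, "page": page},
                headers={"Accept": "application/vnd.github+json"},
                timeout=30,
            )
            resp.raise_for_status()
            issues.extend(resp.json()["items"])
        return issues

    # Print a small sample of titles for manual screening.
    for issue in fetch_copilot_issues(pages=1)[:5]:
        print(issue["title"])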

    Developer Productivity With and Without GitHub Copilot: A Longitudinal Mixed-Methods Case Study

    Get PDF
    This study investigates the real-world impact of the generative AI (GenAI) tool GitHub Copilot on developer activity and perceived productivity. We conducted a mixed-methods case study in NAV IT, a large public sector agile organization. We analyzed 26,317 unique non-merge commits from 703 of NAV IT's GitHub repositories over a two-year period, focusing on commit-based activity metrics from 25 Copilot users and 14 non-users. The analysis was complemented by survey responses on their roles and perceived productivity, as well as 13 interviews. Our analysis of activity metrics revealed that individuals who used Copilot were consistently more active than non-users, even prior to Copilot's introduction. We did not find any statistically significant changes in commit-based activity for Copilot users after they adopted the tool, although minor increases were observed. This suggests a discrepancy between changes in commit-based metrics and the subjective experience of productivity.
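    Commit-based activity metrics of this kind can be derived from repository history alone. A minimal sketch, assuming a local clone and using non-merge commits per author per week as the metric (the abstract does not specify the paper's exact metrics):

    import subprocess
    from collections import Counter

    def weekly_commit_counts(repo_path: str) -> Counter:
        # List non-merge commits as "author-email year-week", one per line.
        log = subprocess.run(
            ["git", "-C", repo_path, "log", "--no-merges",
             "--pretty=format:%ae %ad", "--date=format:%Y-%W"],
            capture_output=True, text=True, check=True,
        ).stdout
        counts = Counter()
        for line in log.splitlines():
            email, week = line.rsplit(" ", 1)
            counts[(email, week)] += 1
        return counts

    # Example usage with a placeholder path:
    # print(weekly_commit_counts("./some-repo").most_common(10))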

    Where Are Large Language Models for Code Generation on GitHub?

    Full text link
    The increasing use of Large Language Models (LLMs) in software development has garnered significant attention from researchers assessing the quality of the code they generate. However, much of the research focuses on controlled datasets such as HumanEval, which fail to adequately represent how developers actually utilize LLMs' code generation capabilities or clarify the characteristics of LLM-generated code in real-world development scenarios. To bridge this gap, our study investigates the characteristics of LLM-generated code and its corresponding projects hosted on GitHub. Our findings reveal several key insights: (1) ChatGPT and Copilot are the most frequently utilized for generating code on GitHub; in contrast, there is very little code generated by other LLMs on GitHub. (2) Projects containing ChatGPT/Copilot-generated code are often small and less known, led by individuals or small teams. Despite this, most projects are continuously evolving and improving. (3) ChatGPT/Copilot is mainly utilized for generating Python, Java, and TypeScript scripts for data processing and transformation, while C/C++ and JavaScript code generation focuses on algorithm and data structure implementation and user interface code. Most ChatGPT/Copilot-generated code snippets are relatively short and exhibit low complexity. (4) Compared to human-written code, ChatGPT/Copilot-generated code exists in a small proportion of projects and generally undergoes fewer modifications. Additionally, modifications due to bugs are even fewer, ranging from just 3% to 8% across different languages. (5) Most comments on ChatGPT/Copilot-generated code lack detailed information, often only stating the code's origin without mentioning prompts, human modifications, or testing status. Based on these findings, we discuss the implications for researchers and practitioners.

    GitHub Copilot: the perfect Code compLeeter?

    Full text link
    This paper aims to evaluate GitHub Copilot's generated code quality based on the LeetCode problem set using a custom automated framework. We evaluate the results of Copilot for 4 programming languages: Java, C++, Python3 and Rust. We aim to evaluate Copilot's reliability in the code generation stage, the correctness of the generated code, and its dependency on the programming language, the problem's difficulty level, and the problem's topic. In addition to that, we evaluate the code's time and memory efficiency and compare it to the average human results. In total, we generate solutions for 1760 problems for each programming language and evaluate all of Copilot's suggestions for each problem, resulting in over 50000 submissions to LeetCode spread over a 2-month period. We found that Copilot successfully solved most of the problems. However, Copilot was rather more successful in generating code in Java and C++ than in Python3 and Rust. Moreover, in the case of Python3, Copilot proved to be rather unreliable in the code generation phase. We also discovered that Copilot's top-ranked suggestions are not always the best. In addition, we analysed how the topic of the problem impacts the correctness rate. Finally, based on statistics information from LeetCode, we can conclude that Copilot generates more efficient code than an average human.
    Comment: 10 pages, 6 figures. Code available: https://github.com/IljaSir/CopilotSolverForLeetCod

    Measuring the Performance of Code Produced with GitHub Copilot

    Get PDF
    GitHub Copilot is an artificially intelligent programming assistant used by many developers. While a few studies have evaluated the security risks of using Copilot, no study has examined whether it helps developers produce code with better performance. We evaluate the performance of code produced when developers use GitHub Copilot versus when they do not. To this end, we conducted a user study with 32 participants where each participant solved two C++ programming problems, one with Copilot and one without, and we measured the running time of the participants' solutions on our test data. Our results suggest that using Copilot can produce code with a significantly slower running time.
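    The measurement itself is simple to picture: run each participant's compiled solution on shared test inputs and average the wall-clock time. A minimal sketch follows, with placeholder binary and input paths; the study's actual harness is not described in the abstract.

    import subprocess
    import time

    def mean_runtime(binary: str, input_files: list[str], runs: int = 5) -> float:
        # Average wall-clock seconds of one compiled solution over all inputs.
        total = 0.0
        for path in input_files:
            data = open(path, "rb").read()
            for _ in range(runs):
                start = time.perf_counter()
                subprocess.run([binary], input=data,
                               capture_output=True, timeout=60, check=True)
                total += time.perf_counter() - start
        return total / (len(input_files) * runs)

    # Placeholder paths for illustration:
    # print(mean_runtime("./solution", ["tests/case1.txt"], runs=3))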

    Reverse Engineering GitHub CoPilot: Creating an OpenAI-Compatible Endpoint for Enhanced Developer Integration

    Get PDF
    This paper presents the reverse engineering of GitHub CoPilot to develop an OpenAI-compatible endpoint, enabling broader access and integration possibilities for AI-assisted code completion. By analyzing CoPilot's communication protocols and creating a proxy server that translates OpenAI API requests to CoPilot's internal API, we bridge the gap between proprietary tools and open standards. The implementation allows developers to utilize CoPilot's capabilities within their preferred environments using the familiar OpenAI API interface. We detail the system architecture, authentication mechanisms, request processing pipeline, and performance optimization techniques. Our results demonstrate successful integration, with robust performance metrics, including low response times and high compatibility rates. This work opens avenues for enhanced developer productivity and flexibility in AI-assisted coding tools.
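    The translation-proxy idea can be sketched independently of the paper's protocol analysis. In the Flask sketch below, the upstream URL, token, and payload fields are invented placeholders; CoPilot's real internal API is not public and is not reproduced here.

    from flask import Flask, jsonify, request
    import requests

    app = Flask(__name__)

    COPILOT_ENDPOINT = "https://example.internal/copilot/completions"  # placeholder
    COPILOT_TOKEN = "session-token-obtained-elsewhere"                 # placeholder

    @app.post("/v1/completions")
    def completions():
        body = request.get_json()
        # Translate the OpenAI-style request into the (assumed) internal format.
        upstream = requests.post(
            COPILOT_ENDPOINT,
            headers={"Authorization": f"Bearer {COPILOT_TOKEN}"},
            json={
                "prompt": body.get("prompt", ""),
                "max_tokens": body.get("max_tokens", 64),
                "temperature": body.get("temperature", 0.0),
            },
            timeout=30,
        )
        upstream.raise_for_status()
        text = upstream.json().get("completion", "")
        # Wrap the upstream reply in an OpenAI-compatible response envelope.
        return jsonify({
            "object": "text_completion",
            "model": "copilot-proxy",
            "choices": [{"index": 0, "text": text, "finish_reason": "stop"}],
        })

    if __name__ == "__main__":
        app.run(port=8000)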