38 research outputs found

    Deduplicating and Ranking Solution Programs for Suggesting Reference Solutions

    Full text link
    Referring to solution programs written by other users is helpful for learners in programming education. However, current online judge systems just list all solution programs submitted by users for references, and the programs are sorted based on the submission date and time, execution time, or user rating, ignoring to what extent the programs can be helpful to be referenced. In addition, users struggle to refer to a variety of solution approaches since there are too many duplicated and near-duplicated programs. To motivate learners to refer to various solutions to learn better solution approaches, in this paper, we propose an approach to deduplicate and rank common solution programs in each programming problem. Inspired by the nature that the many-duplicated program adopts a more common approach and can be a general reference, we remove the near-duplicated solution programs and rank the unique programs based on the duplicate count. The experiments on the solution programs submitted to a real-world online judge system demonstrate that the number of programs is reduced by 60.20%, whereas the baseline only reduces by 29.59% after the deduplication, meaning that users only need to refer to 39.80% of programs on average. Furthermore, our analysis shows that top-10 ranked programs cover 29.95% of programs on average, indicating that users can grasp 29.95% of solution approaches by referring to only 10 programs. The proposed approach shows the potential of reducing the learners' burden of referring to too many solutions and motivating them to learn a variety of solution approaches.Comment: 7 pages, 5 figures, accepted to ASSE 202

    Composing control flow and formula rules for computing on grids

    Get PDF
    We define computation on grids as the composition, through pushout constructions, of control flows, carried across adjacency relations between grid cells, with formulas updating the value of some attribute. The approach is based on the identification of a subcategory of attributed typed graphs suitable to the definition of pushouts on grids, and is illustrated in the context of the Cyberfilm visual language

    Exploring Automated Code Evaluation Systems and Resources for Code Analysis: A Comprehensive Survey

    Full text link
    The automated code evaluation system (AES) is mainly designed to reliably assess user-submitted code. Due to their extensive range of applications and the accumulation of valuable resources, AESs are becoming increasingly popular. Research on the application of AES and their real-world resource exploration for diverse coding tasks is still lacking. In this study, we conducted a comprehensive survey on AESs and their resources. This survey explores the application areas of AESs, available resources, and resource utilization for coding tasks. AESs are categorized into programming contests, programming learning and education, recruitment, online compilers, and additional modules, depending on their application. We explore the available datasets and other resources of these systems for research, analysis, and coding tasks. Moreover, we provide an overview of machine learning-driven coding tasks, such as bug detection, code review, comprehension, refactoring, search, representation, and repair. These tasks are performed using real-life datasets. In addition, we briefly discuss the Aizu Online Judge platform as a real example of an AES from the perspectives of system design (hardware and software), operation (competition and education), and research. This is due to the scalability of the AOJ platform (programming education, competitions, and practice), open internal features (hardware and software), attention from the research community, open source data (e.g., solution codes and submission documents), and transparency. We also analyze the overall performance of this system and the perceived challenges over the years

    Refactoring Programs Using Large Language Models with Few-Shot Examples

    Full text link
    A less complex and more straightforward program is a crucial factor that enhances its maintainability and makes writing secure and bug-free programs easier. However, due to its heavy workload and the risks of breaking the working programs, programmers are reluctant to do code refactoring, and thus, it also causes the loss of potential learning experiences. To mitigate this, we demonstrate the application of using a large language model (LLM), GPT-3.5, to suggest less complex versions of the user-written Python program, aiming to encourage users to learn how to write better programs. We propose a method to leverage the prompting with few-shot examples of the LLM by selecting the best-suited code refactoring examples for each target programming problem based on the prior evaluation of prompting with the one-shot example. The quantitative evaluation shows that 95.68% of programs can be refactored by generating 10 candidates each, resulting in a 17.35% reduction in the average cyclomatic complexity and a 25.84% decrease in the average number of lines after filtering only generated programs that are semantically correct. Furthermore, the qualitative evaluation shows outstanding capability in code formatting, while unnecessary behaviors such as deleting or translating comments are also observed.Comment: 10 pages, 10 figures, accepted to the 30th Asia-Pacific Software Engineering Conference (APSEC 2023

    Rule-Based Error Classification for Analyzing Differences in Frequent Errors

    Full text link
    Finding and fixing errors is a time-consuming task not only for novice programmers but also for expert programmers. Prior work has identified frequent error patterns among various levels of programmers. However, the differences in the tendencies between novices and experts have yet to be revealed. From the knowledge of the frequent errors in each level of programmers, instructors will be able to provide helpful advice for each level of learners. In this paper, we propose a rule-based error classification tool to classify errors in code pairs consisting of wrong and correct programs. We classify errors for 95,631 code pairs and identify 3.47 errors on average, which are submitted by various levels of programmers on an online judge system. The classified errors are used to analyze the differences in frequent errors between novice and expert programmers. The analyzed results show that, as for the same introductory problems, errors made by novices are due to the lack of knowledge in programming, and the mistakes are considered an essential part of the learning process. On the other hand, errors made by experts are due to misunderstandings caused by the carelessness of reading problems or the challenges of solving problems differently than usual. The proposed tool can be used to create error-labeled datasets and for further code-related educational research.Comment: 7 pages, 4 figures, accepted to TALE 202

    Program Repair with Minimal Edits Using CodeT5

    Full text link
    Programmers often struggle to identify and fix bugs in their programs. In recent years, many language models (LMs) have been proposed to fix erroneous programs and support error recovery. However, the LMs tend to generate solutions that differ from the original input programs. This leads to potential comprehension difficulties for users. In this paper, we propose an approach to suggest a correct program with minimal repair edits using CodeT5. We fine-tune a pre-trained CodeT5 on code pairs of wrong and correct programs and evaluate its performance with several baseline models. The experimental results show that the fine-tuned CodeT5 achieves a pass@100 of 91.95% and an average edit distance of the most similar correct program of 6.84, which indicates that at least one correct program can be suggested by generating 100 candidate programs. We demonstrate the effectiveness of LMs in suggesting program repair with minimal edits for solving introductory programming problems.Comment: 7 pages, 6 figures, accepted to iCAST 202

    ChatGPT for Education and Research: Opportunities, Threats, and Strategies

    No full text
    In recent years, the rise of advanced artificial intelligence technologies has had a profound impact on many fields, including education and research. One such technology is ChatGPT, a powerful large language model developed by OpenAI. This technology offers exciting opportunities for students and educators, including personalized feedback, increased accessibility, interactive conversations, lesson preparation, evaluation, and new ways to teach complex concepts. However, ChatGPT poses different threats to the traditional education and research system, including the possibility of cheating on online exams, human-like text generation, diminished critical thinking skills, and difficulties in evaluating information generated by ChatGPT. This study explores the potential opportunities and threats that ChatGPT poses to overall education from the perspective of students and educators. Furthermore, for programming learning, we explore how ChatGPT helps students improve their programming skills. To demonstrate this, we conducted different coding-related experiments with ChatGPT, including code generation from problem descriptions, pseudocode generation of algorithms from texts, and code correction. The generated codes are validated with an online judge system to evaluate their accuracy. In addition, we conducted several surveys with students and teachers to find out how ChatGPT supports programming learning and teaching. Finally, we present the survey results and analysis

    Efficient visualisation of the relative distribution of keyword search results in a corpus data cube

    No full text
    Most keyword searches target precision for finding the most relevant document. However some target recall, finding all relevant documents. Our system supports high recall searches that return hundreds or thousands of relevant results. In particular, it provides a visualization that shows the distribution of search results relative to the distribution of items for the entire corpus. Such relative distributional features include over and under representation, clusters and outliers. The contribution of this paper is efficient visualisation, that is, how to provide the best relative distribution view for a given data cube size. This requirement is translated to: for which limited size meta-data summary cube are search results disambiguated the most in our relative distribution view. We identify metrics and several algorithms for such a summary cube selection
    corecore