1,183 research outputs found

    Improving the Flexibility of CLARA's Automated Matching and Repair Processes

    More computer science researchers are focusing on automated program repair as the world moves steadily towards automation. CLARA is an example of an automated program repair tool that provides feedback to novice programmers solving introductory programming assignments in Java, C++, and Python. CLARA performs test-based repair, requiring as input a correct program, an incorrect program, and its corresponding test case. Our work focuses only on Python. CLARA has two main limitations. The first is a lack of support for commonly used language constructs such as standard input, standard output, and import statements. We address this issue by extending CLARA's abstract syntax tree processor and interpreter to include these constructs. The second limitation is that CLARA requires both the correct and the incorrect program to have the same control flow. In a real-world setting, it is not easy to find such programs, reducing the true impact CLARA can have on the learning of novice programmers. Therefore, to help overcome this limitation, we implement a graph matching technique between the correct and incorrect programs that considers both semantic and topological information. Using this matching, we modify the incorrect program so that its control flow matches that of the correct program. To verify that our technique overcomes the control flow limitation, we conduct experiments running CLARA and compare the number of programs repaired with and without the graph matching technique. We also analyze the percentage of the program modified by CLARA and the number of correct programs needed to repair all valid incorrect programs. Our experiments show that CLARA can parse, process, and repair many more programs after our extensions. Additionally, our experiments indicate that our technique never causes CLARA to replace the entire source code of an incorrect program with the entire source code of a correct program.
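    To make the graph matching idea concrete, here is a minimal, hypothetical sketch (not CLARA's actual implementation): it scores candidate node pairs of two control-flow graphs with a mix of semantic (statement-kind) and topological (degree) similarity and solves the resulting assignment problem with the Hungarian algorithm. The `kind` attribute, the 50/50 weighting, and the toy graphs are assumptions made purely for illustration.

```python
# Hypothetical sketch: match nodes of two control-flow graphs using a score
# that mixes semantic (statement kind) and topological (in/out degree)
# similarity, then solve the resulting assignment problem.
import networkx as nx
import numpy as np
from scipy.optimize import linear_sum_assignment

def node_similarity(g1, g2, u, v):
    # Semantic part: 1.0 if the statement kinds match (e.g. 'loop', 'if', 'assign').
    semantic = 1.0 if g1.nodes[u]["kind"] == g2.nodes[v]["kind"] else 0.0
    # Topological part: penalise differences in in/out degree.
    topo = 1.0 / (1.0 + abs(g1.in_degree(u) - g2.in_degree(v))
                      + abs(g1.out_degree(u) - g2.out_degree(v)))
    return 0.5 * semantic + 0.5 * topo  # illustrative weighting, not CLARA's

def match_cfgs(correct_cfg, incorrect_cfg):
    cu, iu = list(correct_cfg.nodes), list(incorrect_cfg.nodes)
    score = np.zeros((len(cu), len(iu)))
    for i, u in enumerate(cu):
        for j, v in enumerate(iu):
            score[i, j] = node_similarity(correct_cfg, incorrect_cfg, u, v)
    # Hungarian algorithm maximises total similarity (minimise the negated scores).
    rows, cols = linear_sum_assignment(-score)
    return {cu[i]: iu[j] for i, j in zip(rows, cols)}

# Toy CFGs: node ids with a 'kind' attribute standing in for the statement type.
g_correct = nx.DiGraph()
g_correct.add_nodes_from([(0, {"kind": "entry"}), (1, {"kind": "loop"}), (2, {"kind": "exit"})])
g_correct.add_edges_from([(0, 1), (1, 1), (1, 2)])

g_incorrect = nx.DiGraph()
g_incorrect.add_nodes_from([(0, {"kind": "entry"}), (1, {"kind": "loop"}), (2, {"kind": "exit"})])
g_incorrect.add_edges_from([(0, 1), (1, 2)])

print(match_cfgs(g_correct, g_incorrect))
```

    The assignment-problem formulation is one simple way to obtain a one-to-one node matching once a pairwise similarity is defined; the paper's actual matching procedure may differ.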

    Zero Shot Learning for Code Education: Rubric Sampling with Deep Learning Inference

    In modern computer science education, massive open online courses (MOOCs) log thousands of hours of data about how students solve coding challenges. Being so rich in data, these platforms have garnered the interest of the machine learning community, with many new algorithms attempting to autonomously provide feedback to help future students learn. But what about those first hundred thousand students? In most educational contexts (i.e., classrooms), assignments do not have enough historical data for supervised learning. In this paper, we introduce a human-in-the-loop "rubric sampling" approach to tackle the "zero shot" feedback challenge. We are able to provide autonomous feedback for the first students working on an introductory programming assignment with accuracy that substantially exceeds that of data-hungry algorithms and approaches human-level fidelity. Rubric sampling requires minimal teacher effort, can associate feedback with specific parts of a student's solution, and can articulate a student's misconceptions in the language of the instructor. Deep learning inference enables rubric sampling to further improve as more assignment-specific student data is acquired. We demonstrate our results on a novel dataset from Code.org, the world's largest programming education platform. Comment: To appear at AAAI 2019; 9 pages.
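    As a rough illustration of the rubric sampling idea (not the paper's pipeline), the sketch below encodes a toy instructor rubric as a small probabilistic grammar, samples synthetic labelled programs from it, and trains an ordinary classifier on the synthetic data so that new submissions can be labelled with zero historical student data. All labels, templates, and probabilities are invented for the example.

```python
# Hypothetical sketch of rubric sampling: an instructor-written rubric acts as
# a tiny probabilistic grammar; synthetic labelled programs sampled from it
# stand in for the missing historical training data.
import random
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy rubric: label -> (prior probability, example solution templates).
RUBRIC = {
    "correct":             (0.5, ["for i in range(n): total += i"]),
    "off_by_one":          (0.3, ["for i in range(n - 1): total += i",
                                  "for i in range(1, n): total += i"]),
    "missing_accumulator": (0.2, ["for i in range(n): pass"]),
}

def sample_program():
    # Draw a label according to the rubric priors, then a matching template.
    names, weights = zip(*[(label, w) for label, (w, _) in RUBRIC.items()])
    label = random.choices(names, weights=weights, k=1)[0]
    return random.choice(RUBRIC[label][1]), label

# Generate a synthetic training set purely from the rubric (no student data).
programs, labels = zip(*(sample_program() for _ in range(500)))

model = make_pipeline(CountVectorizer(token_pattern=r"\S+"),
                      LogisticRegression(max_iter=1000))
model.fit(programs, labels)

# A "new" student submission can now be labelled with zero historical data.
print(model.predict(["for i in range(1, n): total += i"]))
```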

    Program Repair with Minimal Edits Using CodeT5

    Programmers often struggle to identify and fix bugs in their programs. In recent years, many language models (LMs) have been proposed to fix erroneous programs and support error recovery. However, the LMs tend to generate solutions that differ from the original input programs, which can make the suggestions harder for users to understand. In this paper, we propose an approach to suggest a correct program with minimal repair edits using CodeT5. We fine-tune a pre-trained CodeT5 on code pairs of wrong and correct programs and evaluate its performance against several baseline models. The experimental results show that the fine-tuned CodeT5 achieves a pass@100 of 91.95% and an average edit distance to the most similar correct program of 6.84, which indicates that at least one correct program can be suggested by generating 100 candidate programs. We demonstrate the effectiveness of LMs in suggesting program repairs with minimal edits for introductory programming problems. Comment: 7 pages, 6 figures, accepted to iCAST 202
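    Below is a hedged sketch of the minimal-edit selection step, assuming a CodeT5 checkpoint fine-tuned on (wrong, correct) pairs is available: sample many candidate repairs, keep only those that pass the tests, and return the one closest to the student's program in edit distance. The public `Salesforce/codet5-base` checkpoint is used only as a stand-in for such a fine-tuned model, the `passes_tests` callable is assumed to be supplied by the caller, and the token-level `difflib` distance is a simplification rather than the paper's metric.

```python
# Hypothetical sketch: generate candidate repairs with a (fine-tuned) CodeT5
# model and pick the passing candidate with the smallest edit distance to the
# student's buggy program.
import difflib
from transformers import AutoTokenizer, T5ForConditionalGeneration

MODEL_NAME = "Salesforce/codet5-base"  # stand-in; a repair-fine-tuned checkpoint is assumed
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = T5ForConditionalGeneration.from_pretrained(MODEL_NAME)

def edit_distance(a, b):
    # Rough token-level distance via difflib; a simplification of the paper's metric.
    sm = difflib.SequenceMatcher(a=a.split(), b=b.split())
    matched = sum(block.size for block in sm.get_matching_blocks())
    return len(a.split()) + len(b.split()) - 2 * matched

def suggest_repair(buggy_code, passes_tests, num_candidates=100):
    inputs = tokenizer(buggy_code, return_tensors="pt", truncation=True)
    outputs = model.generate(**inputs, do_sample=True,
                             num_return_sequences=num_candidates, max_length=256)
    candidates = [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]
    # Among candidates that pass the test suite, prefer the minimal-edit one.
    passing = [c for c in candidates if passes_tests(c)]
    return min(passing, key=lambda c: edit_distance(buggy_code, c), default=None)
```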