Improving the Flexibility of CLARA's Automated Matching and Repair Processes
As software development moves steadily toward automation, a growing number of computer science researchers focus on automated program repair. CLARA is an example of an automated program repair tool that provides feedback to novice programmers solving introductory programming assignments in Java, C++, and Python. CLARA performs test-based repair, requiring as input a correct program, an incorrect program, and a corresponding test case. Our work focuses only on Python. CLARA has two main limitations. The first is a lack of support for commonly used language constructs such as standard input, standard output, and import statements. We address this issue by extending CLARA's abstract syntax tree processor and interpreter to include these constructs. The second limitation is that CLARA requires both the correct and the incorrect program to have the same control flow. In a real-world setting, it is not easy to find such programs, reducing the true impact CLARA can have on the learning of novice programmers. Therefore, we implement a graph matching technique between the correct and incorrect programs that considers both semantic and topological information to help overcome this limitation. Using this matching, we modify the incorrect program so that its control flow matches that of the correct program. To verify that our technique overcomes the control flow limitation, we conduct experiments that run CLARA and compare the number of programs repaired with and without the graph matching technique. We also analyze the percentage of each program modified by CLARA and the number of correct programs needed to repair all valid incorrect programs. Our experiments show that CLARA can parse, process, and repair many more programs after our extensions. Additionally, our experiments indicate that our technique never causes CLARA to simply replace the entire source code of an incorrect program with that of a correct program.
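The matching the abstract describes can be illustrated with a minimal sketch. This is not CLARA's actual algorithm; it is a hypothetical greedy matcher over control-flow-graph nodes that scores a semantic term (node labels) together with a topological term (overlap between already-matched successors), with illustrative weights:

```python
# Hedged sketch (not CLARA's implementation): greedily match nodes of two
# control-flow graphs using both semantic similarity (equal node labels)
# and topological similarity (agreement of already-matched successors).
def match_cfgs(nodes_a, nodes_b, edges_a, edges_b):
    """nodes_*: {node_id: label}; edges_*: set of (src, dst) pairs.
    Returns a dict mapping node ids of graph A to node ids of graph B."""
    succ_a = {n: {d for s, d in edges_a if s == n} for n in nodes_a}
    succ_b = {n: {d for s, d in edges_b if s == n} for n in nodes_b}
    mapping = {}
    for a in nodes_a:
        best, best_score = None, -1.0
        for b in nodes_b:
            if b in mapping.values():   # each B node used at most once
                continue
            semantic = 1.0 if nodes_a[a] == nodes_b[b] else 0.0
            # Topological term: fraction of a's already-matched successors
            # whose images are also successors of b.
            matched = [mapping[x] for x in succ_a[a] if x in mapping]
            topo = (sum(1 for m in matched if m in succ_b[b]) / len(matched)
                    if matched else 0.0)
            score = 0.7 * semantic + 0.3 * topo   # illustrative weights
            if score > best_score:
                best, best_score = b, score
        if best is not None:
            mapping[a] = best
    return mapping

A = {1: "entry", 2: "loop", 3: "exit"}
B = {10: "entry", 20: "loop", 30: "exit"}
print(match_cfgs(A, B, {(1, 2), (2, 3)}, {(10, 20), (20, 30)}))
# → {1: 10, 2: 20, 3: 30}
```

Once such a mapping exists, the incorrect program's control flow can be rewritten node by node toward its matched counterpart, which is the step the abstract calls modifying the incorrect program to match the correct one.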
Zero Shot Learning for Code Education: Rubric Sampling with Deep Learning Inference
In modern computer science education, massive open online courses (MOOCs) log
thousands of hours of data about how students solve coding challenges. Being so
rich in data, these platforms have garnered the interest of the machine
learning community, with many new algorithms attempting to autonomously provide
feedback to help future students learn. But what about those first hundred
thousand students? In most educational contexts (i.e. classrooms), assignments
do not have enough historical data for supervised learning. In this paper, we
introduce a human-in-the-loop "rubric sampling" approach to tackle the "zero
shot" feedback challenge. We are able to provide autonomous feedback for the
first students working on an introductory programming assignment with accuracy
that substantially outperforms data-hungry algorithms and approaches human
level fidelity. Rubric sampling requires minimal teacher effort, can associate
feedback with specific parts of a student's solution and can articulate a
student's misconceptions in the language of the instructor. Deep learning
inference enables rubric sampling to further improve as more assignment
specific student data is acquired. We demonstrate our results on a novel
dataset from Code.org, the world's largest programming education platform.
Comment: To appear at AAAI 2019; 9 pages
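The core idea of rubric sampling can be sketched in a few lines. This is a hypothetical toy rubric, not the paper's actual one: an instructor writes weighted (misconception, typical program) pairs, synthetic labeled data is sampled from that prior, and a real submission inherits the label of a matching sample, giving feedback before any historical student data exists:

```python
import random

# Hypothetical instructor-written rubric: (label, typical program, prior).
RUBRIC = [
    ("correct",      "for i in range(n): move()",   0.5),
    ("off-by-one",   "for i in range(n-1): move()", 0.3),
    ("missing-loop", "move()",                      0.2),
]

def sample_labeled_program(rng):
    """Draw one synthetic (program, label) pair from the rubric prior."""
    r, acc = rng.random(), 0.0
    for label, program, p in RUBRIC:
        acc += p
        if r < acc:
            return program, label
    return RUBRIC[-1][1], RUBRIC[-1][0]

def feedback(submission, samples):
    """Label a submission via the matching sampled program (exact match
    here; a real system would use a softer program similarity)."""
    for program, label in samples:
        if submission == program:
            return label
    return "unknown"

rng = random.Random(0)
samples = [sample_labeled_program(rng) for _ in range(100)]
```

The "deep learning inference" half of the paper would then replace the exact-match step with a model trained on these synthetic pairs, improving further as real assignment-specific data arrives.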
Program Repair with Minimal Edits Using CodeT5
Programmers often struggle to identify and fix bugs in their programs. In
recent years, many language models (LMs) have been proposed to fix erroneous
programs and support error recovery. However, the LMs tend to generate
solutions that differ from the original input programs. This leads to potential
comprehension difficulties for users. In this paper, we propose an approach to
suggest a correct program with minimal repair edits using CodeT5. We fine-tune
a pre-trained CodeT5 on code pairs of wrong and correct programs and evaluate
its performance with several baseline models. The experimental results show
that the fine-tuned CodeT5 achieves a pass@100 of 91.95%, indicating that at
least one correct program can be suggested among 100 generated candidates, and
an average edit distance to the most similar correct program of 6.84. We
demonstrate the effectiveness of LMs in suggesting program repairs with
minimal edits for solving introductory programming problems.
Comment: 7 pages, 6 figures, accepted to iCAST 202
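The "minimal edits" objective above is measured by edit distance between the wrong and repaired programs. As a concrete illustration, here is a standard token-level Levenshtein distance, one plausible way (the paper's exact metric is not specified here) to score how far a suggested repair strays from the original submission:

```python
# Token-level Levenshtein edit distance: the minimum number of insertions,
# deletions, and substitutions turning token list `a` into token list `b`.
def edit_distance(a, b):
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i          # delete all remaining tokens of a
    for j in range(n + 1):
        dp[0][j] = j          # insert all remaining tokens of b
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[m][n]

wrong = "print ( n + 1 )".split()
fixed = "print ( n - 1 )".split()
print(edit_distance(wrong, fixed))  # → 1 (one substituted token)
```

Under such a metric, a repair model that rewrites the whole program scores poorly even when its output passes the tests, which is exactly the comprehension problem the minimal-edit fine-tuning targets.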