Verifix: Verified Repair of Programming Assignments
Automated feedback generation for introductory programming assignments is
useful for programming education. Most works try to generate feedback to
correct a student program by comparing its behavior with an instructor's
reference program on selected tests. In this work, our aim is to generate
verifiably correct program repairs as student feedback. The student assignment
is aligned and composed with a reference solution in terms of control flow, and
differences in data variables are automatically summarized via predicates to
relate the variable names. Failed verification attempts for the equivalence of
the two programs are exploited to obtain a collection of maxSMT queries, whose
solutions point to repairs of the student assignment. We have conducted
experiments on student assignments curated from a widely deployed intelligent
tutoring system. Our results indicate that we can generate verified feedback in
up to 58% of the assignments. More importantly, our system indicates when it is
able to generate verified feedback, which novice students can then use
with high confidence.
Re-factoring based program repair applied to programming assignments
Automated program repair has been used to provide feedback for incorrect student programming assignments, since program repair captures the code modification needed to make a given buggy program pass a given test-suite. Existing student feedback generation techniques are limited because they either require manual effort in the form of providing an error model, or require a large number of correct student submissions to learn from, or suffer from lack of scalability and accuracy. In this work, we propose a fully automated approach for generating student program repairs in real time. This is achieved by first re-factoring all available correct solutions to semantically equivalent solutions. Given an incorrect program, we match the program with the closest matching refactored program based on its control flow structure. Subsequently, we infer the input-output specifications of the incorrect program's basic blocks from the executions of the correct program's aligned basic blocks. Finally, these specifications are used to modify the blocks of the incorrect program via search-based synthesis. Our dataset consists of almost 1,800 real-life incorrect Python program submissions from 361 students for an introductory programming course at a large public university. Our experimental results suggest that our method is more effective and efficient than recently proposed feedback generation approaches. About 30% of the patches produced by our tool Refactory are smaller than those produced by the state-of-the-art tool Clara, and can be produced given fewer correct solutions (often a single correct solution) and in a shorter time. We believe that our method is applicable beyond programming assignments and could be seen as a general-purpose program repair method that can achieve good results with just a single correct reference solution.
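The structure-matching step described in this abstract can be illustrated with a minimal sketch. It is not the Refactory implementation: the function names `skeleton` and `find_structural_match` are hypothetical, and the sketch assumes the simplest possible notion of "closest match", namely an identical nesting of `if`/`for`/`while` statements, whereas the abstract speaks of the closest matching refactored program.

```python
import ast


def skeleton(source: str) -> tuple:
    """Reduce a program to a nested tuple that keeps only its control-flow shape.

    Assumption: only function definitions, if, for, and while are tracked;
    every other statement is ignored.
    """
    def visit(stmts):
        shape = []
        for node in stmts:
            if isinstance(node, (ast.If, ast.For, ast.While)):
                # Record the construct plus the shapes of its body and else branch.
                shape.append((type(node).__name__, visit(node.body), visit(node.orelse)))
            elif isinstance(node, ast.FunctionDef):
                shape.append(("FunctionDef", visit(node.body), ()))
        return tuple(shape)

    return visit(ast.parse(source).body)


def find_structural_match(incorrect: str, references: list):
    """Return the first reference whose skeleton equals that of the incorrect program."""
    target = skeleton(incorrect)
    for ref in references:
        if skeleton(ref) == target:
            return ref
    return None  # no reference shares the control-flow structure


buggy = "def f(n):\n    s = 0\n    for i in range(n):\n        s += i\n    return s - 1\n"
ref_loop = "def total(k):\n    acc = 0\n    for j in range(k):\n        acc += j\n    return acc\n"
ref_rec = "def total(k):\n    if k == 0:\n        return 0\n    return k - 1 + total(k - 1)\n"

match = find_structural_match(buggy, [ref_rec, ref_loop])
```

Here the buggy loop-based submission is paired with the loop-based reference rather than the recursive one, because variable names and expressions are ignored and only the nesting of control constructs is compared.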
Intelligent Tutoring System: Experience of Linking Software Engineering and Programming Teaching
The increasing number of computer science students pushes lecturers and
tutors of first-year programming courses to their limits to provide
high-quality feedback to the students. Existing systems that handle automated
grading primarily focus on the automation of test case executions in the
context of programming assignments. However, they cannot provide customized
feedback about the students' errors, and hence, cannot replace the help of
tutors. While recent research works in the area of automated grading and
feedback generation address this issue by using automated repair techniques, so
far, to the best of our knowledge, there has been no real-world deployment of
such techniques. Based on the research advances in recent years, we have built
an intelligent tutoring system that has the capability of providing automated
feedback and grading. Furthermore, we designed a Software Engineering course
that guides third-year undergraduate students in incrementally developing such
a system over the coming years. Each year, students will make contributions
that improve the current implementation, while at the same time, we can deploy
the current system for usage by first year students. This paper describes our
teaching concept, the intelligent tutoring system architecture, and our
experience with the stakeholders. This software engineering project for the
students has the key advantage that the users of the system are available
in-house (i.e., students, tutors, and lecturers from the first-year programming
courses). This helps organize requirements engineering sessions and builds
awareness about their contribution to a "to be deployed" software project. In
this multi-year teaching effort, we have incrementally built a tutoring system
that can be used in first-year programming courses. Further, it represents a
platform that can integrate the latest research results in APR for education.
Introductory programming: a systematic literature review
As computing becomes a mainstream discipline embedded in the school curriculum and acts as an enabler for an increasing range of academic disciplines in higher education, the literature on introductory programming is growing. Although there have been several reviews that focus on specific aspects of introductory programming, there has been no broad overview of the literature exploring recent trends across the breadth of introductory programming.
This paper is the report of an ITiCSE working group that conducted a systematic review in order to gain an overview of the introductory programming literature. Partitioning the literature into papers addressing the student, teaching, the curriculum, and assessment, we explore trends, highlight advances in knowledge over the past 15 years, and indicate possible directions for future research.
Students' Perceptions and Preferences of Generative Artificial Intelligence Feedback for Programming
The rapid evolution of artificial intelligence (AI), specifically large
language models (LLMs), has opened opportunities for various educational
applications. This paper explored the feasibility of utilizing ChatGPT, one of
the most popular LLMs, for automating feedback for Java programming assignments
in an introductory computer science (CS1) class. Specifically, this study
focused on three questions: 1) To what extent do students view LLM-generated
feedback as formative? 2) How do students see the comparative affordances of
feedback prompts that include their code, vs. those that exclude it? 3) What
enhancements do students suggest for improving AI-generated feedback? To
address these questions, we generated automated feedback using the ChatGPT API
for four lab assignments in the CS1 class. The survey results revealed that
students perceived the feedback as aligning well with formative feedback
guidelines established by Shute. Additionally, students showed a clear
preference for feedback generated by including the students' code as part of
the LLM prompt, and our thematic study indicated that the preference was mainly
attributed to the specificity, clarity, and corrective nature of the feedback.
Moreover, this study found that students generally expected specific and
corrective feedback with sufficient code examples, but had divergent opinions on
the tone of the feedback. This study demonstrated that ChatGPT could generate
Java programming assignment feedback that students perceived as formative. It
also offered insights into the specific improvements that would make the
ChatGPT-generated feedback useful for students.
AI-Enhanced Auto-Correction of Programming Exercises: How Effective is GPT-3.5?
Timely formative feedback is considered one of the most important drivers for effective learning. Delivering timely and individualized feedback is particularly challenging in large classes in higher education. Recently, large language models such as GPT-3, which have shown promising results on various tasks such as code generation and code explanation, became available to the public. This paper investigates the potential of AI in providing personalized code correction and generating feedback. Based on existing student submissions of two different real-world assignments, the correctness of the AI-aided e-assessment as well as characteristics of the generated feedback, such as fault localization, correctness of hints, and code style suggestions, are investigated. The results show that 73% of the submissions were correctly identified as either correct or incorrect. In 59% of these cases, GPT-3.5 also successfully generated effective and high-quality feedback. Additionally, GPT-3.5 exhibited weaknesses in its evaluation, including localization of errors that were not the actual errors, or even hallucinated errors. Implications and potential new usage scenarios are discussed.