
Auto-grading tool for introductory programming courses

Abstract

Using automated grading tools to provide feedback to students is common in Computer Science education. The first step of automated grading is to find defects in the student program, yet finding bugs in code has never been easy. Current automated grading tools do not focus on identifying defects inside student code; comparing computed results against a fixed set of test cases is still the most common way they determine correctness. Designing a set of test cases that exercises student code thoroughly takes time and effort, and in practice the tests used for grading are often insufficient for accurate diagnosis. Meanwhile, automated testing tools have matured considerably. Although applying them to real-world software development still takes effort, we believe they are ready for automated feedback generation in the classroom: assignment code is relatively simple, and a well-understood reference implementation provided by the instructor makes automated testing tools more effective. In this thesis, we present our application of industrial automated testing to student assignments in an introductory programming course. We implemented a framework that collects student submissions, applies industrial automated testing tools to them, and interprets the test results in a form students can easily understand. Furthermore, we use the test results to classify erroneous submissions into categories, which instructors can use to address the most common conceptual errors efficiently. We deployed our framework on five introductory C programming assignments at the University of Illinois at Urbana-Champaign. The results show that the automated feedback generation framework discovers more errors in student submissions and provides timely, useful feedback to both instructors and students. A total of 142 missed bugs were found across 446 submissions, and more than 50% of students received feedback within 3 minutes of submission. Grouping the 91 submissions of one assignment identified two groups, of 15 and 6 submissions, that share the same type of error. The average setup time of the grading code is estimated at less than 8 hours per assignment. We believe that, built on current automated testing tools, an automated feedback framework for the classroom can benefit both students and instructors and thereby improve Computer Science education.
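The abstract mentions classifying erroneous submissions into categories using test results. As a rough illustration only (the thesis does not specify its grouping algorithm here), one simple approach is to group submissions by their failure signature, i.e., the set of tests each submission fails. The function name, test names, and submission IDs below are hypothetical.

    from collections import defaultdict

    def group_by_failure_signature(results):
        """Group submissions that fail the same set of tests.

        `results` maps a submission ID to the set of test names that
        submission failed; submissions sharing a failure signature are
        treated as exhibiting the same type of error.
        """
        groups = defaultdict(list)
        for submission_id, failed_tests in results.items():
            groups[frozenset(failed_tests)].append(submission_id)
        # Drop the group of fully passing submissions (empty signature).
        return {sig: ids for sig, ids in groups.items() if sig}

    if __name__ == "__main__":
        # Hypothetical results for four submissions and three tests.
        example = {
            "s01": {"test_overflow", "test_empty_input"},
            "s02": {"test_overflow", "test_empty_input"},
            "s03": {"test_negative"},
            "s04": set(),
        }
        for signature, members in group_by_failure_signature(example).items():
            print(sorted(signature), "->", members)

In this sketch, s01 and s02 fall into one group because they fail the same tests, which is the kind of grouping an instructor could use to address a shared conceptual error once rather than per student.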
