Validation of Question Classification Using Support Vector Machine and Intraclass Correlation Coefficient Based on the Revised Bloom’s Taxonomy

Abstract

Assessment must be carried out accurately because it is central to identifying students' cognitive abilities. To do this, exam questions should be classified by difficulty level according to the Revised Bloom's Taxonomy, so that they verify students' understanding of what has been taught. The traditional manual classification carried out by educators is time-consuming and susceptible to subjective variability. In addition, questions classified into levels C1 to C6 of the Revised Bloom's Taxonomy are unevenly distributed across levels, which can lead to inaccurate classification results. Automatic classification using the Support Vector Machine (SVM) algorithm allows educators to classify questions quickly by difficulty level, but it must be validated to determine the extent to which the machine-assigned difficulty levels align with the perceptions of educators and students. This research validates the question classification produced by the SVM algorithm, supplemented by an oversampling technique to address data imbalance. The validation method used is the Intraclass Correlation Coefficient (ICC). Applying the SMOTE oversampling technique to handle class imbalance in the training data improves accuracy to 91%, compared with 83% without it. The suitability test comparing the SVM classification with assessments by educators and students indicates a high level of agreement. The ICC Average Measures values are as follows: SVM classification, 0.979; assessment by non-science subject educators, 0.956; assessment by science subject educators, 0.991; assessment by non-science subject students, 0.982; and assessment by science subject students, 0.984. ICC testing consistently yields excellent results in both non-science and science subjects, indicating that the assessments conducted by educators and students have a very high level of agreement.
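To make the "ICC Average Measures" figures above more concrete, the following is a minimal sketch of how an average-measures ICC can be computed from a table of ratings. It is not the authors' procedure: the abstract does not state which ICC model was used, so the two-way, consistency, average-measures form (often written ICC(3,k)) is an assumption here, as are the function name `icc_average_consistency`, the encoding of levels C1 to C6 as the integers 1 to 6, and the sample ratings, which are invented purely for illustration.

```python
def icc_average_consistency(ratings):
    """Average-measures ICC, two-way model, consistency definition (ICC(3,k)).

    ratings: list of rows, one row per question, one column per rater.
    Assumption: this ICC form is chosen for illustration only; the source
    does not say which form produced its reported values.
    """
    n = len(ratings)        # number of rated questions
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / ms_rows


# Hypothetical ratings: five questions, levels C1..C6 encoded as 1..6,
# columns are (classifier label, rater A, rater B). Invented data.
sample = [
    [1, 1, 2],
    [2, 2, 2],
    [4, 3, 4],
    [5, 5, 6],
    [6, 6, 6],
]
print(round(icc_average_consistency(sample), 3))
```

Under the consistency definition, a rater who scores every question one level higher than another rater still counts as agreeing fully; an absolute-agreement form would penalise that offset, so the choice of model affects the value obtained.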

This paper was published in Jurnal Teknik Informatika (JUTIF).

Licence: https://creativecommons.org/licenses/by/4.0