3 research outputs found
A Trustworthy Automated Short-Answer Scoring System Using a New Dataset and Hybrid Transfer Learning Method
To measure the quality of student learning, teachers must conduct evaluations. One of the most efficient modes of evaluation is the short-answer question. However, manual evaluation by teachers can be inconsistent due to excessive numbers of students, time demands, fatigue, and similar factors. Consequently, teachers require a trustworthy system capable of autonomously and accurately evaluating student answers. Using hybrid transfer learning and a student answer dataset, we aim to create a reliable automated short-answer scoring system called Hybrid Transfer Learning for Automated Short Answer Scoring (HTL-ASAS). HTL-ASAS combines multiple tokenizers from a pretrained model with bidirectional encoder representations from transformers (BERT). Based on our evaluation of the trained model, we determined that HTL-ASAS achieves higher accuracy than models used in previous studies. For a dataset of responses to questions from an introductory information technology course, HTL-ASAS reaches an accuracy of 99.6%. With an accuracy close to one hundred percent, the developed model can serve as the foundation for a trustworthy ASAS system.
Evaluating ChatGPT's Decimal Skills and Feedback Generation in a Digital Learning Game
While open-ended self-explanations have been shown to promote robust learning
in multiple studies, they pose significant challenges to automated grading and
feedback in technology-enhanced learning, due to the unconstrained nature of
the students' input. Our work investigates whether recent advances in Large
Language Models, and in particular ChatGPT, can address this issue. Using
decimal exercises and student data from a prior study of the learning game
Decimal Point, with more than 5,000 open-ended self-explanation responses, we
investigate ChatGPT's capability in (1) solving the in-game exercises, (2)
determining the correctness of students' answers, and (3) providing meaningful
feedback to incorrect answers. Our results show that ChatGPT responded well
to conceptual questions but struggled with decimal place values and number
line problems. In addition, it accurately assessed the correctness of
75% of the students' answers and generated generally high-quality feedback,
comparable to that of human instructors. We conclude with a discussion of
ChatGPT's strengths and weaknesses and suggest several avenues for extending its use cases
in digital teaching and learning.
Comment: Accepted as a research paper at the 18th European Conference on
Technology Enhanced Learning.
A Systematic Review on Reproducibility in Child-Robot Interaction
Research reproducibility - i.e., rerunning analyses on original data to
replicate the results - is paramount for guaranteeing scientific validity.
However, reproducibility is often very challenging, especially in research
fields where multi-disciplinary teams are involved, such as child-robot
interaction (CRI). This paper presents a systematic review of the last three
years (2020-2022) of research in CRI under the lens of reproducibility, by
analysing the field for transparency in reporting. Across a total of 325
studies, we found deficiencies in reporting demographics (e.g. age of
participants), study design and implementation (e.g. length of interactions),
and open data (e.g. maintaining an active code repository). From this analysis,
we distill a set of guidelines and provide a checklist for systematically
reporting CRI studies, to help guide researchers toward improved
reproducibility in CRI and beyond.