The role of machine learning in personalised instructional sequencing for language learning
The origins of personalised instructional sequencing can be traced back to Ancient Greece and to Alexander the Great's tutor, Aristotle. However, over the centuries the demand for education and the number of students have grown disproportionately faster than the number of teachers in training. There has therefore been a longstanding interest in finding ways to scale education without negatively affecting learning outcomes. This interest was fuelled further by the advent of computers and artificial intelligence, and a plethora of systems and models were built to bring technology-driven personalised instructional sequencing to the world. Unfortunately, results were far from groundbreaking and many challenges still remain.
In my thesis, I investigate three aspects of personalised instructional sequencing: the personalised instructional sequencing mechanism, the student knowledge representation, and human forgetting. While I do not cover the entirety of personalised instructional sequencing, I cover what I consider the foundational components. I link psychological theory to model selection and design in each of my systems and present experiments to illustrate their impact. I show how reinforcement learning can be used for vocabulary learning. I also present a model that uses neural collaborative filtering to learn student knowledge representations. Lastly, I present a state-of-the-art model to predict the probability of vocabulary word recall for students learning English as a second language. The system's novelty lies in its use of word complexity to adapt the forgetting curve, as well as its incorporation of psychological theory to select an appropriate model.
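As a rough illustration of the final idea, the sketch below shows a standard exponential forgetting curve whose memory strength is scaled down for more complex words; it is a minimal assumption-laden example, not the thesis model, and the complexity feature, scaling constant and function names are hypothetical.

    import math

    # Hedged sketch: exponential forgetting curve p(recall) = exp(-dt / strength),
    # where memory strength shrinks as word complexity grows. All constants and
    # the complexity feature are illustrative assumptions, not the thesis model.
    def recall_probability(delta_t_hours, base_strength, word_complexity, alpha=0.5):
        """Estimate the probability of recalling a word delta_t_hours after study.

        base_strength   -- memory strength from past reviews, in hours (assumed)
        word_complexity -- normalised score in [0, 1], e.g. from length and frequency (assumed)
        alpha           -- how strongly complexity shortens retention (assumed)
        """
        strength = base_strength * math.exp(-alpha * word_complexity)
        return math.exp(-delta_t_hours / max(strength, 1e-6))

    # Example: an easy word vs. a complex word, 24 hours after study.
    print(recall_probability(24, base_strength=48, word_complexity=0.1))
    print(recall_probability(24, base_strength=48, word_complexity=0.9))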
Enhancing Deep Neural Networks Testing by Traversing Data Manifold
We develop DEEPTRAVERSAL, a feedback-driven framework to test DNNs. DEEPTRAVERSAL first launches an offline phase to map media data of various forms to manifolds. Then, in its online testing phase, DEEPTRAVERSAL traverses the prepared manifold space to maximize DNN coverage criteria and trigger prediction errors. In our evaluation, DNNs executing various tasks (e.g., classification, self-driving, machine translation) and media data of different types (image, audio, text) were used. DEEPTRAVERSAL exhibits better performance than prior methods with respect to popular DNN coverage criteria, and it can discover a larger number of higher-quality error-triggering inputs. The tested DNN models, after being repaired with findings of DEEPTRAVERSAL, achieve better accuracy.
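To make the offline/online split concrete, here is a minimal toy sketch of a feedback-driven manifold-traversal testing loop; it is not DEEPTRAVERSAL itself, and the projection, the coverage criterion and the model under test (coverage_score, model_predict) are stand-in assumptions.

    import numpy as np

    # Offline: project seed inputs onto a toy low-dimensional "manifold".
    # Online: perturb manifold coordinates, decode, and keep moves that add
    # coverage; record inputs that flip the model's prediction.
    rng = np.random.default_rng(0)

    def encode(x, basis):          # project input onto manifold coordinates
        return basis @ x

    def decode(z, basis):          # map manifold coordinates back to input space
        return basis.T @ z

    def model_predict(x):          # placeholder DNN under test (assumption)
        return int(x.sum() > 0)

    def coverage_score(x, seen):   # placeholder coverage criterion (assumption)
        key = tuple(np.sign(x[:4]))
        return (0 if key in seen else 1), seen | {key}

    def traverse(seed, basis, steps=200, step_size=0.1):
        seen, errors = set(), []
        ref_label = model_predict(seed)
        z = encode(seed, basis)
        for _ in range(steps):
            z_new = z + step_size * rng.normal(size=z.shape)
            x_new = decode(z_new, basis)
            gain, seen = coverage_score(x_new, seen)
            if model_predict(x_new) != ref_label:
                errors.append(x_new)       # error-triggering input found
            if gain:                       # feedback: keep moves that add coverage
                z = z_new
        return errors

    basis = rng.normal(size=(8, 32))       # toy 8-D manifold for 32-D inputs
    print(len(traverse(rng.normal(size=32), basis)))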
Multilingual Grammatical Error Detection And Its Applications to Prompt-Based Correction
Grammatical Error Correction (GEC) and Grammatical Error Detection (GED) are two important tasks in the study of writing-assistant technologies. Given an input sentence, the former aims to output a corrected version of the sentence, while the latter's goal is to indicate which words of the sentence contain errors. Both tasks are relevant for real-world applications that help native speakers and language learners to write better. Naturally, these two areas have attracted the attention of the research community and have been studied in the context of modern neural networks. This work focuses on the study of multilingual GED models and how they can be used to improve GEC performed by large language models (LLMs).
We study the difference in performance between GED models trained in a single language and models that undergo multilingual training. We expand the list of datasets used for multilingual GED to further experiment with cross-dataset and cross-lingual generalization of detection models. Our results go against previous findings and indicate that multilingual GED models are as good as monolingual ones when evaluated in the in-domain languages. Furthermore, multilingual models show better generalization to novel languages seen only at test time.
Making use of the GED models we study, we propose two methods to improve corrections of prompt-based GEC using LLMs. The first method aims to mitigate overcorrection by using a detection model to determine whether a sentence has any mistakes before feeding it to the LLM. The second method uses the sequence of GED tags to select the in-context examples provided in the prompt. We perform experiments in English, Czech, German and Russian, using Llama2 and GPT3.5. The results show that both methods increase the performance of prompt-based GEC and point to a promising direction of using GED models as part of the correction pipeline performed by LLMs.
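A minimal sketch of how the two strategies could be wired together is shown below; the GED model and the LLM call (ged_tags, llm_correct) are toy placeholders and the example-selection heuristic is an illustrative assumption, not the authors' implementation.

    # Method 1: only call the LLM when detection finds an error (less overcorrection).
    # Method 2: pick in-context examples whose GED tag sequences resemble the input's.
    def ged_tags(sentence):
        """Placeholder GED model: one correct ('c') / incorrect ('i') tag per token."""
        return ["c" if w.islower() else "i" for w in sentence.split()]  # toy rule

    def llm_correct(prompt):
        """Placeholder LLM call (e.g. Llama2 or GPT3.5 behind an API)."""
        return prompt.splitlines()[-1]  # toy: echo the final line

    def select_examples(tags, example_bank, k=2):
        def overlap(a, b):
            return sum(x == y for x, y in zip(a, b)) / max(len(a), len(b))
        ranked = sorted(example_bank, key=lambda ex: overlap(tags, ex["tags"]), reverse=True)
        return ranked[:k]

    def correct(sentence, example_bank):
        tags = ged_tags(sentence)
        if all(t == "c" for t in tags):
            return sentence                # skip the LLM: nothing to correct
        examples = select_examples(tags, example_bank)
        prompt = "\n".join(f"{ex['src']} -> {ex['tgt']}" for ex in examples) + "\n" + sentence
        return llm_correct(prompt)

    bank = [{"src": "He go home", "tgt": "He goes home", "tags": ["c", "i", "c"]}]
    print(correct("She walk fast", bank))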
Modelling text meta-properties in automated text scoring for non-native English writing
Automated text scoring (ATS) is the task of automatically scoring a text against given grading criteria. This thesis focuses on ATS in the context of free-text writing exams aimed at learners of English as a foreign language (EFL). The primary benefit of an ATS system is to provide instant and consistent feedback to language learners, and service reliability also forms a crucial part of such a system. Building on previous work, we investigated text meta-properties that have so far been only partially explored and integrated them into a machine-learning-based ATS model across multiple datasets:
In most previous work, the proposed models implicitly assume that the texts produced by learners in an exam are written independently. However, this is not true for exams in which learners are required to compose multiple texts. We therefore explicitly informed our model which texts were written by the same learner, which boosts model performance in most cases.
We used three intra-exam properties, namely prompt, genre and task, as a starting point, and showed that explicitly modelling these properties via frustratingly easy domain adaptation (FEDA; see the sketch after this list) can positively affect model performance in some cases. Furthermore, modelling multiple intra-exam properties together is better than modelling any single property individually, or no property at all, in four out of five test sets.
We studied how to utilise and combine learners' responses from multiple writing exams, and proposed a new variant of the transfer-learning ATS model which mitigates the drawbacks of previous work. This variant first builds a ranking model across multiple datasets via FEDA, and the ranking score that this model predicts for each text is used as an extra feature in the baseline model. This variant gives an improvement over the baseline model on the development sets in terms of root-mean-square error. Furthermore, the transfer-learning model utilising multiple datasets, tuned on each development set, is always better than the baseline model on the corresponding test set.
We found that different datasets favour different meta-properties. We therefore combined all the models looking at different meta-properties using ensemble learning. Compared to the baseline model, the combined model shows a statistically significant improvement on all the test sets in terms of root-mean-square error, based on a permutation test.
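For readers unfamiliar with FEDA, the sketch below shows the standard feature-augmentation trick it relies on, applied to a meta-property such as genre: each feature vector is copied into a shared block plus one block per property value. The feature values and dimensions are illustrative assumptions, not the thesis features.

    import numpy as np

    def feda_augment(x, value, values):
        """Return [shared copy | one block per property value, zero except `value`]."""
        blocks = [x]                                   # shared block
        for v in values:
            blocks.append(x if v == value else np.zeros_like(x))
        return np.concatenate(blocks)

    x = np.array([0.3, 1.2, 0.0])                      # toy text features (assumed)
    genres = ["letter", "essay", "email"]
    print(feda_augment(x, "essay", genres))            # length = len(x) * (1 + len(genres))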