Multi-Factors Aware Dual-Attentional Knowledge Tracing
With the increasing demand for personalized learning, knowledge tracing, which
traces students' knowledge states based on their historical practice, has become
important. Factor analysis methods mainly use two kinds of factors
which are separately related to students and questions to model students'
knowledge states. These methods use the total number of attempts of students to
model students' learning progress and hardly highlight the impact of the most
recent relevant practices. Besides, current factor analysis methods ignore rich
information contained in questions. In this paper, we propose Multi-Factors
Aware Dual-Attentional model (MF-DAKT) which enriches question representations
and utilizes multiple factors to model students' learning progress based on a
dual-attentional mechanism. More specifically, we propose a novel
student-related factor which records students' most recent attempts on relevant
concepts to highlight the impact of recent exercises. To enrich
question representations, we use a pre-training method to incorporate two
kinds of question information: the relations between questions and their
difficulty level. We also add a regularization term on question difficulty to
constrain the pre-trained question representations while they are fine-tuned
for predicting students' performance. Moreover, we apply a dual-attentional
mechanism to differentiate contributions of factors and factor interactions to
the final prediction in different practice records. Finally, we conduct experiments
on several real-world datasets, and the results show that MF-DAKT outperforms
existing knowledge tracing methods. We also conduct several studies to validate
the effect of each component of MF-DAKT.
Comment: Accepted by CIKM 2021, 10 pages, 10 figures, 6 tables
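The contrast the abstract draws between a total attempt count and a factor based on recent attempts can be illustrated with a small sketch. This is not MF-DAKT's actual factor definition; the function name `attempt_factors` and the `window` parameter are hypothetical, and "recent" is assumed here to mean "within the student's last `window` practice records".

```python
from collections import defaultdict

def attempt_factors(records, window=3):
    """Return (total_attempts, recent_attempts) for each practice record.

    Both counts refer to the record's concept and cover only the practice
    that happened *before* that record.

    records: list of (student_id, concept_id) tuples in chronological order.
    window:  hypothetical number of latest records treated as "recent".
    """
    history = defaultdict(list)  # student_id -> concept_ids practised so far
    factors = []
    for student, concept in records:
        past = history[student]
        total = past.count(concept)
        # Slice defensively so that window=0 yields an empty "recent" list.
        recent = past[max(0, len(past) - window):].count(concept)
        factors.append((total, recent))
        past.append(concept)
    return factors

# Toy log (made up for illustration): fractions twice, decimals three times,
# then fractions again.
log = [("s1", "fractions"), ("s1", "fractions"), ("s1", "decimals"),
       ("s1", "decimals"), ("s1", "decimals"), ("s1", "fractions")]
factors = attempt_factors(log, window=3)
print(factors[-1])  # total count is 2, but no fractions practice in the last 3 records
```

In the last record the total count still credits both earlier attempts on fractions, while the windowed count is zero, which is the kind of difference a recency-aware factor is meant to expose.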
Learning from Sequential User Data: Models and Sample-efficient Algorithms
Recent advances in deep learning have made it possible to learn representations from ever-growing datasets in domains such as vision, natural language processing (NLP), and robotics. However, deep networks are notoriously data-hungry; for example, training language models with attention mechanisms sometimes requires trillions of parameters and tokens. In contrast, in many tasks we can access only a limited number of samples, so it is crucial to learn models from these 'limited' datasets. Learning with limited datasets can take several forms. In this thesis, we study how to select data samples sequentially such that downstream task performance is maximized. Moreover, we study how to introduce prior knowledge into deep networks to maximize prediction performance. We focus on four sequential tasks: computerized adaptive testing in psychometrics, sketching in recommender systems, knowledge tracing in computer-assisted education, and career path modeling in the labor market.
In the first two tasks, we devise novel sample-efficient algorithms that query a minimal number of sequential samples to improve future predictions. We propose a bilevel optimization-based framework for computerized adaptive testing that learns a data-driven question selection algorithm and improves on existing data selection policies. We also tackle the sketching problem in recommender systems, where the task is to recommend the next item using a stored subset of prior data samples. In this setting, we develop a data-driven sequential selection algorithm that handles an evolving downstream task distribution. In the last two tasks, we devise novel neural models that introduce prior knowledge to exploit limited data samples. For knowledge tracing, we propose a novel neural architecture, inspired by cognitive and psychometric models, to improve the prediction of students' future performance and to use the labeled data samples efficiently. For career path modeling, we propose a novel and interpretable monotonic nonlinear state-space model to analyze online user professional profiles and provide actionable feedback and recommendations to users on how they can reach their career goals.
The data-driven differentiable data selection algorithms for the first two tasks open up future directions for optimally querying (a non-differentiable operation) a minimal number of samples to maximize prediction performance. The structures introduced, using prior knowledge, into the neural architectures for the last two tasks open up future directions for learning deep models augmented with prior knowledge from limited data samples.
Incorporating Rich Features into Deep Knowledge Tracing
The desire to follow student learning within intelligent tutoring systems in near real time has led to the development of several models anticipating the correctness of the next item as students work through an assignment. Such models have included Bayesian Knowledge Tracing (BKT), Performance Factors Analysis (PFA), and more recently with developments in Deep Learning, Deep Knowledge Tracing (DKT). The DKT model, based on the use of a recurrent neural network, exhibited promising results in paper [PBH+15]. Thus far, however, the model has only considered the knowledge components of the problems and correctness as input, neglecting the breadth of other features collected by computer-based learning platforms. This work seeks to improve upon the DKT model by incorporating more features at the problem-level and student-level. With this higher dimensional input, an adaptation to the original DKT model structure is also proposed, incorporating an Autoencoder network layer to convert the input into a low dimensional feature vector to reduce both the resource requirement and time needed to train. Experimental results show that our adapted DKT model, which includes more combinations of features, can effectively improve accuracy.
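The input described above, knowledge component plus correctness, is commonly fed to a DKT-style network as a one-hot vector over skill/correctness pairs. The sketch below shows that encoding only, under assumptions: the helper names and the "first half incorrect, second half correct" layout are illustrative choices, not taken from the thesis.

```python
def encode_interaction(skill_id, correct, num_skills):
    """One-hot encode a (skill, correctness) pair as a vector of length 2 * num_skills."""
    if not 0 <= skill_id < num_skills:
        raise ValueError("skill_id out of range")
    vec = [0] * (2 * num_skills)
    # Assumed layout: positions [0, num_skills) mean "incorrect",
    # positions [num_skills, 2 * num_skills) mean "correct".
    index = skill_id + (num_skills if correct else 0)
    vec[index] = 1
    return vec

def encode_sequence(interactions, num_skills):
    """Encode a chronological list of (skill_id, correct) pairs."""
    return [encode_interaction(s, c, num_skills) for s, c in interactions]

# Two interactions over three skills: skill 0 answered correctly,
# then skill 2 answered incorrectly.
seq = encode_sequence([(0, True), (2, False)], num_skills=3)
for row in seq:
    print(row)
```

Each extra problem-level or student-level feature would lengthen this vector, which is why the abstract pairs the richer input with an autoencoder that compresses it before the recurrent layer.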
Student Modeling within a Computer Tutor for Mathematics: Using Bayesian Networks and Tabling Methods
Intelligent tutoring systems rely on student modeling to understand student behavior. The result of student modeling can provide assessment of student knowledge, estimation of a student's current affective state (i.e., boredom, confusion, concentration, frustration, etc.), prediction of student performance, and suggestion of the next tutoring steps. There are three focuses of this dissertation. The first focus is on better predicting student performance by adding more information, such as student identity and information about how much assistance students needed. The second focus is to analyze different performance and feature sets for modeling student short-term knowledge and longer-term knowledge. The third focus is on improving the affect detectors by adding more features. In this dissertation I make contributions to the field of data mining as well as educational research. I demonstrate novel Bayesian networks for student modeling and compare them with each other. This work contributes to educational research by broadening the task of analyzing student knowledge to student knowledge retention, which is a much more important and interesting question for researchers to look at. Additionally, I show a set of new useful features as well as how to use these features effectively in real models. For instance, in Chapter 5, I show that the number of different days a student has worked on a skill is a more predictive feature for knowledge retention. These features themselves are not a contribution to data mining so much as they are to education research more broadly, where they can be used by other educational researchers or tutoring systems.
The role of machine learning in personalised instructional sequencing for language learning
The origins of personalised instructional sequencing can be dated back to Ancient Greece and the time of Aristotle, tutor to Alexander the Great. However, over the centuries the demand for education and the growth in student numbers have been disproportionately greater than the number of teachers in training. Therefore, there has been a longstanding interest in finding a way to scale education without negatively affecting learning outcomes. This interest was fuelled further by the advent of computers and artificial intelligence, when a plethora of systems and models were built to bring technology-driven personalised instructional sequencing to the world. Unfortunately, results were far from groundbreaking and many challenges still remain.
In my thesis, I investigate three aspects of personalised instructional sequencing: the personalised instructional sequencing mechanism, the student knowledge representation, and human forgetting. While I do not cover the entirety of personalised instructional sequencing, I cover what I consider the foundational components. I link psychological theory to model selection and design in each of my systems and present experiments to illustrate their impact. I show how reinforcement learning can be used for vocabulary learning. I also present a model that uses neural collaborative filtering to learn student knowledge representations. Lastly, I present a state-of-the-art model to predict the probability of vocabulary word recall for students learning English as a second language. The system's novelty lies in the use of word complexity to adapt the forgetting curve as well as its incorporation of psychological theory to select an appropriate model.
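The idea of adapting a forgetting curve by word complexity can be sketched with a plain exponential curve. This is only an illustration under stated assumptions: the thesis's actual model is not given in the abstract, and both the half-life parameterisation and the `adjusted_half_life` rule (complexity in [0, 1] shortening the half-life) are hypothetical.

```python
def recall_probability(days_since_review, half_life_days):
    """Exponential forgetting curve: recall probability halves every half-life."""
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    return 2.0 ** (-days_since_review / half_life_days)

def adjusted_half_life(base_half_life_days, word_complexity):
    """Hypothetical rule: a complexity score in [0, 1] shortens the half-life."""
    if not 0.0 <= word_complexity <= 1.0:
        raise ValueError("word_complexity must be in [0, 1]")
    return base_half_life_days / (1.0 + word_complexity)

# Same learner and same delay, but a simple word versus a complex word.
simple = recall_probability(4, adjusted_half_life(4.0, 0.0))
complex_ = recall_probability(4, adjusted_half_life(4.0, 1.0))
print(simple, complex_)
```

Under this assumed rule the complex word is forgotten faster for the same delay; a fitted model would learn the strength and shape of that dependence from data rather than fixing it by hand.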
Deep Reinforcement Learning Approaches for Technology Enhanced Learning
Artificial Intelligence (AI) has advanced significantly in recent years, transforming various industries and domains. Its ability to extract patterns and insights from large volumes of data has revolutionised areas such as image recognition, natural language processing, and autonomous systems. As AI systems become increasingly integrated into daily human life, there is a growing need for meaningful collaboration and mutual engagement between humans and AI, known as Human-AI Collaboration. This collaboration involves combining AI with human workflows to achieve shared objectives.
In the current educational landscape, the integration of AI methods in Technology Enhanced Learning (TEL) has become crucial for providing high-quality education and facilitating lifelong learning. Human-AI Collaboration also plays a vital role in TEL, particularly in Intelligent Tutoring Systems (ITS). The COVID-19 pandemic has further emphasised the need for effective educational technologies to support remote learning and bridge the gap between traditional classrooms and online platforms. To maximise the performance of ITS while minimising the input and interaction required from students, it is essential to design collaborative systems that leverage the capabilities of AI and foster effective collaboration between students and ITS.
However, there are several challenges that need to be addressed in this context. One challenge is the lack of clear guidance on designing and building user-friendly systems that facilitate collaboration between humans and AI. This challenge is relevant not only to education researchers but also to Human-Computer Interaction (HCI) researchers and developers. Another challenge is the scarcity of interaction data in the early stages of ITS development, which hampers the accurate modelling of students' knowledge states and learning trajectories, known as the cold start problem. Moreover, the effectiveness of Intelligent Tutoring Systems (ITS) in delivering personalised instruction is hindered by the limitations of existing Knowledge Tracing (KT) models, which often struggle to provide accurate predictions. Therefore, addressing these challenges is crucial for enhancing the collaborative process between humans and AI in the development of ITS.
This thesis aims to address these challenges and improve the collaborative process between students and ITS in TEL. It proposes innovative approaches to generate simulated student behavioural data and enhance the performance of KT models. The thesis starts with a comprehensive survey of human-AI collaborative systems, identifying key challenges and opportunities. It then presents a structured framework for the student-ITS collaborative process, providing insights into designing user-friendly and efficient systems.
To overcome the challenge of data scarcity in ITS development, the thesis proposes two student modelling approaches: Sim-GAIL and SimStu. SimStu leverages a deep learning method, the Decision Transformer, to simulate student interactions and enhance ITS training. Sim-GAIL utilises a reinforcement learning method, Generative Adversarial Imitation Learning (GAIL), to generate high-fidelity and diverse simulated student behavioural data, addressing the cold start problem in ITS training.
Furthermore, the thesis focuses on improving the performance of KT models. It introduces the MLFBKT model, which integrates multiple features and mines latent relations in student interaction data, aiming to improve the accuracy and efficiency of KT models. Additionally, the thesis proposes the LBKT model, which combines the strengths of the BERT model and LSTM to process long sequence data in KT models effectively.
Overall, this thesis contributes to the field of Human-AI collaboration in TEL by addressing key challenges and proposing innovative approaches to enhance ITS training and KT model performance. The findings have the potential to improve the learning experiences and outcomes of students in educational settings.
THE ROLE OF SIMULATION IN SUPPORTING LONGER-TERM LEARNING AND MENTORING WITH TECHNOLOGY
Mentoring is an important part of professional development and longer-term learning. The nature of longer-term mentoring contexts means that designing, developing, and testing adaptive learning systems for use in this kind of context would be very costly as it would require substantial amounts of financial, human, and time resources. Simulation is a cheaper and quicker approach for evaluating the impact of various design and development decisions. Within the Artificial Intelligence in Education (AIED) research community, however, surprisingly little attention has been paid to how to design, develop, and use simulations in longer-term learning contexts. The central challenge is that adaptive learning system designers and educational practitioners have limited guidance on what steps to consider when designing simulations for supporting longer-term mentoring system design and development decisions.
My research work takes as a starting point VanLehn et al.'s [1] introduction to applications of simulated students and Erickson et al.'s [2] suggested approach to creating simulated learning environments. My dissertation presents four research directions using a real-world longer-term mentoring context, a doctoral program, for illustrative purposes. The first direction outlines a framework for guiding system designers as to what factors to consider when building pedagogical simulations, fundamentally to answer the question: how can a system designer capture a representation of a target learning context in a pedagogical simulation model? To illustrate the feasibility of this framework, this dissertation describes how to build the SimDoc model, a pedagogical model of a longer-term mentoring learning environment: a doctoral program. The second direction builds on the first, and considers the issue of model fidelity, essentially to answer the question: how can a system designer determine a simulation model's fidelity to the desired granularity level? This dissertation shows how data from a target learning environment, the research literature, and common sense are combined to achieve SimDoc's medium fidelity model. The third research direction explores calibration and validation issues to answer the question: how many simulation runs does it take for a practitioner to have confidence in the simulation model's output? This dissertation describes the steps taken to calibrate and validate the SimDoc model, so its output statistically matches data from the target doctoral program, the one at the University of Saskatchewan. The fourth direction is to demonstrate the applicability of the resulting pedagogical model. This dissertation presents two experiments using SimDoc to illustrate how to explore pedagogical questions concerning personalization strategies and to determine the effectiveness of different mentoring strategies in a target learning context.
Overall, this dissertation shows that simulation is an important tool in the AIED system designers' toolkit as AIED moves towards designing, building, and evaluating AIED systems meant to support learners in longer-term learning and mentoring contexts. Simulation allows a system designer to experiment with various design and implementation decisions in a cost-effective and timely manner before committing to these decisions in the real world.
Towards Personalized Learning using Counterfactual Inference for Randomized Controlled Trials
Personalized learning considers that the causal effects of a studied learning intervention may differ for the individual student (e.g., maybe girls do better with video hints while boys do better with text hints). To evaluate a learning intervention inside ASSISTments, we run a randomized controlled trial (RCT) by randomly assigning students to either a control condition or a treatment condition. Making inferences about the causal effects of studied interventions is a central problem. Counterfactual inference answers "What if" questions, such as "Would this particular student benefit more if the student were given the video hint instead of the text hint when the student cannot solve a problem?". Counterfactual prediction provides a way to estimate individual treatment effects and helps us to assign students to the learning intervention that leads to better learning. A variant of Michael Jordan's Residual Transfer Networks was proposed for the counterfactual inference. The model first uses feed-forward neural networks to learn a balancing representation of students by minimizing the distance between the distributions of the control and the treated populations, and then adopts a residual block to estimate the individual treatment effect. Students in the RCT have usually done a number of problems prior to participating in it, so each student has a sequence of actions (a performance sequence). We proposed a pipeline that uses the performance sequence to improve the performance of counterfactual inference. Since deep learning has achieved a huge amount of success in learning representations from raw logged data, student representations were learned by applying a sequence autoencoder to the performance sequences. These representations were then incorporated into the model for counterfactual inference. Empirical results showed that the representations learned from the sequence autoencoder improved the performance of counterfactual inference.
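The motivation above, that an intervention's effect can differ between students even when the overall effect looks flat, can be shown with a difference-in-means sketch on RCT data. This is a deliberately simple stand-in, not the residual network model the abstract describes; the field names (`treated`, `outcome`, `group`) and the toy rows are made up for illustration.

```python
from statistics import mean

def average_treatment_effect(rows):
    """Difference in mean outcome between treated and control rows of an RCT."""
    treated = [r["outcome"] for r in rows if r["treated"]]
    control = [r["outcome"] for r in rows if not r["treated"]]
    return mean(treated) - mean(control)

def effect_by_group(rows, key):
    """Estimate the treatment effect separately for each value of rows[key]."""
    groups = {}
    for r in rows:
        groups.setdefault(r[key], []).append(r)
    return {g: average_treatment_effect(rs) for g, rs in groups.items()}

# Toy data (fabricated for illustration): outcome 1 = next problem correct.
rows = [
    {"group": "A", "treated": True,  "outcome": 1},
    {"group": "A", "treated": True,  "outcome": 1},
    {"group": "A", "treated": False, "outcome": 0},
    {"group": "A", "treated": False, "outcome": 1},
    {"group": "B", "treated": True,  "outcome": 0},
    {"group": "B", "treated": True,  "outcome": 1},
    {"group": "B", "treated": False, "outcome": 1},
    {"group": "B", "treated": False, "outcome": 1},
]
overall = average_treatment_effect(rows)
per_group = effect_by_group(rows, "group")
print(overall, per_group)
```

In this toy data the overall effect is zero while the two groups show opposite effects. Subgroup averages like these only work for a few coarse, predefined groups, which is one reason to learn student representations and estimate effects per individual instead.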
Creating Systems and Applying Large-Scale Methods to Improve Student Remediation in Online Tutoring Systems in Real-time and at Scale
A common problem shared amongst online tutoring systems is the time-consuming nature of content creation. It has been estimated that an hour of online instruction can take 100 to 300 hours to create. Several systems have created tools to expedite content creation, such as the Cognitive Tutors Authoring Tool (CTAT) and the ASSISTments builder. Although these tools make content creation more efficient, they all still depend on the efforts of a content creator and/or past historical data. These tools do not take full advantage of the power of the crowd. These issues and challenges faced by online tutoring systems provide an ideal environment to implement a solution using crowdsourcing. I created the PeerASSIST system to provide a solution to the challenges faced with tutoring content creation. PeerASSIST crowdsources the work students have done on problems inside the ASSISTments online tutoring system and redistributes that work as a form of tutoring to their peers who are in need of assistance. Multi-objective multi-armed bandit algorithms are used to distribute student work, balancing exploring which work is good against exploiting the best currently known work. These policies are customized to run in a real-world environment with multiple asynchronous reward functions and an infinite number of actions. Inspired by major companies such as Google, Facebook, and Bing, PeerASSIST is also designed as a platform for simultaneous online experimentation in real-time and at scale. Currently over 600 teachers (grades K-12) are requiring students to show their work. Over 300,000 instances of student work have been collected from over 18,000 students across 28,000 problems. From the student work collected, 2,000 instances have been redistributed to over 550 students who needed help over the past few months. I conducted a randomized controlled experiment to evaluate the effectiveness of PeerASSIST on student performance.
Other contributions include representing learning maps as Bayesian networks to model student performance, creating a machine-learning algorithm to derive students' incorrect processes from their incorrect answers and the inputs of the problem, and applying Bayesian hypothesis testing to A/B experiments. We showed that learning maps can be simplified without practical loss of accuracy and that time series data is necessary to simplify learning maps if the static data is highly correlated. I also created several interventions to evaluate the effectiveness of the buggy messages generated from the machine-learned incorrect processes. The null results of these experiments demonstrate the difficulty of creating successful tutoring and suggest that other methods of tutoring content creation (i.e. PeerASSIST) should be explored.
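The explore/exploit balance mentioned in this abstract can be sketched with a basic epsilon-greedy bandit. This is a swapped-in, much simpler technique than the multi-objective, asynchronous-reward policies the dissertation describes: it assumes a single scalar reward, a fixed finite set of arms (candidate pieces of student work), and immediate feedback. The class name and parameters are hypothetical.

```python
import random

class EpsilonGreedy:
    """Single-objective epsilon-greedy bandit over a fixed number of arms."""

    def __init__(self, num_arms, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.counts = [0] * num_arms    # times each arm was shown
        self.values = [0.0] * num_arms  # running mean reward per arm
        self.rng = random.Random(seed)

    def select(self):
        # Explore: with probability epsilon, pick an arm uniformly at random.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        # Exploit: otherwise pick the arm with the best mean reward so far.
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        # Incremental update of the running mean for the chosen arm.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

# Three candidate pieces of student work; reward 1.0 if the helped student
# then answered correctly (toy feedback, made up for illustration).
bandit = EpsilonGreedy(num_arms=3, epsilon=0.0)
bandit.update(0, 0.0)
bandit.update(1, 1.0)
bandit.update(1, 0.0)
best = bandit.select()
print(best, bandit.values)
```

With `epsilon=0.0` the policy is purely greedy, which makes the example deterministic; a deployed policy would keep some exploration so that newly submitted work still gets shown.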