MathBERT: A Pre-trained Language Model for General NLP Tasks in Mathematics Education
Since the introduction of the original BERT (i.e., BASE BERT), researchers
have developed various customized BERT models with improved performance for
specific domains and tasks by exploiting the benefits of transfer learning. Due
to the nature of mathematical texts, which often use domain specific vocabulary
along with equations and math symbols, we posit that the development of a new
BERT model for mathematics would be useful for many mathematical downstream
tasks. In this resource paper, we introduce our multi-institutional effort
(i.e., two learning platforms and three academic institutions in the US) toward
this need: MathBERT, a model created by pre-training the BASE BERT model on a
large mathematical corpus ranging from pre-kindergarten (pre-k), to
high-school, to college graduate level mathematical content. In addition, we
select three general NLP tasks that are often used in mathematics education:
prediction of knowledge component, auto-grading open-ended Q&A, and knowledge
tracing, to demonstrate the superiority of MathBERT over BASE BERT. Our
experiments show that MathBERT outperforms prior best methods by 1.2-22% and
BASE BERT by 2-8% on these tasks. In addition, we build a mathematics specific
vocabulary 'mathVocab' to train with MathBERT. We discover that MathBERT
pre-trained with 'mathVocab' outperforms MathBERT trained with the BASE BERT
vocabulary (i.e., 'origVocab'). MathBERT is currently being adopted at the
participating learning platforms: Stride, Inc, a commercial educational resource
provider, and ASSISTments.org, a free online educational platform. We release
MathBERT for public usage at: https://github.com/tbs17/MathBERT. Comment: Accepted by NeurIPS 2021 MATHAI4ED Workshop (Best Paper
Integrating Personalized Learning into Online Education through Content Aggregation, Data Mining, and Reinforcement Learning
Personalized learning stems from the idea that students benefit from instructional material tailored to their needs. While the concept of giving each student the content that helps them learn the most is straightforward, implementing this at scale requires overcoming a gauntlet of challenges. One must aggregate a breadth of content such that enough variety exists to support each student’s specific preferences, calculate quantifiable aspects of students’ behaviors and traits that correlate with which content is most effective for them, design metrics that accurately measure learning, and create algorithms that can learn the relationships between students’ features and the effects of different content on their learning across thousands of students in real time. This dissertation discusses different approaches for collecting, interpreting, and recommending instructional content to students, with a focus on learning interpretable insights that can inform educational pedagogy outside of online learning as well as within it. Ultimately, we designed a content recommendation algorithm that performed as well as or better than similar existing algorithms while also allowing for unbiased statistical analysis of the data.
Dynamic Tuning MQP
Cyther V3 aims to improve on Cyther V2, a mechatronic string instrument equipped with ten strings and a set of solenoids to actuate the strings. The goal of this project was to create a next-generation Cyther equipped with a system that can autonomously tune each string during a performance, expanding on the types of musical expression Cyther V2 was capable of. The tuning system senses string tension, estimates pitch, adjusts the tension, and corrects for errors in estimation using optical pickups. The accuracy and speed of both the frequency analysis and the tuning were analyzed for a single string to determine the quality of the new autonomous tuning system...
Identifying Struggling Students by Comparing Online Tutor Clickstreams
New ways to identify students in need of assistance are imperative to the evolution of online tutoring platforms. Currently implemented models to identify struggling students use costly and tedious classroom observation paired with students' platform usage, and are often suitable for only a subset of students. With the recent influx of new students to online tutoring platforms due to COVID-19, a simple method to quickly identify struggling students could help facilitate effective remote learning. To this end, we created an anomaly detection algorithm that models the normal behavior of students during remote learning and recognizes when students deviate from this behavior. We demonstrated how anomalous behavior not only revealed which students needed additional assistance, but also helped predict student learning outcomes and reduced the confidence intervals in research experiments performed within the online tutoring platform.
Investigating the Impact of Skill-Related Videos on Online Learning Data and Code
Data and code for the paper "Investigating the Impact of Skill-Related Videos on Online Learning"
A Bandit you can Trust Data and Code
Data and code for the paper "A Bandit you can Trust"
Cyther: a human-playable, self-tuning robotic zither
Human-robot musical interaction typically consists of independent, physically separated agents. We developed Cyther (a human-playable, self-tuning robotic zither) to allow a human and a robot to interact cooperatively through the same physical medium to generate music. The resultant co-dependence creates new responsibilities, roles, and expressive possibilities for human musicians. We describe some of these possibilities in the context of both technical features and artistic implementations of the system.
Comparing Different Approaches to Generating Mathematics Explanations Using Large Language Models Data and Code
A dataset of tutoring chat logs, and the data and code for the paper "Comparing Different Approaches to Generating Mathematics Explanations Using Large Language Models"
Effective Evaluation of Online Learning Interventions with Surrogate Measures Data and Code
This project contains all the data and code used to get the results in the paper "Identifying Effective Proximal Surrogate Measures of Learning within Online Educational Experiments"