13 research outputs found

    MathBERT: A Pre-trained Language Model for General NLP Tasks in Mathematics Education

    Since the introduction of the original BERT (i.e., BASE BERT), researchers have developed various customized BERT models with improved performance for specific domains and tasks by exploiting the benefits of transfer learning. Due to the nature of mathematical texts, which often use domain-specific vocabulary along with equations and math symbols, we posit that the development of a new BERT model for mathematics would be useful for many mathematical downstream tasks. In this resource paper, we introduce our multi-institutional effort (i.e., two learning platforms and three academic institutions in the US) toward this need: MathBERT, a model created by pre-training the BASE BERT model on a large mathematical corpus ranging from pre-kindergarten (pre-k), to high school, to college graduate-level mathematical content. In addition, we select three general NLP tasks that are often used in mathematics education (prediction of knowledge component, auto-grading of open-ended Q&A, and knowledge tracing) to demonstrate the superiority of MathBERT over BASE BERT. Our experiments show that MathBERT outperforms prior best methods by 1.2-22% and BASE BERT by 2-8% on these tasks. In addition, we build a mathematics-specific vocabulary, 'mathVocab', to train with MathBERT. We discover that MathBERT pre-trained with 'mathVocab' outperforms MathBERT trained with the BASE BERT vocabulary (i.e., 'origVocab'). MathBERT is currently being adopted at the participating learning platforms: Stride, Inc., a commercial educational resource provider, and ASSISTments.org, a free online educational platform. We release MathBERT for public usage at: https://github.com/tbs17/MathBERT. Comment: Accepted by NeurIPS 2021 MATHAI4ED Workshop (Best Paper)

    Integrating Personalized Learning into Online Education through Content Aggregation, Data Mining, and Reinforcement Learning

    Personalized learning stems from the idea that students benefit from instructional material tailored to their needs. While the concept of giving each student the content that helps them learn the most is straightforward, implementing this at scale requires overcoming a gauntlet of challenges. One must aggregate a breadth of content such that enough variety exists to support each student's specific preferences, calculate quantifiable aspects of students' behaviors and traits that correlate with which content is most effective for them, design metrics that accurately measure learning, and create algorithms that can learn the relationships between students' features and the effects of different content on their learning across thousands of students in real time. This dissertation discusses different approaches for collecting, interpreting, and recommending instructional content to students, with a focus on learning interpretable insights that can inform educational pedagogy outside of online learning as well as within it. Ultimately, we designed a content recommendation algorithm that performed as well as or better than similar existing algorithms while also allowing for unbiased statistical analysis of the data.

    Dynamic Tuning MQP

    Cyther V3 looks to improve Cyther V2, a mechatronic string instrument equipped with ten strings and a set of solenoids to actuate the strings. The goal of this project was to create a next-generation Cyther equipped with a system that can autonomously tune each string during a performance, expanding on the types of musical expression Cyther V2 was capable of. The tuning system senses string tension, estimates pitch, adjusts the tension, and corrects for errors in estimation using optical pickups. The accuracy and speed of both the frequency analysis and the tuning were analyzed for a single string to determine the quality of the new autonomous tuning system...

    Identifying Struggling Students by Comparing Online Tutor Clickstreams

    New ways to identify students in need of assistance are imperative to the evolution of online tutoring platforms. Currently implemented models to identify struggling students use costly and tedious classroom observation paired with students' platform usage, and are often suitable for only a subset of students. With the recent influx of new students to online tutoring platforms due to COVID-19, a simple method to quickly identify struggling students could help facilitate effective remote learning. To this end, we created an anomaly detection algorithm that models the normal behavior of students during remote learning and recognizes when students deviate from this behavior. We demonstrated how anomalous behavior not only revealed which students needed additional assistance, but also helped predict student learning outcomes and reduced the confidence intervals in research experiments performed within the online tutoring platform.

    Investigating the Impact of Skill-Related Videos on Online Learning Data and Code

    Data and code for the paper "Investigating the Impact of Skill-Related Videos on Online Learning"

    A Bandit you can Trust Data and Code

    Data and code for the paper "A Bandit you can Trust"

    Cyther: a human-playable, self-tuning robotic zither

    Human-robot musical interaction typically consists of independent, physically separated agents. We developed Cyther (a human-playable, self-tuning robotic zither) to allow a human and a robot to interact cooperatively through the same physical medium to generate music. The resultant co-dependence creates new responsibilities, roles, and expressive possibilities for human musicians. We describe some of these possibilities in the context of both technical features and artistic implementations of the system.

    Comparing Different Approaches to Generating Mathematics Explanations Using Large Language Models Data and Code

    A dataset of tutoring chat logs, and the data and code for the paper "Comparing Different Approaches to Generating Mathematics Explanations Using Large Language Models"

    Effective Evaluation of Online Learning Interventions with Surrogate Measures Data and Code

    This project contains all the data and code used to get the results in the paper "Identifying Effective Proximal Surrogate Measures of Learning within Online Educational Experiments"