13 research outputs found

MathBERT: A Pre-trained Language Model for General NLP Tasks in Mathematics Education

Authors: Graff Ben; Heffernan Neil; Lee Dongwon; Prihar Ethan; Shen Jia Tracy; Wu Xintao; Yamashita Michiharu
Publication date: 17/12/2021

Since the introduction of the original BERT (i.e., BASE BERT), researchers have developed various customized BERT models with improved performance for specific domains and tasks by exploiting the benefits of transfer learning. Due to the nature of mathematical texts, which often use domain-specific vocabulary along with equations and math symbols, we posit that the development of a new BERT model for mathematics would be useful for many mathematical downstream tasks. In this resource paper, we introduce our multi-institutional effort (i.e., two learning platforms and three academic institutions in the US) toward this need: MathBERT, a model created by pre-training the BASE BERT model on a large mathematical corpus ranging from pre-kindergarten (pre-k), to high-school, to college graduate level mathematical content. In addition, we select three general NLP tasks that are often used in mathematics education: prediction of knowledge component, auto-grading open-ended Q&A, and knowledge tracing, to demonstrate the superiority of MathBERT over BASE BERT. Our experiments show that MathBERT outperforms prior best methods by 1.2-22% and BASE BERT by 2-8% on these tasks. In addition, we build a mathematics-specific vocabulary 'mathVocab' to train with MathBERT. We discover that MathBERT pre-trained with 'mathVocab' outperforms MathBERT trained with the BASE BERT vocabulary (i.e., 'origVocab'). MathBERT is currently being adopted at the participating learning platforms: Stride, Inc., a commercial educational resource provider, and ASSISTments.org, a free online educational platform. We release MathBERT for public usage at: https://github.com/tbs17/MathBERT. Comment: Accepted by NeurIPS 2021 MATHAI4ED Workshop (Best Paper)

arXiv.org e-Print Archive

Integrating Personalized Learning into Online Education through Content Aggregation, Data Mining, and Reinforcement Learning

Author: Prihar Ethan
Publication venue: Worcester Polytechnic Institute - Gordon Library
Publication date: 30/03/2023

Personalized learning stems from the idea that students benefit from instructional material tailored to their needs. While the concept of giving each student the content that helps them learn the most is straightforward, implementing this at scale requires overcoming a gauntlet of challenges. One must aggregate a breadth of content such that enough variety exists to support each student's specific preferences, calculate quantifiable aspects of students' behaviors and traits that correlate with which content is most effective for them, design metrics that accurately measure learning, and create algorithms that can learn the relationships between students' features and the effects of different content on their learning across thousands of students in real time. This dissertation discusses different approaches for collecting, interpreting, and recommending instructional content to students with a focus on learning interpretable insight that can inform educational pedagogy outside of online learning as well as within it. Ultimately, we designed a content recommendation algorithm that performed equivalently to or better than similar existing algorithms while also allowing for unbiased statistical analysis of the data.

Digital WPI

Dynamic Tuning MQP

Author: Prihar Ethan B
Publication venue: Worcester Polytechnic Institute - Gordon Library
Publication date: 23/01/2017

Cyther V3 looks to improve Cyther V2, a mechatronic string instrument equipped with ten strings and a set of solenoids to actuate the strings. The goal of this project was to create a next-generation Cyther equipped with a system that can autonomously tune each string during a performance, expanding on the types of musical expressions Cyther V2 was capable of. The tuning system senses string tension, estimates pitch, adjusts the tension, and corrects for errors in estimation using optical pickups. The frequency analysis accuracy and speed, and the tuning accuracy and speed, of the new autonomous tuning system were analyzed for a single string to determine the quality of the new autonomous tuning system...

Digital WPI

Identifying Struggling Students by Comparing Online Tutor Clickstreams

Author: Prihar Ethan B.
Publication venue: Worcester Polytechnic Institute - Gordon Library
Publication date: 02/03/2021

New ways to identify students in need of assistance are imperative to the evolution of online tutoring platforms. Currently implemented models to identify struggling students use costly and tedious classroom observation paired with students' platform usage, and are often suitable for only a subset of students. With the recent influx of new students to online tutoring platforms due to COVID-19, a simple method to quickly identify struggling students could help facilitate effective remote learning. To this end, we created an anomaly detection algorithm that models the normal behavior of students during remote learning and recognizes when students deviate from this behavior. We demonstrated how anomalous behavior not only revealed which students needed additional assistance, but also helped predict student learning outcomes and reduced the confidence intervals in research experiments performed within the online tutoring platform.

Digital WPI

Investigating the Impact of Skill-Related Videos on Online Learning Data and Code

Authors: Ethan Prihar; Neil Heffernan
Publication venue: OSF
Publication date: 31/01/2023

Data and code for the paper "Investigating the Impact of Skill-Related Videos on Online Learning"

OSF Preprints

A Bandit you can Trust Data and Code

Authors: Ethan Prihar; Neil Heffernan
Publication venue: OSF
Publication date: 21/04/2023

Data and code for the paper "A Bandit you can Trust"

OSF Preprints

Deep Learning or Deep Ignorance? Comparing Untrained Recurrent Models in Educational Contexts Code

Authors: Anthony F Botelho; Ethan Prihar
Publication venue: Center for Open Science
Publication date: 23/05/2022

OSF Preprints

Cyther: a human-playable, self-tuning robotic zither

Authors: Barton Scott; Carvalho Paulo; Prihar Ethan
Publication date: 01/05/2017

Human-robot musical interaction typically consists of independent, physically separated agents. We developed Cyther – a human-playable, self-tuning robotic zither – to allow a human and a robot to interact cooperatively through the same physical medium to generate music. The resultant co-dependence creates new responsibilities, roles, and expressive possibilities for human musicians. We describe some of these possibilities in the context of both technical features and artistic implementations of the system.

Digital WPI

Comparing Different Approaches to Generating Mathematics Explanations Using Large Language Models Data and Code

Authors: Ethan Prihar; Mia Hopman; Morgan Lee; Neil Heffernan
Publication venue: OSF
Publication date: 21/02/2023

A dataset of tutoring chat logs, and the data and code for the paper "Comparing Different Approaches to Generating Mathematics Explanations Using Large Language Models"

OSF Preprints

Effective Evaluation of Online Learning Interventions with Surrogate Measures Data and Code

Authors: Adam Sales; Ethan Prihar; Kirk Vanacore; Neil Heffernan
Publication venue: OSF
Publication date: 23/04/2023

This project contains all the data and code used to get the results in the paper "Identifying Effective Proximal Surrogate Measures of Learning within Online Educational Experiments"

OSF Preprints