99 research outputs found

    Extending Item Response Theory to Online Homework

    Item Response Theory is becoming an increasingly important tool for analyzing "Big Data" gathered from online educational venues. However, the mechanism was originally developed for traditional exam settings, and several of its assumptions are violated when it is deployed in the online realm. For a large-enrollment physics course for scientists and engineers, this study compares outcomes from IRT analyses of exam and homework data, and then investigates the effects of each confounding factor introduced in the online realm. It is found that IRT yields the correct trends for learner ability and meaningful item parameters, yet overall agreement with exam data is moderate. It is also found that learner ability and item discrimination are robust over wide ranges with respect to model assumptions and introduced noise, while item difficulty is less so.
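    As a rough illustration of the kind of model involved, the sketch below fits two-parameter logistic (2PL) item parameters to synthetic response data in Python. The crude total-score ability proxy and all variable names are assumptions for illustration, not the estimation procedure used in the study.

```python
# Minimal 2PL IRT sketch on synthetic data (illustrative assumptions throughout).
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit  # logistic function

rng = np.random.default_rng(0)
n_learners, n_items = 500, 20

# Synthetic "true" parameters: ability theta, discrimination a, difficulty b
theta = rng.normal(0, 1, n_learners)
a_true = rng.uniform(0.5, 2.0, n_items)
b_true = rng.normal(0, 1, n_items)

# 2PL model: P(correct) = sigmoid(a * (theta - b))
p = expit(a_true * (theta[:, None] - b_true))
responses = rng.binomial(1, p)  # binary response matrix, learners x items

def neg_log_likelihood(params, theta_hat, resp):
    """Negative log-likelihood of one item's responses under the 2PL model."""
    a, b = params
    prob = expit(a * (theta_hat - b))
    eps = 1e-9
    return -np.sum(resp * np.log(prob + eps) + (1 - resp) * np.log(1 - prob + eps))

# Crude ability proxy (standardized total score) instead of a full EM fit
scores = responses.mean(axis=1)
theta_hat = (scores - scores.mean()) / scores.std()

# Fit each item's discrimination and difficulty against the ability proxy
item_params = []
for j in range(n_items):
    res = minimize(neg_log_likelihood, x0=[1.0, 0.0],
                   args=(theta_hat, responses[:, j]), method="Nelder-Mead")
    item_params.append(res.x)

for j, (a, b) in enumerate(item_params[:5]):
    print(f"item {j}: a_hat={a:.2f} (true {a_true[j]:.2f}), "
          f"b_hat={b:.2f} (true {b_true[j]:.2f})")
```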

    Can an AI-tool grade assignments in an introductory physics course?

    Problem solving is an integral part of any physics curriculum, and most physics instructors would likely agree that the associated learner competencies are best assessed by considering the solution path: not only the final solution matters, but also how the learner arrived there. Unfortunately, providing meaningful feedback on written derivations is far more labor- and resource-intensive than grading only the outcome: currently, the latter can be done by computer, while the former involves handwritten solutions that need to be graded by humans. This exploratory study proposes an AI-assisted workflow for grading written physics-problem solutions and evaluates the viability of the actual grading step using GPT-4. It is found that the AI tool is capable of providing feedback that can be helpful in formative assessment scenarios, but that in summative scenarios, particularly high-stakes ones, it should only be used for an initial round of grading that sorts and flags solution approaches.
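    A minimal sketch of the grading step described above, assuming an OpenAI-style chat API: a transcribed student solution and a rubric are sent to GPT-4 with a request for a score and justification. The rubric, prompt wording, and placeholder solution are illustrative assumptions, not the study's exact workflow.

```python
# Illustrative grading-step sketch; prompt, rubric, and model choice are assumptions.
from openai import OpenAI  # assumes the openai Python package (v1.x) is installed

client = OpenAI()  # reads OPENAI_API_KEY from the environment

rubric = (
    "Award up to 4 points: 1 for a correct setup or free-body diagram, "
    "1 for applying Newton's second law, 1 for correct algebra, "
    "1 for the final answer with units."
)
student_solution = "F = ma, so a = F/m = 12 N / 3 kg = 4 m/s^2"  # placeholder transcript

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system",
         "content": "You are a physics grader. Grade the solution path, not just the final answer."},
        {"role": "user",
         "content": f"Rubric:\n{rubric}\n\nStudent solution:\n{student_solution}\n\n"
                    "Return a score out of 4 and a one-sentence justification per rubric item."},
    ],
    temperature=0,
)
print(response.choices[0].message.content)
```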

    Using artificial-intelligence tools to make LaTeX content accessible to blind readers

    Screen-reader software enables blind users to access large segments of electronic content, particularly if accessibility standards are followed. Unfortunately, this is not true for much of the content written in physics, mathematics, and other STEM disciplines, due to the strong reliance on mathematical symbols and expressions, which screen-reader software generally fails to process correctly. A large portion of such content is based on source documents written in LaTeX, which are rendered to PDF or HTML for online distribution. Unfortunately, the resulting PDF documents are essentially inaccessible, and the HTML documents vary greatly in accessibility, since rendering them with standard tools is cumbersome at best. The paper explores the possibility of generating standards-compliant, accessible HTML from LaTeX sources using Large Language Models. It is found that the resulting documents are highly accessible, with possible complications occurring when the artificial-intelligence tool starts to interpret the content.
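    A minimal sketch of such a conversion, assuming an OpenAI-style chat API: a LaTeX snippet is sent to the model with instructions to emit accessible HTML with MathML and to translate rather than interpret the content (the complication noted above). The prompt wording and file names are illustrative assumptions, not the paper's exact pipeline.

```python
# Illustrative LaTeX-to-accessible-HTML sketch; prompt and output file are assumptions.
from openai import OpenAI

client = OpenAI()

latex_source = r"""
The kinetic energy is given by
\begin{equation}
E_{\mathrm{kin}} = \frac{1}{2} m v^2 .
\end{equation}
"""

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system",
         "content": ("Convert LaTeX to standards-compliant, accessible HTML. "
                     "Render mathematics as MathML. Translate the markup faithfully; "
                     "do not interpret, simplify, or solve the content.")},
        {"role": "user", "content": latex_source},
    ],
    temperature=0,
)

# Write the generated HTML so it can be checked with a screen reader or validator
with open("accessible.html", "w", encoding="utf-8") as f:
    f.write(response.choices[0].message.content)
```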

    Performance of the Pre-Trained Large Language Model GPT-4 on Automated Short Answer Grading

    Automated Short Answer Grading (ASAG) has been an active area of machine-learning research for over a decade. It promises to let educators grade and give feedback on free-form responses in large-enrollment courses despite the limited availability of human graders. Over the years, carefully trained models have achieved increasingly higher levels of performance. More recently, pre-trained Large Language Models (LLMs) emerged as a commodity, and an intriguing question is how a general-purpose tool without additional training compares to specialized models. We studied the performance of GPT-4 on the standard 2-way and 3-way benchmark datasets SciEntsBank and Beetle, where in addition to the standard task of grading the alignment of the student answer with a reference answer, we also investigated withholding the reference answer. We found that, overall, the performance of the pre-trained general-purpose GPT-4 LLM is comparable to that of hand-engineered models, but worse than that of pre-trained LLMs that had specialized training.
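    The sketch below illustrates the two prompting conditions compared in the study: grading against a reference answer and grading with the reference withheld. The exact prompt wording and the example item are assumptions; the 3-way labels follow the SciEntsBank scheme (correct / contradictory / incorrect).

```python
# Sketch of prompt construction for the two ASAG conditions (illustrative wording).
def build_messages(question, student_answer, reference_answer=None):
    """Return chat messages for one ASAG item, with or without the reference answer."""
    labels = "correct, contradictory, or incorrect"
    task = f"Question: {question}\nStudent answer: {student_answer}\n"
    if reference_answer is not None:
        task += f"Reference answer: {reference_answer}\n"
        instruction = (f"Judge whether the student answer matches the reference answer. "
                       f"Label it {labels}.")
    else:
        instruction = f"Judge the student answer on its own merits. Label it {labels}."
    return [
        {"role": "system", "content": "You grade short science answers."},
        {"role": "user", "content": task + instruction},
    ]

# Example usage with a hypothetical item (not taken from the benchmark):
question = "Why does a metal spoon feel colder than a wooden one at room temperature?"
answer = "Because metal conducts heat away from the hand faster."
reference = ("Metal has a higher thermal conductivity, so it draws heat "
             "from the skin more quickly.")

msgs_with_ref = build_messages(question, answer, reference_answer=reference)
msgs_without_ref = build_messages(question, answer)
print(msgs_with_ref[1]["content"])
print(msgs_without_ref[1]["content"])
```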

    Moving towards a language arts program extended across the curriculum

    A literature-based program integrates oral and written language activities with other areas of the curriculum. Therefore, the curriculum is centered around real ideas, relevant issues, and problem solving. As a result, children have opportunities to experiment with language, which can lead to higher levels of competence (Goodman, 1986; Smith, 1994). As teachers plan units in social studies and the sciences, they make note of relevant language activities. These activities can be teacher-directed or presented in learning centers (Harms & Lettow, 1992).

    Causality Violations in Cascade Models of Nuclear Collisions

    Transport models have successfully described many aspects of intermediate-energy heavy-ion collision dynamics. As the energies in these models increase to the ultrarelativistic regime, Lorentz covariance and causality are not strictly respected. The standard argument is that such effects are not important to final results, but they have not been seriously considered at high energies. We point out how and why these violations happen and how serious a problem they may be, and we suggest ways of reducing or eliminating the undesirable effects.
    Comment: RevTeX, 23 pages, 9 (uuencoded) figures; to appear in Phys. Rev.
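    As a minimal illustration (not taken from the paper) of the criterion behind such violations: two collision events that are spacelike separated have a frame-dependent time ordering, so letting one trigger the other amounts to acausal action at a distance.

```latex
% Invariant interval between two collision events (t_1, \vec r_1) and (t_2, \vec r_2)
\[
  \Delta s^{2} \;=\; c^{2}\,(t_{2}-t_{1})^{2} \;-\; \lvert \vec r_{2}-\vec r_{1}\rvert^{2}
  \quad
  \begin{cases}
    > 0 & \text{timelike: ordering frame independent, causal propagation possible,}\\
    < 0 & \text{spacelike: ordering frame dependent, treating one event as the cause of the other is acausal.}
  \end{cases}
\]
```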