Extending Item Response Theory to Online Homework
Item Response Theory (IRT) is becoming an increasingly important tool for analyzing
"Big Data" gathered from online educational venues. However, the method
was originally developed for traditional exam settings, and several of its
assumptions are violated when it is deployed in the online realm. For a
large-enrollment physics course for scientists and engineers, the study compares
outcomes from IRT analyses of exam and homework data, and then proceeds to
investigate the effects of each confounding factor introduced in the online
realm. It is found that IRT yields the correct trends for learner ability and
meaningful item parameters, yet overall agreement with exam data is moderate.
It is also found that learner ability and item discrimination are robust over
wide ranges with respect to model assumptions and introduced noise, less so
than item difficulty.
Can an AI-tool grade assignments in an introductory physics course?
Problem solving is an integral part of any physics curriculum, and most
physics instructors would likely agree that the associated learner competencies
are best assessed by considering the solution path: not only the final solution
matters, but also how the learner arrived there. Unfortunately, providing
meaningful feedback on written derivations is much more labor- and
resource-intensive than only grading the outcome: currently, the latter can be done by
computer, while the former involves handwritten solutions that need to be
graded by humans. This exploratory study proposes an AI-assisted workflow for
grading written physics-problem solutions, and it evaluates the viability of
the actual grading step using GPT-4. It is found that the AI-tool is capable of
providing feedback that can be helpful in formative assessment scenarios, but
that for summative scenarios, particularly those that are high-stakes, it
should only be used for an initial round of grading that sorts and flags
solution approaches.
Using artificial-intelligence tools to make LaTeX content accessible to blind readers
Screen-reader software enables blind users to access large segments of
electronic content, particularly if accessibility standards are followed.
Unfortunately, this is not true for much of the content written in physics,
mathematics, and other STEM-disciplines, due to the strong reliance on
mathematical symbols and expressions, which screen-reader software generally
fails to process correctly. A large portion of such content is based on source
documents written in LaTeX, which are rendered to PDF or HTML for online
distribution. Unfortunately, the resulting PDF documents are essentially
inaccessible, and the HTML documents vary greatly in accessibility, since their
rendering using standard tools is cumbersome at best. The paper explores the
possibility of generating standards-compliant, accessible HTML from LaTeX
sources using Large Language Models. It is found that the resulting documents
are highly accessible, with possible complications occurring when the
artificial intelligence tool starts to interpret the content.
Performance of the Pre-Trained Large Language Model GPT-4 on Automated Short Answer Grading
Automated Short Answer Grading (ASAG) has been an active area of
machine-learning research for over a decade. It promises to let educators grade
and give feedback on free-form responses in large-enrollment courses in spite
of limited availability of human graders. Over the years, carefully trained
models have achieved increasingly higher levels of performance. More recently,
pre-trained Large Language Models (LLMs) have emerged as a commodity, and an
intriguing question is how a general-purpose tool without additional training
compares to specialized models. We studied the performance of GPT-4 on the
standard benchmark 2-way and 3-way datasets SciEntsBank and Beetle, where in
addition to the standard task of grading the alignment of the student answer
with a reference answer, we also investigated withholding the reference answer.
We found that overall, the performance of the pre-trained general-purpose GPT-4
LLM is comparable to hand-engineered models, but worse than pre-trained LLMs
that had specialized training.
Moving towards a language arts program extended across the curriculum
A literature-based program integrates oral and written language activity with other areas of the curriculum. Therefore, the curriculum is centered around real ideas, relevant issues, and problem solving. As a result, children have opportunities to experiment with language that can lead to higher levels of competency (Goodman, 1986; Smith, 1994).
As teachers plan units in social studies and the sciences, they make note of relevant language activities. These activities can be teacher-directed or presented in learning centers (Harms & Lettow, 1992).
Causality Violations in Cascade Models of Nuclear Collisions
Transport models have successfully described many aspects of intermediate
energy heavy-ion collision dynamics. As the energies increase in these models
to the ultrarelativistic regime, Lorentz covariance and causality are not
strictly respected. The standard argument is that such effects are not
important to final results, but they have not been seriously considered at high
energies. We point out how and why these violations happen and how serious a problem they
may be, and suggest ways of reducing or eliminating the undesirable effects.
Comment: RevTeX, 23 pages, 9 (uuencoded) figures; to appear in Phys. Rev