Anchor selection strategies for DIF analysis: Review, assessment, and new approaches
Differential item functioning (DIF) indicates a violation of the invariance assumption, for instance in models based on item response theory (IRT). For item-wise DIF analysis using IRT, a common metric for the item parameters of the groups that are to be compared (e.g., the reference and the focal group) is necessary. In the Rasch model, the same linear restriction is therefore imposed in both groups; items in the restriction are termed the anchor items. Ideally, these items are DIF-free to avoid artificially inflated false alarm rates. However, the question of how to select DIF-free anchor items appropriately is still a major challenge. Furthermore, various authors point out the lack of new anchor selection strategies and of a comprehensive comparison study, especially for dichotomous IRT models. This article reviews existing anchor selection strategies that require no knowledge prior to the DIF analysis, offers a straightforward notation, and proposes three new anchor selection strategies. An extensive simulation study compares the performance of the anchor selection strategies. The results show that appropriate anchor selection is crucial for suitable item-wise DIF analysis. The newly suggested strategies outperform the existing ones and can reliably locate a suitable anchor when the sample sizes are large enough.
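As a minimal illustration of the anchoring idea described above (not one of the paper's own strategies), the sketch below places focal-group Rasch difficulty estimates on the reference metric via the mean of an assumed DIF-free anchor set and flags items with large remaining differences. The item counts, the metric shift, the noise level, and the 0.3 flagging threshold are all invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

n_items = 10
b_ref = rng.normal(0.0, 1.0, n_items)          # reference-group item difficulties
dif = np.zeros(n_items)
dif[[7, 8, 9]] = 0.6                           # three items with true DIF
shift = 0.5                                    # arbitrary metric offset of the focal group
b_foc = b_ref + dif + shift + rng.normal(0, 0.05, n_items)  # noisy focal estimates

anchor = [0, 1, 2, 3]                          # assumed DIF-free anchor set
# Align the focal metric so the anchor items have equal mean difficulty in both groups.
offset = (b_foc[anchor] - b_ref[anchor]).mean()
dif_hat = (b_foc - offset) - b_ref             # item-wise DIF estimates

flagged = np.where(np.abs(dif_hat) > 0.3)[0]
print(flagged)                                 # ideally items 7, 8, 9
```

With a contaminated anchor (DIF items in the anchor set), the offset estimate is biased and DIF-free items get flagged, which is exactly the inflated false alarm rate the abstract warns about.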
Score Allotment Optimization Method with Application to Comparison of Ability Evaluation in Testing between Classical Test Theory and Item Response Theory
Many researchers know the superiority of item response theory (IRT) over classical test theory (CTT) from a detailed test-evaluation viewpoint. However, teachers are still reluctant to use IRT as a daily testing tool. The primary objective of this paper is to find the difference between the CTT and the IRT; in particular, we focus on the difference in ability evaluation. We compared CTT- and IRT-evaluated abilities using hypothetically assumed abilities that mimic a real case. Through a simulation study, we found that the IRT is superior to the CTT to some extent. The CTT uses pre-assigned allotments, whereas the IRT has no allotment concept. However, if we regard the ability evaluation by the IRT as the standard, we can find the most appropriate allotments in the CTT so that the total scores of the CTT are adjusted as closely as possible to the abilities obtained by the IRT. This is a kind of allotment optimization problem, and we present the methodology in this paper. By applying our methodology to simulation cases that mimic the real data case, we found an intriguing feature with respect to the pre-assigned allotments. If teachers want to raise the examination pass rate, we conjecture that they assign higher scores than the actual scores achieved by students; we call this jacking-up. Using the allotment optimization, we found that jacking-up leads to higher allotments for easier problems in the CTT.
Automatic control study of the icing research tunnel refrigeration system
The Icing Research Tunnel (IRT) at the NASA Lewis Research Center is a subsonic, closed-return atmospheric tunnel. The tunnel includes a heat exchanger and a refrigeration plant to achieve the desired air temperature and a spray system to generate the type of icing conditions that would be encountered by aircraft. At the present time, the tunnel air temperature is controlled by manual adjustment of freon refrigerant flow control valves. An upgrade of this facility calls for these control valves to be adjusted by an automatic controller. The digital computer simulation of the IRT refrigeration plant and the automatic controller that was used in the simulation are discussed
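The control concept can be sketched as a discrete PI loop driving a toy first-order model of tunnel air temperature via a refrigerant valve. All dynamics, gains, and limits below are illustrative assumptions, not values from the IRT plant simulation:

```python
# Toy first-order plant: tunnel air temperature responds to refrigerant valve
# position (0-100 %). All constants here are invented for illustration.
dt, tau, gain = 1.0, 60.0, -0.4       # step (s), plant time constant (s), degC per % valve
kp, ki = -20.0, -0.3                  # PI gains (negative: opening the valve cools)

temp, setpoint, integral = 10.0, -20.0, 0.0
for _ in range(600):                  # ten simulated minutes
    err = setpoint - temp
    u = kp * err + ki * integral
    valve = min(max(u, 0.0), 100.0)   # actuator saturation
    if u == valve:                    # simple anti-windup: freeze integrator when saturated
        integral += err * dt
    temp += dt / tau * (gain * valve - temp)  # Euler step of the first-order plant
print(round(temp, 1))                 # settles near the -20 degC setpoint
```

The integral term removes the steady-state error a purely proportional valve adjustment would leave, which is the main advantage an automatic controller offers over manual valve trimming.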
lordif: An R Package for Detecting Differential Item Functioning Using Iterative Hybrid Ordinal Logistic Regression/Item Response Theory and Monte Carlo Simulations
Logistic regression provides a flexible framework for detecting various types of differential item functioning (DIF). Previous efforts extended the framework by using item response theory (IRT) based trait scores, and by employing an iterative process using group-specific item parameters to account for DIF in the trait scores, analogous to purification approaches used in other DIF detection frameworks. The current investigation advances the technique by developing a computational platform integrating both statistical and IRT procedures into a single program. Furthermore, a Monte Carlo simulation approach was incorporated to derive empirical criteria for various DIF statistics and effect size measures. For purposes of illustration, the procedure was applied to data from a questionnaire of anxiety symptoms for detecting DIF associated with age from the Patient-Reported Outcomes Measurement Information System.
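lordif itself is an R package built on ordinal logistic regression; as a sketch of the underlying likelihood-ratio idea for a single binary item, the pure-NumPy example below compares a trait-only logistic model with one that adds a group term (uniform DIF). The simulated data, the effect size, and the Newton-Raphson fitter are assumptions for illustration, not the package's implementation:

```python
import numpy as np

def logit_fit(X, y, iters=25):
    """Newton-Raphson ML fit of a logistic regression; returns (beta, log-likelihood)."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1 / (1 + np.exp(-X @ beta))
        W = p * (1 - p)
        H = X.T @ (X * W[:, None]) + 1e-8 * np.eye(X.shape[1])  # Hessian (ridged)
        beta += np.linalg.solve(H, X.T @ (y - p))
    p = 1 / (1 + np.exp(-X @ beta))
    ll = np.sum(y * np.log(p) + (1 - y) * np.log1p(-p))
    return beta, ll

rng = np.random.default_rng(2)
n = 2000
group = rng.integers(0, 2, n)                   # 0 = reference, 1 = focal
theta = rng.normal(0, 1, n)                     # trait score (IRT-based in lordif)
# Item with uniform DIF: harder for the focal group at equal trait level.
p_true = 1 / (1 + np.exp(-(theta - 0.8 * group)))
y = (rng.random(n) < p_true).astype(float)

ones = np.ones(n)
X0 = np.column_stack([ones, theta])             # trait only
X1 = np.column_stack([ones, theta, group])      # + group term (uniform DIF)
_, ll0 = logit_fit(X0, y)
_, ll1 = logit_fit(X1, y)
lr_stat = 2 * (ll1 - ll0)                       # ~ chi-square(1) under no DIF
print(round(lr_stat, 1))                        # large value => DIF flagged
```

A further model with a group-by-trait interaction would test non-uniform DIF in the same way; lordif's Monte Carlo step replaces the asymptotic chi-square criterion with simulated null distributions.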
Detecting Differential Item and Step Functioning with Rating Scale and Partial Credit Trees
Several statistical procedures have been suggested for detecting differential item functioning (DIF) and differential step functioning (DSF) in polytomous items. However, standard procedures are designed for the comparison of pre-specified reference and focal groups, such as males and females. Here, we propose a framework for the detection of DIF and DSF in polytomous items under the rating scale and partial credit model that employs a model-based recursive partitioning algorithm. In contrast to existing procedures, this approach requires no pre-specification of reference and focal groups, because the groups are detected in a data-driven way. The resulting groups are characterized by (combinations of) covariates and are thus directly interpretable. The statistical background and construction of the new procedures are introduced along with an instructive example. Four simulation studies illustrate their statistical properties and compare them to the well-established likelihood ratio test (LRT). While both the LRT and the new procedures respect a given significance level, the new procedures are in most cases equally powerful (simple DIF groups) or more powerful (complex DIF groups) and can also detect DSF. The sensitivity to model misspecification is also investigated. An application example with empirical data illustrates the practical use. A software implementation of the new procedures is freely available in the R system for statistical computing.
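A crude sketch of the data-driven group detection idea (far simpler than the parameter-instability tests and full tree algorithm the abstract describes): scan candidate split points of a numeric covariate and pick the one that maximizes a two-sample statistic for a difference in item behaviour. The covariate, the true threshold, and the statistic are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 3000
age = rng.uniform(20, 70, n)                    # numeric covariate, no pre-specified groups
theta = rng.normal(0, 1, n)
# DIF kicks in above an unknown age threshold (here 50): the item gets harder.
b_item = np.where(age > 50, 1.0, 0.2)
y = (rng.random(n) < 1 / (1 + np.exp(-(theta - b_item)))).astype(float)

def split_stat(cut):
    """Chi-square-style statistic comparing correct-rates below/above the cut,
    a crude stand-in for the parameter-instability tests of the tree algorithms."""
    lo, hi = y[age <= cut], y[age > cut]
    if min(len(lo), len(hi)) < 30:              # require a minimum group size
        return 0.0
    p, q = lo.mean(), hi.mean()
    se2 = p * (1 - p) / len(lo) + q * (1 - q) / len(hi)
    return (p - q) ** 2 / se2

cuts = np.linspace(25, 65, 81)
best = cuts[np.argmax([split_stat(c) for c in cuts])]
print(round(best, 1))                           # lands near the true change point of 50
```

The recovered split defines the reference and focal groups after the fact, characterized directly by the covariate, which is the interpretability advantage the abstract claims.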
Influences on classification accuracy of exam sets: an example from vocational education and training
Classification accuracy of single exams is well studied in the educational measurement literature. However, when making important decisions, such as certification decisions, one usually uses several exams: an exam set. This chapter elaborates on the classification accuracy of exam sets, which is influenced by the shape of the ability distribution, the height of the standards, and the possibility for compensation. This is studied using an example from vocational education and training (VET). The classification accuracy for an exam set is computed using item response theory (IRT) simulation. Classification accuracy is high when all exams in an exam set have equal and standardized ability distributions. Furthermore, exams where few or no students pass or fail increase classification accuracy. Finally, allowing compensation increases classification accuracy.
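A toy version of such a simulation (using a normal measurement-error term rather than a full IRT item-response simulation, and with invented standards) shows how allowing compensation can raise the classification accuracy of a two-exam set:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50_000
theta = rng.normal(0, 1, n)                     # candidate abilities
# Observed exam scores: ability plus measurement error. A real IRT simulation
# would generate item responses; the normal error keeps the sketch short.
s1 = theta + rng.normal(0, 0.5, n)
s2 = theta + rng.normal(0, 0.5, n)
cut = 0.0                                       # standard on each exam / on the average

truly_competent = theta > cut
conjunctive = (s1 > cut) & (s2 > cut)           # must pass every exam separately
compensatory = (s1 + s2) / 2 > cut              # a strong exam can offset a weak one

acc_conj = np.mean(conjunctive == truly_competent)
acc_comp = np.mean(compensatory == truly_competent)
print(round(acc_conj, 3), round(acc_comp, 3))
```

Averaging halves the error variance of the decision variable, while the conjunctive rule fails every competent candidate who has one unlucky exam, so the compensatory accuracy comes out higher here.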
Fast vectorized algorithm for the Monte Carlo Simulation of the Random Field Ising Model
An algorithm for the simulation of the three-dimensional random-field Ising model with a binary distribution of the random fields is presented. It uses multi-spin coding and simulates 64 physically different systems simultaneously. On one processor of a Cray YMP it reaches a speed of 184 million spin updates per second. For smaller field strengths we present a version of the algorithm that can perform 242 million spin updates per second on the same machine. Comment: 13 pp., HLRZ 53/9
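The core trick, multi-spin coding, packs one spin from each of 64 independent replicas into the bits of a single 64-bit word, so one bitwise operation updates all replicas at once. The minimal sketch below (a 1-D chain rather than the paper's 3-D random-field lattice, and without the Metropolis acceptance step) demonstrates only the packing and the XOR flip:

```python
import random
import numpy as np

random.seed(5)
L = 16                                           # 1-D chain for brevity (the paper uses 3-D)
# Bit k of each 64-bit word holds the spin at that site in replica k (1 = up, 0 = down),
# so 64 physically different systems share one array of words.
spins = np.array([random.getrandbits(64) for _ in range(L)], dtype=np.uint64)

def bond_disagreements(s):
    """Per-replica count of antiparallel neighbour pairs, extracted bit by bit.
    A set bit in s[i] ^ s[i+1] marks an unsatisfied ferromagnetic bond in that replica."""
    x = s ^ np.roll(s, -1)                       # periodic boundary
    return np.array([sum((int(w) >> k) & 1 for w in x) for k in range(64)])

before = bond_disagreements(spins)
# Flip the spin at site 3, but only in replicas 0 and 5: a single XOR
# updates both systems at once -- the essence of multi-spin coding.
mask = np.uint64((1 << 0) | (1 << 5))
spins[3] ^= mask
after = bond_disagreements(spins)
changed = np.nonzero(before != after)[0]
print(changed)                                   # only replicas 0 and 5 can be affected
```

In the full algorithm, the acceptance mask is itself built with bitwise logic from the local fields and random bits, so the Metropolis step also runs on all 64 systems per word operation.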