
    Outlier detection in high-stakes college entrance testing

    In this study, we discuss recent developments in person-fit analysis in the context of computerized adaptive testing (CAT). Methods from statistical process control are discussed that have been proposed to classify an item-score pattern as fitting or misfitting the underlying item response theory (IRT) model in a CAT. Most person-fit research in CAT is restricted to simulated data; in this study, empirical data from a high-stakes test are used. Alternative methods to generate norm distributions, from which classification bounds can be determined, are discussed. These bounds may be used to classify item-score patterns as fitting or misfitting. Using bounds determined from the sample, the empirical analysis indicated that different types of misfit can be distinguished. Possibilities for using this method as a diagnostic instrument are discussed.
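    As an illustration of the bound-based classification described above, the sketch below simulates a norm distribution for a simple person-fit statistic (a sum of squared standardized residuals, which is an assumption here, not necessarily the statistic used in the study) under a 2PL model and flags an observed pattern that exceeds the resulting bound. All item and person parameters are hypothetical.

```python
# A minimal sketch of deriving an empirical bound for a person-fit statistic by
# simulating model-conforming item-score patterns and flagging observed patterns
# that fall beyond that bound. Parameters and the statistic itself are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def p_correct(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def person_fit_stat(x, theta, a, b):
    """Sum of squared standardized residuals over the administered items."""
    p = p_correct(theta, a, b)
    return np.sum((x - p) ** 2 / (p * (1 - p)))

def simulate_bound(theta, a, b, n_rep=5000, alpha=0.05):
    """Monte Carlo norm distribution of the statistic under model-fitting behavior."""
    p = p_correct(theta, a, b)
    sims = rng.random((n_rep, len(a))) < p            # model-conforming patterns
    stats = np.array([person_fit_stat(x, theta, a, b) for x in sims])
    return np.quantile(stats, 1 - alpha)              # upper bound: larger = more misfit

# Hypothetical item parameters for the items a CAT administered to one examinee.
a = rng.uniform(0.8, 2.0, size=30)
b = rng.normal(0.0, 1.0, size=30)
theta_hat = 0.4                                       # examinee's trait estimate
observed = (rng.random(30) < p_correct(theta_hat, a, b)).astype(int)

bound = simulate_bound(theta_hat, a, b)
print("misfitting pattern:", person_fit_stat(observed, theta_hat, a, b) > bound)
```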

    Simple nonparametric checks for model data fit in CAT

    In this paper, the usefulness of several nonparametric checks is discussed in a computerized adaptive testing (CAT) context. Although there is no tradition of nonparametric scalability analysis in CAT, it can be argued that scalability checks are useful for investigating, for example, the quality of item pools. Although IRT models are strongly embedded in the development and construction of CATs, that development has been tied to parametric rather than nonparametric IRT modeling. This is not surprising, because one of the key features of a CAT is item selection on the basis of a latent-trait estimate from a calibrated item pool; parametric IRT models enable the separate estimation of item and person parameters and thus facilitate this process enormously. Recent developments in nonparametric IRT, however, suggest that techniques and statistics from this field may contribute to the development and improvement of the psychometric quality of a CAT. Investigating nonparametric IRT modeling may also provide insight into the assumptions underlying CAT and may help to unify IRT modeling.
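    One common nonparametric check of this kind is the Loevinger/Mokken scalability coefficient. The sketch below computes pairwise and overall H coefficients for a dichotomous item pool; the simulated data and the H >= .30 rule of thumb are illustrative assumptions, not results from the paper.

```python
# A minimal sketch of Mokken-style scalability coefficients H_ij and overall H
# as a nonparametric quality check on dichotomous item-pool data.
import numpy as np

def scalability_H(X):
    """X: n_persons x n_items matrix of 0/1 scores. Returns pairwise H and overall H."""
    k = X.shape[1]
    p = X.mean(axis=0)
    cov = np.cov(X, rowvar=False, bias=True)
    H_pair = np.full((k, k), np.nan)
    num, den = 0.0, 0.0
    for i in range(k):
        for j in range(i + 1, k):
            cov_max = min(p[i], p[j]) - p[i] * p[j]   # max covariance given the marginals
            H_pair[i, j] = H_pair[j, i] = cov[i, j] / cov_max
            num += cov[i, j]
            den += cov_max
    return H_pair, num / den

# Hypothetical data: responses driven by a single latent trait (should scale well).
rng = np.random.default_rng(1)
theta = rng.normal(size=1000)
b = np.linspace(-1.5, 1.5, 10)
X = (rng.random((1000, 10)) < 1 / (1 + np.exp(-(theta[:, None] - b)))).astype(int)

H_pair, H = scalability_H(X)
print(f"overall H = {H:.2f}")   # rule of thumb: H >= .30 suggests a usable scale
```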

    Investigating the quality of items in CAT using nonparametric IRT

    I discuss the applicability of nonparametric item response theory (IRT) models to investigating the quality of item pools in the context of CAT, and I contrast these models with parametric IRT models. I also show how nonparametric IRT models can easily be applied and how misleading results from parametric IRT models can be avoided. I recommend routinely using nonparametric IRT modeling to investigate the quality of item pools.

    Robustness of person-fit decisions in computerized adaptive testing

    Person-fit statistics test whether the likelihood of a respondent's complete vector of item scores on a test is low given the hypothesized item response theory (IRT) model. This binary information may be insufficient for diagnosing the cause of a misfitting item-score vector. This paper applies different types of person-fit analysis in a computerized adaptive testing context and investigates the robustness of several methods to multidimensional test data. Both global person-fit statistics, which make the binary decision about fit or misfit of a person's item-score vector, and local checks are applied. Results showed that the methods differ in their robustness in a multidimensional context and that some methods are more useful than others.
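    A widely used global person-fit statistic of the kind referred to above is the standardized log-likelihood l_z. The sketch below computes l_z for one examinee's item-score vector under a 2PL model; the item and person parameters are hypothetical, and the paper's own statistics may differ.

```python
# A minimal sketch of the standardized log-likelihood person-fit statistic l_z
# for dichotomous item scores under a 2PL model.
import numpy as np

def lz_statistic(x, theta, a, b):
    """Standardized log-likelihood of an item-score vector x given trait estimate theta."""
    p = 1 / (1 + np.exp(-a * (theta - b)))
    l0 = np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))          # observed log-likelihood
    e_l0 = np.sum(p * np.log(p) + (1 - p) * np.log(1 - p))        # its expectation
    v_l0 = np.sum(p * (1 - p) * np.log(p / (1 - p)) ** 2)         # its variance
    return (l0 - e_l0) / np.sqrt(v_l0)

rng = np.random.default_rng(2)
a = rng.uniform(0.8, 2.0, 25)      # hypothetical discriminations
b = rng.normal(0, 1, 25)           # hypothetical difficulties
theta_hat = -0.3
x = (rng.random(25) < 1 / (1 + np.exp(-a * (theta_hat - b)))).astype(int)

lz = lz_statistic(x, theta_hat, a, b)
print(f"lz = {lz:.2f}; flag misfit if lz < -1.64 (one-sided, nominal 5% level)")
```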

    Division D: Measurement and research methodology


    Nonparametric item response theory and related topics


    Detection of advance item knowledge using response times in computer adaptive testing

    We propose a new method for detecting item preknowledge in a CAT based on an estimate of the “effective response time” for each item. Effective response time is defined as the time an individual examinee requires to answer an item correctly. An unusually short response time relative to the expected effective response time may indicate item preknowledge. The new method was applied to empirical data. Results showed that the Type I error rate of the statistic can be controlled. Power analysis revealed that power is high when response times are reduced, even when the examinee has preknowledge of only a small set of items.
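    The sketch below illustrates the general idea of flagging suspiciously fast correct responses. It assumes a lognormal response-time model with calibrated item time-intensity parameters as a stand-in for the paper's effective-response-time estimate; all parameter values are hypothetical.

```python
# A minimal sketch of flagging fast correct responses under an assumed lognormal
# response-time model: log time ~ Normal(item time intensity - person speed, sigma^2).
import numpy as np

def flag_fast_responses(log_t, correct, beta, tau, sigma, z_crit=-2.33):
    """
    log_t  : observed log response times for the administered items
    correct: 0/1 correctness indicators
    beta   : item time-intensity parameters (from calibration)
    tau    : examinee speed estimate
    sigma  : item log-time standard deviations
    Flags items answered correctly much faster than expected.
    """
    z = (log_t - (beta - tau)) / sigma        # standardized log-time residual
    return (correct == 1) & (z < z_crit)      # fast AND correct -> possible preknowledge

rng = np.random.default_rng(3)
beta = rng.normal(4.0, 0.3, 20)               # hypothetical calibrated parameters
sigma = np.full(20, 0.4)
tau = 0.1
log_t = rng.normal(beta - tau, sigma)
log_t[[4, 11]] -= 1.5                         # two items answered implausibly fast
correct = np.ones(20, dtype=int)

print(np.where(flag_fast_responses(log_t, correct, beta, tau, sigma))[0])
```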

    A Bayesian approach to person-fit analysis in item response theory models


    Some new methods to detect person fit in CAT

    Person fit is concerned with detecting nonfitting item-score patterns. Most person-fit statistics have been proposed in the context of conventionally administered, paper-and-pencil (P&P) tests. In this study, we first review existing person-fit studies in a computerized adaptive testing (CAT) context and then investigate the usefulness of some new fit statistics that are based on the specific characteristics of a CAT. Both the use of statistical process control and the use of nonparametric tests are explored. The results of a simulation study to detect nonfitting response patterns in a CAT showed that the detection rate of these statistics is comparable to the detection rate of person-fit statistics in P&P tests.
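    One statistical-process-control approach of this kind is a CUSUM chart over the standardized residuals of the items in the order the CAT administered them. The sketch below shows such a chart; the reference value k and the simulated misfit late in the test are illustrative assumptions, not the statistics proposed in the paper.

```python
# A minimal sketch of a one-sided upper/lower CUSUM person-fit chart over the
# standardized residuals (x - p)/sqrt(p(1-p)) of the administered items.
import numpy as np

def cusum_person_fit(x, p, k=0.5):
    """
    A large C+ suggests unexpectedly many correct answers (e.g., preknowledge);
    a large |C-| suggests unexpectedly many errors (e.g., loss of motivation).
    """
    z = (x - p) / np.sqrt(p * (1 - p))
    c_plus, c_minus = 0.0, 0.0
    trace = []
    for zi in z:
        c_plus = max(0.0, c_plus + zi - k)
        c_minus = min(0.0, c_minus + zi + k)
        trace.append((c_plus, c_minus))
    return np.array(trace)

rng = np.random.default_rng(4)
p = rng.uniform(0.4, 0.8, 30)                 # model-implied success probabilities
x = (rng.random(30) < p).astype(int)
x[20:] = 0                                    # simulate misfit late in the test

trace = cusum_person_fit(x, p)
print("max C+ = %.2f, min C- = %.2f" % (trace[:, 0].max(), trace[:, 1].min()))
```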

    On the consistency of individual classification using short scales

    Short tests containing at most 15 items are used in clinical and health psychology, medicine, and psychiatry for making decisions about patients. Because short tests have large measurement error, the authors ask whether they are reliable enough for classifying patients into a treatment and a nontreatment group. For a given certainty level, proportions of correct classifications were computed for varying test length, cut-scores, item scoring, and choices of item parameters. Short tests were found to classify at most 50% of a group consistently. Results were much better for tests containing 20 or 40 items. Small differences were found between dichotomous and polytomous (5 ordered scores) items. It is recommended that short tests for high-stakes decision making be used in combination with other information so as to increase reliability and classification consistency.
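    The sketch below illustrates how such classification consistency can be examined by simulation: two parallel administrations are generated per person under a Rasch model, and the proportion of agreeing treatment/no-treatment decisions is computed for several test lengths. The item parameters and cut-scores are hypothetical, and raw agreement is a simpler criterion than the certainty-level analysis used in the paper.

```python
# A minimal sketch of estimating classification consistency for a short
# dichotomous test by simulating two parallel administrations per person.
import numpy as np

def classification_consistency(n_items, cut_score, n_persons=20000, seed=5):
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=n_persons)
    b = np.linspace(-1.0, 1.0, n_items)               # assumed Rasch item difficulties
    p = 1 / (1 + np.exp(-(theta[:, None] - b)))
    score1 = (rng.random((n_persons, n_items)) < p).sum(axis=1)
    score2 = (rng.random((n_persons, n_items)) < p).sum(axis=1)
    same = (score1 >= cut_score) == (score2 >= cut_score)
    return same.mean()                                # proportion of agreeing decisions

for n_items in (10, 20, 40):
    print(n_items, round(classification_consistency(n_items, cut_score=n_items // 2), 3))
```

    Consistency rises with test length, in line with the recommendation above to avoid basing high-stakes decisions on short tests alone.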