Optimal item pool design for computerized adaptive tests with polytomous items using GPCM
Abstract: Computerized adaptive testing (CAT) is a testing procedure with advantages in improving measurement precision and increasing test efficiency. An item pool with optimal characteristics is the foundation for a CAT program to achieve those desirable psychometric features. This study proposed a method to design an optimal item pool for tests with polytomous items using the generalized partial credit model (GPCM). It extended an existing method for approximating optimality so that polytomous items could be described succinctly for the purpose of pool design. Optimal item pools were generated using CAT simulations with and without the practical constraints of content balancing and item exposure control. The performance of the item pools was evaluated against an operational item pool. The results indicated that item pools designed with stratification based on discrimination parameters performed well, making efficient use of the less discriminative items within the target accuracy levels. The implications for developing item pools are also discussed.
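The GPCM mentioned above models the probability of each score category of a polytomous item. As a minimal sketch (the function name and example parameters are illustrative, not from the study), the standard GPCM category probabilities can be computed as:

```python
import math

def gpcm_probs(theta, a, b):
    """Category probabilities under the generalized partial credit model:
    P(X = x | theta) is proportional to exp(sum_{k<=x} a*(theta - b_k)),
    with the empty sum for category 0 defined as 0.
    `a` is the discrimination parameter; `b` holds the step parameters."""
    z = [0.0]                       # cumulative logit for category 0
    for bk in b:
        z.append(z[-1] + a * (theta - bk))
    denom = sum(math.exp(v) for v in z)
    return [math.exp(v) / denom for v in z]

# A 4-category item (three step parameters) at theta = 0.5:
probs = gpcm_probs(theta=0.5, a=1.2, b=[-1.0, 0.0, 1.0])
print([round(p, 3) for p in probs])  # the four probabilities sum to 1
```

Simulating an optimal pool amounts to drawing item parameters so that, across the ability range, items like this one deliver maximal information at the points where the CAT algorithm needs them.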
A unified factor-analytic approach to the detection of item and test bias: Illustration with the effect of providing calculators to students with dyscalculia
The difficulty of test items that measure more than one ability
Many test items require more than one ability to obtain a correct response. This article proposes a multidimensional index of item difficulty that can be used with items of this type. The proposed index describes multidimensional item difficulty as the direction in the multidimensional space in which the item provides the most information and the distance in that direction to the most informative point. The multidimensional difficulty is derived for a particular item response theory model, and an example of its application is given using the ACT Mathematics Usage Test.
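For the multidimensional two-parameter logistic model, P = logistic(a·θ + d), the difficulty index described in this abstract reduces to a closed form: the item is most informative in the direction of its slope vector, and the signed distance to the most informative point is -d divided by the norm of the slopes. A minimal sketch (function name and example parameters are illustrative):

```python
import math

def multidimensional_difficulty(a, d):
    """Direction and distance of maximum information for an M2PL item
    with slope vector `a` and intercept `d` (P = logistic(a.theta + d)).
    Returns (signed distance from the origin, direction cosines)."""
    norm = math.sqrt(sum(ak * ak for ak in a))   # multidimensional discrimination
    distance = -d / norm                          # distance to the most informative point
    cosines = [ak / norm for ak in a]             # unit vector of steepest ascent
    return distance, cosines

# A two-dimensional item:
dist, cosines = multidimensional_difficulty(a=[1.2, 0.5], d=-0.8)
```

The direction cosines square-sum to 1, so the index locates the item as a single point (distance plus direction) in the ability space, analogous to the unidimensional b parameter.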
Test construction in the 1990s: Recent approaches every psychologist should know
The article summarizes the current state of the art in test construction and contrasts it with previous conceptual models, some of which are wrong or misleading. In particular, new methodologies for item selection and review are presented, as well as current thinking on the specification of technical characteristics of tests. The construction and interpretation of psychological tests has been a standard part of the curriculum for psychology majors and students in graduate programs in psychology since the days of James McKeen Cattell. However, information on the construction of psychological tests that appears in psychological and educational training curricula, and techniques that are put in practice as part of psychological research, are often not state of the art, mainly because it takes some time for information to diffuse from the research literature and from the internal documents of testing organizations to the sources used for instruction. This diffusion process is probably no worse in psychology than it is in other areas; consider how long it has taken advances made by the space program to influence everyday life. Automobile technology is quite different now from the way it was 20 years ago as a result of advances in computer technology and materials from the space program, but the changes took quite a long time to be put in place. The purpose of this article is to identify some important changes that have taken place in the process of test development over the past 20 years. The changes are highlighted by contrasting descriptions of the process from sources spanning the period since the 1930s, starting with an early book on intelligence testing. To provide some structure to the article, the test development process has been divided into four parts: descriptions of (a) content, (b) statistical specifications, (c) item selection, and (d) test review processes. Although these topics are not truly mutually exclusive, they do provide a relatively concise structure for discussing test construction practice.
Journal of Educational and Behavioral Statistics
Converting Boundaries Between National Assessment Governing Board Performance Categories to Points on the National Assessment of Educational Progress Score Scale: The 1996 Science NAEP Process
The discriminating power of items that measure more than one dimension
Determining a correct response to many test items frequently requires more than one ability. This paper describes the characteristics of items of this type by proposing generalizations of the item response theory concepts of discrimination and information. The conceptual framework for these statistics is presented, and the formulas for the statistics are derived for the multidimensional extension of the two-parameter logistic model. Use of the statistics is demonstrated for a form of the ACT Mathematics Usage Test. Index terms: item discrimination, item information, item response theory, multidimensional item response theory.
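The generalized statistics this abstract names have standard forms for the multidimensional 2PL: the multidimensional discrimination is the Euclidean norm of the slope vector, and the information an item provides in a given direction is the squared projection of the slopes onto that direction, scaled by P(1 - P). A minimal sketch (function names and example values are illustrative):

```python
import math

def mdisc(a):
    """Multidimensional discrimination: Euclidean norm of the slope vector."""
    return math.sqrt(sum(ak * ak for ak in a))

def directional_info(theta, a, d, direction):
    """Information of an M2PL item (P = logistic(a.theta + d)) at ability
    point `theta`, in the direction of unit vector `direction`:
    I = (a . direction)^2 * P * (1 - P)."""
    z = sum(ak * tk for ak, tk in zip(a, theta)) + d
    p = 1.0 / (1.0 + math.exp(-z))
    proj = sum(ak * uk for ak, uk in zip(a, direction))
    return proj * proj * p * (1.0 - p)

# Information at the origin, measured along the 45-degree direction:
u = [1 / math.sqrt(2), 1 / math.sqrt(2)]
info = directional_info(theta=[0.0, 0.0], a=[1.0, 1.0], d=0.0, direction=u)
```

Information is maximized when `direction` is parallel to the slope vector, which is what ties these statistics to the multidimensional difficulty index described earlier in this listing.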