A Study of the Robustness of the Three-Parameter Item Response Model

Abstract

Simulation techniques were employed to investigate the use of the three-parameter item response model on psychological test data that violated the model's assumptions of large sample sizes, long tests, and test unidimensionality. The accuracy of the person ability and item characteristic curve parameter estimates derived by the three-parameter item response model was evaluated. Data sets and distributions of person ability and item characteristic curve parameters were generated using a computer-based algorithm, AVRAM (Ree, 1980), which employs the three-parameter logistic probability equation described by Birnbaum (1968). A computer software package, LOGIST5 (Wingersky, Burton & Lord, 1982), which utilizes the same three-parameter logistic probability equation, was used to derive the parameter estimates for the person response and the item characteristic curves. The present study based its analyses on the unedited person-item data matrix. As such, the findings are somewhat inconsistent with those reported by studies employing an edited data matrix (e.g., Ree, 1979). However, these findings are much more consistent with the types of test situations likely to occur in industrial-organizational research, where the focus of research will be the evaluation of differences in individual and group test scores as opposed to the design and construction of tests. The results showed that the item discrimination, ai, and lower asymptote, ci, parameters of the item characteristic curve were both accurately recovered when small sample sizes and short tests were used and conditions of item bias existed. The person ability parameter, Bv, was also accurately recovered. The recovery of bi, the item difficulty parameter, was most affected. The average absolute differences and root-mean-square errors obtained on bi were extremely large relative to those obtained on ai and ci, as well as to those reported for bi elsewhere in the literature (Ree, 1979).
Not only were the individual parameter estimates for bi inaccurately recovered, but changes in the means of the distributions of bi were also observed. When the samples available for item analysis comprise fewer than 2,000 examinees, and conditions of item bias exist, the practitioner of industrial-organizational psychology should consider the following: (1) use the three-parameter model, but proceed with caution; or (2) adopt an alternative item response model. BICAL (Wright & Mead, 1976), a one-parameter model which employs maximum likelihood procedures, is suggested for sample sizes of 1,000 examinees, and PROX (Cohen, 1976), a one-parameter model that uses algebraic procedures, is suggested when samples comprise 500 or fewer examinees.
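
For readers unfamiliar with the model, the following is a minimal Python sketch of the three-parameter logistic probability equation and of the two recovery measures named in the abstract (average absolute difference and root-mean-square error). It is an illustration only and does not reproduce AVRAM or LOGIST5. The function names are hypothetical, and the scaling constant D = 1.7 is an assumption based on common convention, not a value stated in the abstract.

```python
import math
import random

# Scaling constant often used with the logistic model; assumed here,
# not taken from the abstract.
D = 1.7


def p_correct(theta, a, b, c):
    """Three-parameter logistic probability of a correct response.

    theta: person ability; a: item discrimination;
    b: item difficulty; c: lower asymptote.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def simulate_responses(thetas, items, seed=0):
    """Generate a 0/1 person-by-item matrix from known parameters.

    items is a list of (a, b, c) tuples. This is a generic simulation,
    not the AVRAM algorithm.
    """
    rng = random.Random(seed)
    return [
        [1 if rng.random() < p_correct(t, a, b, c) else 0 for (a, b, c) in items]
        for t in thetas
    ]


def average_absolute_difference(true_vals, est_vals):
    """Mean of |true - estimate| over paired parameter values."""
    return sum(abs(t - e) for t, e in zip(true_vals, est_vals)) / len(true_vals)


def root_mean_square_error(true_vals, est_vals):
    """Square root of the mean squared difference over paired values."""
    return math.sqrt(
        sum((t - e) ** 2 for t, e in zip(true_vals, est_vals)) / len(true_vals)
    )
```

In a recovery study of this kind, the generating values of ai, bi, and ci would be passed as `true_vals` and the estimates returned by the calibration program as `est_vals`, with one pair of summary statistics computed per parameter.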
