The maximum Fisher information procedure (F) is a commonly used item selection algorithm in computerized adaptive testing. This approach yields large gains in test efficiency but results in highly unbalanced item usage. The a-stratified multistage CAT (STR) was developed to remedy the item usage problem of F, and has been found to balance item usage effectively, though at the cost of lower test efficiency. To address this efficiency loss, a refined stratification procedure (USTR) has been proposed that allows more items to be selected from the high-a strata and fewer from the low-a strata. This study evaluated and compared the three item selection procedures, F, STR, and USTR, along with completely random item selection (RAN), with respect to test efficiency and item usage through CATs simulated under nine test conditions. The nine conditions resulted from crossing three levels of practical constraints (no constraints; exposure control only; exposure control plus content balancing) with three cases of the item selection space arising from combinations of various test lengths and target maximum exposure rates. The various item selection procedures were used to simulate CATs for a
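To make the contrast between F and STR concrete, the following is a minimal sketch of the two selection rules under the 2PL model. The function names, the item-pool layout (a, b parameter pairs), and the equal-size strata are illustrative assumptions, not details taken from the study; real implementations add exposure control and content balancing on top of these rules.

```python
import math

def prob_2pl(theta, a, b):
    """2PL probability of a correct response at ability theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def fisher_info(theta, a, b):
    """Fisher information of a 2PL item: a^2 * P * (1 - P)."""
    p = prob_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def select_max_info(theta, pool, administered):
    """F: pick the unused item with maximum Fisher information
    at the current ability estimate."""
    candidates = [i for i in range(len(pool)) if i not in administered]
    return max(candidates, key=lambda i: fisher_info(theta, *pool[i]))

def select_stratified(theta, pool, administered, stage, n_strata):
    """STR: sort items by a and restrict selection to the stratum
    for the current test stage (low-a strata early, high-a later),
    then b-match: pick the item whose b is closest to theta."""
    order = sorted(range(len(pool)), key=lambda i: pool[i][0])
    size = len(pool) // n_strata
    stratum = order[stage * size:(stage + 1) * size]
    candidates = [i for i in stratum if i not in administered]
    return min(candidates, key=lambda i: abs(pool[i][1] - theta))
```

Because F always favors high-a items (information grows with a squared), those items are overexposed; STR withholds them until late stages, which balances usage but sacrifices some information early in the test, motivating USTR's unequal stratum allocation.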