Evaluating the effects of several multi-stage testing design variables on selected psychometric outcomes for certification and licensure assessment
Computer-based testing is becoming popular with credentialing agencies because new test designs are possible and the evidence is clear that these new designs can increase the reliability and validity of candidate scores and pass/fail decisions. Research on MST to date suggests that the measurement quality of MST results is comparable to full-fledged computer-adaptive tests and improved over computerized fixed-form tests. MST's promise lies in this potential for improved measurement with greater control than other adaptive approaches for constructing test forms. Recommending use of the MST design and advising how best to set up the design, however, are two different things. The purpose of the current simulation study was to advance an established line of research on MST methodology by enhancing understanding of how several important design variables affect outcomes for high-stakes credentialing. Modeling of the item bank, the candidate population, and the statistical characteristics of test items reflected an operational credentialing exam's conditions. The studied variables were module arrangement (4 designs), amount of overall test information (4 levels), distribution of information over stages (2 variations), strategies for between-stage routing (4 levels), and pass rates (3 levels), for a total of 384 conditions. Results showed that high levels of decision accuracy (DA) and decision consistency (DC) were consistently observed, even when test information was reduced by as much as 25%. No differences due to the choice of module arrangement were found. With high overall test information, results were optimal when test information was divided equally among stages; with reduced test information, gathering more test information at Stage 1 provided the best results. Generalizing simulation study findings is always problematic.
In practice, psychometric models never completely explain candidate performance, and with MST there is always the potential for a psychological impact on candidates if shifts in test difficulty are noticed. At the same time, two findings stand out in this research: (1) with limited amounts of overall test information, it may be best to capitalize on available information with accurate branching decisions early, and (2) there may be little statistical advantage in raising test information much above 10, as the gains in reliability and validity appear minimal.
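The diminishing returns above a test information of 10 follow from the standard IRT relationship between information and precision: the conditional standard error of measurement is 1/sqrt(I), and under the common simplifying assumption of unit ability variance, reliability is roughly var(theta) / (var(theta) + SEM^2). The sketch below is illustrative only (the function names and the unit-variance assumption are mine, not from the studies):

```python
import math

def sem(info):
    """Conditional standard error of measurement under IRT: SEM = 1 / sqrt(I)."""
    return 1.0 / math.sqrt(info)

def approx_reliability(info, ability_var=1.0):
    """Rough reliability approximation, assuming ability variance of 1:
    rho ~= var(theta) / (var(theta) + SEM^2) = I / (I + 1) when var = 1."""
    return ability_var / (ability_var + sem(info) ** 2)

# Tabulate precision at several information levels to show diminishing returns.
for info in (5, 10, 15, 20, 25):
    print(f"I = {info:2d}  SEM = {sem(info):.3f}  reliability ~ {approx_reliability(info):.3f}")
```

Under these assumptions, reliability at I = 10 is already about 0.91, and doubling information to I = 20 adds less than 0.05, which is consistent with the abstract's observation that gains above 10 are minimal.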
Effects of Selected Multi-Stage Test Design Alternatives on Credentialing Examination Outcomes
Abstract One of the main concerns in designing credentialing examinations is the psychometric quality of the pass-fail decisions. The computerized multi-stage test design has been shown in previous research to provide accurate results relative to other computerized test designs, in addition to possessing certain operational advantages favored by candidates (e.g., item review). The purpose of the current research was to investigate how selected design variables in multi-stage testing interact with one another to affect the psychometric quality of pass-fail decisions and ability estimation. The four design variables studied in a computer simulation were (1) module arrangement (4 designs), (2) amount of overall test information (4 levels), (3) distribution of information over stages (2 variations), and (4) strategies for between-stage routing (4 levels), for a total of 128 conditions. Very large sample sizes were used to essentially eliminate the influence of sampling error on the findings. Many of the results were as expected, with small to negligible effects, but two of the findings seemed especially important because of their consequences for practice: (1) with limited amounts of overall test information, it appears best to distribute available information unevenly, with more of it going to stage one to improve the accuracy of classifications, and (2) there appeared to be little advantage in exceeding test information much above 10, since the gains in psychometric quality were very small; this finding has implications for the selection of test lengths and/or more effective uses of an item bank.
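The 128 conditions come from fully crossing the four factors (4 x 4 x 2 x 4). A minimal sketch of such an enumeration follows; the level counts match the abstract, but the level labels are placeholders of my own invention, not the conditions used in the study:

```python
from itertools import product

# Level counts are taken from the abstract; the labels are illustrative only.
module_arrangements = ["design-1", "design-2", "design-3", "design-4"]   # 4 designs
test_information = ["info-1", "info-2", "info-3", "info-4"]              # 4 levels
info_distribution = ["equal", "front-loaded"]                            # 2 variations
routing_strategies = ["route-1", "route-2", "route-3", "route-4"]        # 4 strategies

# Fully crossed design: every combination of factor levels is a condition.
conditions = list(product(module_arrangements, test_information,
                          info_distribution, routing_strategies))
print(len(conditions))  # 4 * 4 * 2 * 4 = 128
```

In a simulation study like this one, each tuple in `conditions` would drive one cell of the design, with large simulee samples run per cell to suppress sampling error.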
No Adult Left Behind, Either: Creating Large-Scale Computer-Based Tests for Adult Basic Education Students
Abstract Testing to chart student performance and to hold schools and districts accountable is everywhere in K-12 school systems in the United States today. These accountability and student assessment demands are equally strong in adult basic education (ABE), even though such assessments receive relatively less attention from policy makers and psychometricians. In Massachusetts, efforts to accurately and validly measure what adult learners know and can do have led to the process of creating tests that 1) are aligned to the state's ABE curriculum frameworks (i.e., criterion-referenced), 2) contain content appropriate for adult learners, and 3) use computer-based testing technology to tailor the tests to adult learning levels and to provide accurate measurement across a large achievement continuum. The vision underlying the development of these new tests centers on the recruitment and involvement of ABE teachers and administrators throughout the state. This strategy enables teachers and others to take some ownership of the testing program and formalizes the link between assessment and instruction. The purpose of this paper is to describe aspects of the process of developing computerized tests from both psychometric and practical perspectives, with particular discussion of efforts to ensure teacher involvement, curricular relevance, and measurement precision using an innovative computer-based testing design.
All rights reserved. Any or all portions of this document may be reproduced and distributed without prior permission, provided the source is cited as: Albus, D. A., & Thurlow, M. L. (2007). English language learners with disabilities in state English language proficiency assessments: A review of state accommodatio