Criterion-referenced measurement: Its main applications, problems and findings
The need for criterion-referenced measurements has mainly arisen from the introduction of instructional programs organized according to modern principles from educational technology. Some of these programs are discussed, and the purposes for which criterion-referenced measurements are used are indicated. Three main problems of criterion-referenced measurement are distinguished: the problem of criterion-referenced scoring and score interpretation, the problem of criterion-referenced item and test analysis, and the problem of mastery testing. For each of these problems a variety of solutions is available; it is the purpose of this paper to provide an overview of these and to introduce the reader to the original literature
Passing score and length of a mastery test
A classical problem in mastery testing is the choice of passing score and test length so that the mastery decisions are optimal. This problem has been addressed several times from a variety of viewpoints. In this paper the usual indifference zone approach is adopted with a new criterion for optimizing the passing score. It appears that, under the assumption of the binomial error model, this yields a linear relationship between optimal passing score and test length, which subsequently can be used in a simple procedure for optimizing the test length. It is indicated how different losses for both decision errors and a known base rate can be incorporated in the procedure, and how a correction for guessing can be applied. Finally, the results in this paper are related to results obtained in sequential testing and in the latent class approach to mastery testing
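As a rough indication of the kind of computation involved (a sketch under our own assumptions, not the paper's derivation), the fragment below finds, for a fixed test length, the passing score that minimizes a weighted sum of the two decision-error probabilities under the binomial error model, given an indifference zone (pi0, pi1) and losses for false passes and false fails; the numerical values are purely illustrative.

    # Illustrative passing-score selection under the binomial error model.
    # pi0, pi1 bound the indifference zone; the two losses weight the errors.
    # This generic weighted-error criterion is an assumption, not necessarily
    # the criterion proposed in the paper.
    from scipy.stats import binom

    def passing_score(n, pi0=0.50, pi1=0.80, loss_false_pass=1.0, loss_false_fail=1.0):
        best_c, best_risk = 0, float("inf")
        for c in range(n + 1):
            false_pass = binom.sf(c - 1, n, pi0)   # nonmaster at pi0 scores >= c
            false_fail = binom.cdf(c - 1, n, pi1)  # master at pi1 scores < c
            risk = loss_false_pass * false_pass + loss_false_fail * false_fail
            if risk < best_risk:
                best_c, best_risk = c, risk
        return best_c

    # Tabulating passing_score(n) against n illustrates the near-linear relation
    # between optimal passing score and test length mentioned in the abstract.
    for n in (10, 20, 40, 80):
        print(n, passing_score(n))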
Some procedures for computerized ability testing
For computerized test systems to be operational, the use of item response theory is a prerequisite. As opposed to classical test theory, in item response models the abilities of the examinees and the properties of the items are parameterized separately. Hence, when measuring the abilities of examinees, the model implicitly corrects for the item properties, and measurement on an item-independent scale is possible. In addition, item response theory offers the use of test and item information as local reliability indices defined on the ability scale. In this chapter, it is shown how the main features of item response theory have given rise to the development of promising procedures for computerized testing. Among the topics discussed are procedures for item bank calibration, automated test construction, adaptive test administration, generating norm distributions, and diagnosing test scores
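For readers who want a concrete picture of the information measures mentioned above, the sketch below computes item information under a two-parameter logistic model and uses it for maximum-information item selection in an adaptive test; the model and the parameter values are illustrative assumptions, not details taken from the chapter.

    import math

    # Two-parameter logistic (2PL) response function and its item information.
    def prob(theta, a, b):
        return 1.0 / (1.0 + math.exp(-a * (theta - b)))

    def item_info(theta, a, b):
        p = prob(theta, a, b)
        return a * a * p * (1.0 - p)      # Fisher information of the item at theta

    # Adaptive administration: select the unadministered item that is most
    # informative at the current ability estimate.
    def select_item(theta_hat, bank, administered):
        candidates = [i for i in range(len(bank)) if i not in administered]
        return max(candidates, key=lambda i: item_info(theta_hat, *bank[i]))

    bank = [(1.2, -0.5), (0.8, 0.0), (1.5, 0.7)]   # (a, b) pairs, purely illustrative
    print(select_item(0.3, bank, administered={0}))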
Decision models for use with criterion-referenced tests
The problem of mastery decisions and optimizing cutoff scores on criterion-referenced tests is considered. This problem can be formalized as an (empirical) Bayes problem with decision rules of a monotone shape. Next, the derivation of optimal cutoff scores for threshold, linear, and normal ogive loss functions is addressed, alternately using such psychometric models as the classical model, the beta-binomial model, and the bivariate normal model. One important distinction made is between decisions with an internal and an external criterion. A natural solution to the problem of reliability and validity analysis of mastery decisions is to analyze these with a standardization of the Bayes risk (coefficient delta). It is indicated how this analysis proceeds and how, in a number of cases, it leads to coefficients already known from classical test theory. Finally, some new lines of research are suggested, along with other aspects of criterion-referenced testing that can be approached from a decision-theoretic point of view
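To indicate what a threshold loss specification amounts to (the notation below is ours, not taken from the paper), let π be the true proportion correct, π_c the cutoff on the true-score scale, and l_0, l_1 the losses for falsely passing and falsely failing an examinee with observed score x:

    L(\mathrm{pass}, \pi) = l_0\,[\pi < \pi_c], \qquad
    L(\mathrm{fail}, \pi) = l_1\,[\pi \ge \pi_c],

    E[L(\mathrm{pass}) \mid x] = l_0\, P(\pi < \pi_c \mid x), \qquad
    E[L(\mathrm{fail}) \mid x] = l_1\, P(\pi \ge \pi_c \mid x).

The Bayes rule passes an examinee whenever the first posterior expected loss does not exceed the second; under the beta-binomial model the posterior probability P(π ≥ π_c | x) increases with x, so the rule reduces to a cutoff score on the observed-score scale, which is the monotone shape referred to above.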
Some thoughts on the use of decision theory to set cutoff scores: Comment on de Gruijter and Hambleton
In response to an article by de Gruijter and Hambleton (1984), some thoughts on the use of decision theory for setting cutoff scores on mastery tests are presented. This paper argues that decision theory offers much more than de Gruijter and Hambleton suggest and that any attempt at evaluating its potential for mastery testing should address the full range of possibilities. As for the problems de Gruijter and Hambleton have raised, some of them disappear if proper choices from decision theory are made, while others are inherent in mastery testing and will be encountered by any method of setting cutoff scores. Further, this paper points to the development of new technology to assist the mastery tester in applying decision theory. On these grounds an optimistic attitude towards the potential of decision theory for mastery testing is adopted
The use of test scores for classification decisions with threshold utility
The classification problem consists of assigning subjects to one of several available treatments on the basis of their test scores, where the success of each treatment is measured by a different criterion. It is indicated how this problem can be formulated as an (empirical) Bayes decision problem. As an example, the case of classification with a threshold utility function is analyzed, and optimal assignment rules are derived. The results are illustrated empirically with data from a classification problem in which achievement test data are used to assign students to appropriate continuation schools
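The sketch below illustrates such an assignment rule in its simplest form, assuming (purely for illustration) that for every treatment the probability of its criterion exceeding the success threshold, given the test score, is already available; with a threshold utility function the expected utility of a treatment is just that success probability times the utility of success, and the rule assigns each subject to the treatment for which this product is largest.

    # Illustrative assignment rule under threshold utility: utility is u_j when
    # the criterion for treatment j exceeds its threshold and 0 otherwise, so
    # expected utility = u_j * P(success_j | score). Names and values are hypothetical.
    def assign(score, success_prob, utility):
        # success_prob: treatment -> function giving P(criterion >= threshold | score)
        # utility:      treatment -> utility of a success under that treatment
        return max(success_prob, key=lambda t: utility[t] * success_prob[t](score))

    success_prob = {"school_A": lambda s: min(1.0, 0.02 * s),
                    "school_B": lambda s: 0.60}
    print(assign(25, success_prob, {"school_A": 1.0, "school_B": 1.0}))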
Estimating the parameters of Emrick's mastery testing model
Emrick’s model is a latent class or state model for mastery testing that entails a simple rule for separating masters from nonmasters with respect to a homogeneous domain of items. His method for estimating the model parameters has only restricted applicability inasmuch as it assumes a mixing parameter equal to .50 and an a priori known ratio of the two latent success probabilities. The maximum likelihood method is also available but yields an intractable system of estimation equations that can only be solved iteratively. The emphasis in this paper is on estimates that can be computed by hand but are nonetheless accurate enough for most practical situations. It is shown how the method of moments can be used to obtain such "quick and easy" estimates. In addition, an endpoint method is discussed that assumes that the parameters can be estimated from the tails of the sample distribution. A Monte Carlo experiment demonstrated that, for a great variety of parameter values, test lengths, and sample sizes, the method of moments yields excellent results and is uniformly much better than the endpoint method
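To indicate how moment estimation works in this setting (our notation; the paper's own "quick and easy" formulas may differ), write the number-correct score X on an n-item test as a mixture of two binomials with mixing proportion λ of masters and latent success probabilities p_1 (masters) and p_0 (nonmasters). The first three factorial moments of X then give three equations in the three unknowns,

    E\bigl[X(X-1)\cdots(X-r+1)\bigr]
      = n(n-1)\cdots(n-r+1)\,\bigl[\lambda\,p_1^{\,r} + (1-\lambda)\,p_0^{\,r}\bigr],
      \qquad r = 1, 2, 3,

and replacing the left-hand sides by their sample counterparts yields moment estimators of λ, p_0, and p_1.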
Binomial test models and item difficulty
In choosing a binomial test model, it is important to know exactly what conditions are imposed on item difficulty. In this paper these conditions are examined for both a deterministic and a stochastic conception of item responses. It appears that they are more restrictive than is generally understood and differ for both conceptions. When the binomial model is applied to a fixed examinee, the deterministic conception imposes no conditions on item difficulty but requires instead that all items have characteristic functions of the Guttman type. In contrast, the stochastic conception allows non-Guttman items but requires that all characteristic functions must intersect at the same point, which implies equal classically defined difficulty. The beta-binomial model assumes identical characteristic functions for both conceptions, and this also implies equal difficulty. Finally, the compound binomial model entails no restrictions on item difficulty
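For reference, the binomial and beta-binomial models discussed here take the following standard forms (a notational sketch, not the paper's own development): for a fixed examinee with domain score ζ on an n-item test,

    P(X = x \mid \zeta) = \binom{n}{x}\,\zeta^{x}(1-\zeta)^{\,n-x},
    \qquad
    P(X = x) = \binom{n}{x}\,\frac{B(\alpha + x,\; \beta + n - x)}{B(\alpha, \beta)},

where the second expression is the beta-binomial model obtained by letting ζ follow a beta(α, β) distribution across examinees, and B(·,·) is the beta function.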