45 research outputs found

    Coefficients for tests from a decision theoretic point of view

    From a decision theoretic point of view a general coefficient for tests, d, is derived. The coefficient is applied to three kinds of decision situations. First, the situation is considered in which a true score is estimated by a function of the observed score of a subject on a test (point estimation). Using the squared error loss function and Kelley’s formula for estimating the true score, it is shown that d equals the reliability coefficient from classical test theory. Second, the situation is considered in which the observed scores are split into more than two categories and different decisions are made for the categories (multiple decision). The general form of the coefficient is derived, and two loss functions suited to multiple decision situations are described. It is shown that for the loss function specifying constant losses for the various combinations of categories on the true and on the observed scores, the coefficient can be computed under the assumptions of the beta-binomial model. Third, the situation is considered in which the observed scores are split into only two categories and different decisions are made for each category (dichotomous decisions). Using a loss function that specifies constant losses for combinations of categories on the true and observed score and the assumption of an increasing regression function of t on x, it is shown that coefficient d equals Loevinger’s coefficient H between true and observed scores. The coefficient can be computed under the assumption of the beta-binomial model. Finally, it is shown that for a linear loss function and Kelley’s formula for the regression of the true score on the observed score, the coefficient equals the reliability coefficient of classical test theory
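
    As a minimal sketch of the point-estimation case mentioned above, Kelley's formula regresses the observed score toward the group mean in proportion to the test's reliability. The function name and argument names below are illustrative assumptions, not taken from the paper.

```python
def kelley_true_score(observed, reliability, group_mean):
    """Kelley's regression estimate of a true score.

    Estimate = reliability * observed + (1 - reliability) * group_mean.
    All names here are hypothetical; the formula is the classical one.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    return reliability * observed + (1.0 - reliability) * group_mean


if __name__ == "__main__":
    # An observed score of 30 on a test with reliability 0.8 and group mean 20
    # is pulled part of the way back toward the mean.
    print(kelley_true_score(30.0, 0.8, 20.0))
```

    With reliability 1 the estimate equals the observed score; with reliability 0 it equals the group mean.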

    CAT for Personality Items Zeitschrift für Psychologie

    A computerized adaptive testing (CAT) procedure was simulated with ordinal polytomous personality data collected using a conventional paper-and-pencil testing format. An adapted Dutch version of the dominance scale of Gough and Heilbrun's Adjective Check List (ACL) was used. This version contained Likert response scales with five categories. Item parameters were estimated using Samejima's graded response model from the responses of 1,925 subjects. The CAT procedure was simulated using the responses of 1,517 other subjects. The value of the required standard error in the stopping rule of the CAT was manipulated. The relationship between CAT latent trait estimates and estimates based on all dominance items was studied. Additionally, the pattern of relationships between the CAT latent trait estimates and the other ACL scales was compared to that between latent trait estimates based on the entire item pool and the other ACL scales. The CAT procedure resulted in latent trait estimates qualitatively equivalent to latent trait estimates based on all items, while a substantial reduction in the number of items used could be realized (at the stopping rule of 0.4, about 33% of the 36 items were used).
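
    For readers unfamiliar with Samejima's graded response model, the sketch below computes category probabilities for one five-category Likert item as differences of adjacent cumulative logistic curves. The parameter values and the function name are illustrative assumptions; they are not estimates from the study.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities under a logistic graded response model.

    theta      : latent trait value
    a          : item discrimination (assumed positive)
    thresholds : increasing category boundaries b_1 < ... < b_{m-1}
    Returns m probabilities, one per ordered response category.
    """
    # Cumulative probabilities of responding in category k or higher,
    # padded with 1 (lowest boundary) and 0 (beyond the highest category).
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    # Each category probability is the drop between adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


if __name__ == "__main__":
    # Hypothetical item with four thresholds, hence five categories.
    print(grm_category_probs(0.0, 1.5, [-2.0, -1.0, 1.0, 2.0]))
```

    The thresholds must be increasing for every category probability to be non-negative.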

    Educational measurement

    The third edition of the volume Educational Measurement gives, like the previous two editions of Lindquist (1951) and Thorndike (1971), a comprehensive review of the state of the art of educational measurement. The volume is edited and introduced by R.L. Linn and is organized in three parts: (1) Theory and General Principles (chapters 2 through 7), (2) Construction, Administration, and Scoring (chapters 8 through 11), and (3) Applications (chapters 12 through 18). More than half of the pages are devoted to theory and general principles, and the emphasis of the review is also on this part

    Conceptual notes on models for discrete polytomous item responses

    The following types of discrete item responses are distinguished: nominal-dichotomous, ordinal-dichotomous, nominal-polytomous, and ordinal-polytomous. Bock (1972) presented a model for nominal-polytomous item responses that, when applied to dichotomous responses, yields Birnbaum’s (1968) two-parameter logistic model. Applying Bock’s model to ordinal-polytomous items leads to a conceptual problem. The ordinal nature of the response variable must be preserved; this can be achieved using three different methods. A number of existing models are derived using these three methods. The structure of these models is similar, but they differ in the interpretation and qualities of their parameters. Information, parameter invariance, log-odds differences invariance, and model violation also are discussed. Information and parameter invariance of dichotomous item response theory (IRT) also apply to polytomous IRT. Specific objectivity of the Rasch model for dichotomous items is a special case of log-odds differences invariance of polytomous items. Differential item functioning of dichotomous IRT is a special case of measurement model violation of polytomous IRT. Index terms: adjacent categories, continuation ratios, cumulative probabilities, differential item functioning, log-odds differences invariance, measurement model violation, parameter invariance, polytomous IRT models.
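
    The reduction of Bock's nominal model to the two-parameter logistic model can be checked numerically. The sketch below assumes the usual multinomial-logit form of the nominal model with slope-intercept parameters; function names and parameter values are hypothetical and chosen only for illustration.

```python
import math


def nominal_probs(theta, slopes, intercepts):
    """Category probabilities under a multinomial-logit nominal response model."""
    logits = [a * theta + c for a, c in zip(slopes, intercepts)]
    # Subtract the maximum logit before exponentiating for numerical stability.
    top = max(logits)
    weights = [math.exp(z - top) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]


def two_pl(theta, a, c):
    """Two-parameter logistic model in slope-intercept form: logit = a * theta + c."""
    return 1.0 / (1.0 + math.exp(-(a * theta + c)))


if __name__ == "__main__":
    theta = 0.5
    slopes, intercepts = [0.2, 1.4], [0.3, -0.1]
    # With two categories, only the parameter differences matter.
    p_nominal = nominal_probs(theta, slopes, intercepts)[1]
    p_2pl = two_pl(theta, slopes[1] - slopes[0], intercepts[1] - intercepts[0])
    print(p_nominal, p_2pl)
```

    In the more familiar a(theta - b) parameterization, the difficulty corresponds to b = -c / a.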

    Why Psychometrics is Not Pathological


    Optimal cutting scores using a linear loss function

    The situation is considered in which a total score on a test is used for classifying examinees into two categories: "accepted" (with scores above a cutting score on the test) and "not accepted" (with scores below the cutting score). A value on the latent variable is fixed in advance; examinees above this value are "suitable" and those below are "not suitable." Using a linear loss function, a procedure is described for computing a cutting score that minimizes the risk for the decision rule. The procedure is demonstrated with a criterion-referenced achievement test of elementary statistics administered to 167 students.

    The internal and external optimality of decisions based on tests

    In applied measurement, test scores are usually transformed to decisions. Analogous to classical test theory, the reliability of decisions has been defined as the consistency of decisions on a test and a retest or on two parallel tests. Coefficient kappa (Cohen, 1960) is used for assessing the consistency of decisions. This coefficient has been developed for assessing agreement between nominal scales. It is argued that the coefficient is not suited for assessing consistency of decisions. Moreover, it is argued that the concept consistency of decisions is not appropriate for assessing the quality of a decision procedure. It is proposed that the concept consistency of decisions be replaced by the concept optimality of the decision procedure. Two types of optimality are distinguished. The internal optimality is the risk of the decision procedure with respect to the true score the test is measuring. The external optimality is the risk of the decision procedure with respect to an external criterion. For assessing the optimality of a decision procedure, coefficient delta (van der Linden & Mellenbergh, 1978), which can be considered a standardization of the Bayes risk or expected loss, can be used. Two loss functions are dealt with: the threshold and the linear loss functions. Assuming psychometric theory, coefficient delta for internal optimality can be computed from empirical data for both the threshold and the linear loss functions. The computation of coefficient delta for external optimality needs no assumption of psychometric theory. For six tests coefficient delta as an index for internal optimality is computed for both loss functions; the results are compared with coefficient kappa for assessing the consistency of decisions with the same tests
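
    For reference, coefficient kappa (Cohen, 1960), the consistency index the abstract argues against, can be computed from a square agreement table as sketched below. The table layout (rows for decisions on the first test, columns for decisions on the second) and the counts are illustrative assumptions, not data from the paper.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square table of joint classification counts.

    table[i][j] is the number of examinees placed in category i on the
    first occasion and category j on the second (hypothetical layout).
    """
    size = len(table)
    total = sum(sum(row) for row in table)
    # Observed proportion of agreement: the diagonal of the table.
    p_observed = sum(table[i][i] for i in range(size)) / total
    # Chance agreement from the products of the marginal proportions.
    row_marginals = [sum(row) / total for row in table]
    col_marginals = [sum(table[i][j] for i in range(size)) / total for j in range(size)]
    p_expected = sum(r * c for r, c in zip(row_marginals, col_marginals))
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Made-up counts for accept / reject decisions on a test and a retest.
    print(cohens_kappa([[40, 10], [10, 40]]))
```

    Kappa is undefined when chance agreement equals 1, for example when every examinee falls in the same category on both occasions.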

    The Theoretical Status of Latent Variables

    This article examines the theoretical status of latent variables as used in modern test theory models. First, it is argued that a consistent interpretation of such models requires a realist ontology for latent variables. Second, the relation between latent variables and their indicators is discussed. It is maintained that this relation can be interpreted as a causal one but that in measurement models for interindividual differences the relation does not apply to the level of the individual person. To substantiate intraindividual causal conclusions, one must explicitly represent individual level processes in the measurement model. Several research strategies that may be useful in this respect are discussed, and a typology of constructs is proposed on the basis of this analysis. The need to link individual processes to latent variable models for interindividual differences is emphasized. Consider the following sentence: “Einstein would not have been able to come up with his E = mc² had he not possessed such an extraordinary intelligence.” What does this sentence express? It relates observable behavior (Einstein’s writing E = mc²) to an unobservable attribute (his extraordinary intelligence), and it does so by assigning to the unobservable attribute a causal role i