130 research outputs found

    The Crit coefficient in Mokken scale analysis: A simulation study and an application in quality-of-life research

    PURPOSE: In Mokken scaling, the Crit index was proposed, and is sometimes used, as evidence (or lack thereof) of violations of some common model assumptions. Our study had two main goals: to make the formulation of the Crit index explicit and accessible, and to investigate its distribution under various measurement conditions. METHODS: We conducted two simulation studies in the context of dichotomously scored item responses. We manipulated the type of assumption violation, the proportion of violating items, sample size, and quality. False positive rates and power to detect assumption violations were our main outcome variables. Furthermore, we applied the Crit coefficient in a Mokken scale analysis of responses to the General Health Questionnaire (GHQ-12), a self-administered questionnaire for assessing current mental health. RESULTS: We found that the false positive rates of Crit were close to the nominal rate in most conditions, and that power to detect misfit depended on the sample size, the type of violation, and the number of assumption-violating items. Overall, in small samples Crit lacked the power to detect misfit, and in larger samples power differed considerably depending on the type of violation and the proportion of misfitting items. Our empirical example also showed that even in large samples the Crit index may fail to detect assumption violations. DISCUSSION: Even in large samples, the Crit coefficient showed limited usefulness for detecting moderate and severe violations of monotonicity. Our findings are relevant to researchers and practitioners who use Mokken scaling for scale and questionnaire construction and revision. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s11136-021-02924-z.
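    The abstract above does not reproduce the Crit formula itself, but the scalability coefficient H on which Mokken scale analysis rests can be sketched for dichotomously scored items. The function and the toy data matrix below are illustrative assumptions, not the authors' code: H compares observed Guttman errors across item pairs with the errors expected under marginal independence.

    ```python
    import numpy as np

    def scalability_H(X):
        """Loevinger/Mokken scalability coefficient H for a dichotomous
        item-score matrix X (rows = persons, columns = items).

        H = 1 - (observed Guttman errors) / (errors expected under
        marginal independence), summed over all item pairs.
        """
        n, k = X.shape
        p = X.mean(axis=0)                  # item popularities
        F = 0.0                             # observed Guttman errors
        E = 0.0                             # expected errors
        for i in range(k):
            for j in range(i + 1, k):
                # order the pair so 'easy' is the more popular item
                easy, hard = (i, j) if p[i] >= p[j] else (j, i)
                # Guttman error: endorse the hard item but not the easy one
                F += np.sum((X[:, hard] == 1) & (X[:, easy] == 0))
                E += n * p[hard] * (1 - p[easy])
        return 1.0 - F / E

    # a perfect Guttman scale has no errors, so H = 1
    X = np.array([[0, 0, 0],
                  [1, 0, 0],
                  [1, 1, 0],
                  [1, 1, 1]])
    print(scalability_H(X))  # → 1.0
    ```

    Real data would be analyzed with dedicated software (e.g. the mokken package in R), which also reports the Crit values discussed in the abstract.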

    The Use of Nonparametric Item Response Theory to Explore Data Quality

    The aim of this chapter is to provide insight into a number of commonly used nonparametric item response theory (NIRT) methods and to show how these methods can be used to describe and explore the psychometric quality of questionnaires used in patient-reported outcome measurement and, more generally, typical performance measurement (personality, mood, health-related constructs). NIRT is an extremely valuable tool for preliminary data analysis and for evaluating whether item response data are acceptable for parametric IRT modeling. This is particularly useful in the field of typical performance measurement, where the construct being measured is often very different from that in maximum performance measurement (education, intelligence; see Chapter 1 of this handbook). Our basic premise is that there are no "best tools" or "best models" and that the usefulness of psychometric modeling depends on the specific aims of the instrument (questionnaire, test) being used. Most important, however, is that it be clear to a researcher how sensitive a specific method (for example, DETECT, or Mokken scaling) is to the assumptions being investigated. The NIRT literature is not always clear about this, and in this chapter we try to clarify some of these ambiguities.
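    One of the NIRT checks this kind of exploratory analysis relies on is manifest monotonicity: the proportion endorsing an item should be nondecreasing in the rest score (the total on the remaining items). A minimal sketch, with an illustrative function name and toy data rather than any method from the chapter:

    ```python
    import numpy as np

    def monotonicity_violations(X, item):
        """Check manifest monotonicity for one dichotomous item:
        the endorsement proportion should be nondecreasing in the
        rest score (total score on the remaining items).

        Returns (rest-score levels, endorsement proportions, #violations).
        """
        rest = X.sum(axis=1) - X[:, item]   # rest score per person
        levels = np.unique(rest)
        props = np.array([X[rest == r, item].mean() for r in levels])
        # a violation: the proportion drops at a higher rest score
        violations = int(np.sum(np.diff(props) < 0))
        return levels, props, violations

    X = np.array([[0, 0, 0],
                  [0, 1, 0],
                  [1, 1, 0],
                  [1, 1, 1]])
    levels, props, v = monotonicity_violations(X, 0)
    print(v)  # → 0 (no decreases: proportions are 0.0, 0.5, 1.0)
    ```

    In practice, rest scores are grouped into broader intervals to stabilize the proportions before checking for decreases; that refinement is omitted here.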

    On the Practical Consequences of Misfit in Mokken Scaling

    Mokken scale analysis is a popular method to evaluate the psychometric quality of clinical and personality questionnaires and their individual items. Although many empirical papers report on the extent to which sets of items form Mokken scales, less attention has been paid to the effect of violations of commonly used rules of thumb. In this study, the authors investigated the practical consequences of retaining or removing items with psychometric properties that do not comply with these rules of thumb. Using simulated data, they concluded that items with low scalability had some influence on the reliability of test scores, person ordering and selection, and criterion-related validity estimates. Removing the misfitting items from the scale had, in general, a small effect on the outcomes. Although important outcome variables were fairly robust against scale violations in some conditions, the authors conclude that researchers should not rely exclusively on algorithms allowing automatic selection of items. In particular, content validity must be taken into account to build sensible psychometric instruments.
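    The reliability outcome mentioned above is, for dichotomous items, typically quantified with KR-20 (Cronbach's alpha for 0/1 scores). A minimal sketch of that coefficient, under the assumption of a person-by-item 0/1 matrix; this is not the authors' simulation code:

    ```python
    import numpy as np

    def kr20(X):
        """KR-20 reliability (Cronbach's alpha for dichotomous items)
        of an item-score matrix X (rows = persons, columns = items).
        Uses population (ddof=0) variances throughout for consistency."""
        k = X.shape[1]
        p = X.mean(axis=0)                  # item proportions correct
        total_var = X.sum(axis=1).var()     # variance of total scores
        return (k / (k - 1)) * (1 - np.sum(p * (1 - p)) / total_var)

    # two perfectly parallel items give KR-20 = 1
    X = np.array([[0, 0], [1, 1], [0, 0], [1, 1]])
    print(kr20(X))  # → 1.0
    ```

    Comparing kr20 on a scale with and without a weakly scalable item gives a quick impression of the kind of reliability effect the study examined.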

    Admission testing for higher education: A multi-cohort study on the validity of high-fidelity curriculum-sampling tests

    We investigated the validity of curriculum-sampling tests for admission to higher education in two studies. Curriculum-sampling tests mimic representative parts of an academic program to predict future academic achievement. In the first study, we investigated the predictive validity of a curriculum-sampling test for first-year academic achievement across three cohorts of undergraduate psychology applicants, and for academic achievement after three years in one cohort. We also studied the relationship between the test scores and enrollment decisions. In the second study, we examined the cognitive and noncognitive construct saturation of curriculum-sampling tests in a sample of psychology students. The curriculum-sampling tests showed high predictive validity for first-year and third-year academic achievement, mostly comparable to the predictive validity of high school GPA. In addition, curriculum-sampling test scores showed incremental validity over high school GPA. Applicants who scored low on the curriculum-sampling tests decided not to enroll in the program more often, indicating that curriculum-sampling admission tests may also promote self-selection. Contrary to expectations, the curriculum-sampling test scores did not show any relationships with cognitive ability, but there were some indications of noncognitive saturation, mostly for perceived test competence. So, curriculum-sampling tests can serve as efficient admission tests that yield high predictive validity. Furthermore, when self-selection or student-program fit are major objectives of admission procedures, curriculum-sampling tests may be preferred over, or used in addition to, high school GPA.
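    The incremental validity claim in this abstract is conventionally assessed as the R² gain when the test score is added to a regression that already contains high school GPA. A minimal sketch on simulated data; the coefficients, sample size, and variable names are illustrative assumptions, not the study's data:

    ```python
    import numpy as np

    def r_squared(X, y):
        """R^2 of an OLS regression of y on the columns of X (intercept added)."""
        X1 = np.column_stack([np.ones(len(y)), X])
        beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
        resid = y - X1 @ beta
        return 1.0 - resid.var() / y.var()

    rng = np.random.default_rng(0)
    n = 500
    gpa = rng.normal(size=n)                    # high school GPA (standardized)
    test = 0.5 * gpa + rng.normal(size=n)       # curriculum-sampling test score
    achievement = 0.4 * gpa + 0.4 * test + rng.normal(size=n)

    r2_gpa = r_squared(gpa[:, None], achievement)
    r2_both = r_squared(np.column_stack([gpa, test]), achievement)
    print(f"R2 change = {r2_both - r2_gpa:.3f}")  # incremental validity of the test
    ```

    Because the models are nested, r2_both can never fall below r2_gpa; the size of the gain is what carries the incremental-validity interpretation.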