96 research outputs found

    DISTINGUISHING CONTINUOUS AND DISCRETE APPROACHES TO MULTILEVEL MIXTURE IRT MODELS: A MODEL COMPARISON PERSPECTIVE

    The current study introduced a general modeling framework, multilevel mixture IRT (MMIRT), which detects and describes characteristics of population heterogeneity while accommodating the hierarchical data structure. In addition to introducing both continuous and discrete approaches to MMIRT, the main focus of the current study was to distinguish continuous and discrete MMIRT models from a model comparison perspective. A simulation study was conducted to evaluate the impact of class separation, cluster size, proportion of mixture, and between-group ability variance on the performance of a set of MMIRT models. The behavior of information-based fit criteria in distinguishing between discrete and continuous MMIRT models was also investigated. An empirical analysis was presented to illustrate the application of MMIRT models. Results suggested that class separation and between-group ability variance had a significant impact on MMIRT model performance. Discrete MMIRT models with fewer group-level latent classes performed consistently better on parameter and classification recovery than the continuous MMIRT model and the discrete models with more latent classes at the group level. Despite the poor performance of the continuous MMIRT model, it was favored over the discrete models by most fit indices. The AIC, AIC3, AICC, and the modification of AIC and ssBIC were more sensitive to the discreteness in the random effect distribution, compared to the CAIC, BIC, their modifications, and ssBIC. The latter had a higher tendency to select the continuous MMIRT model as the best fitting model, regardless of the true distribution of random effects.
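
The abstract above names several information criteria without defining them. A minimal sketch, assuming the standard textbook definitions, is given below. The function name `information_criteria` is hypothetical, the study's "modifications" of these indices are not reproduced, and the choice of `n_obs` (persons versus groups in a multilevel design) is an assumption left to the reader.

```python
import math


def information_criteria(log_lik, n_params, n_obs):
    """Return common information criteria for one fitted model.

    Smaller values indicate a preferred model. Formulas are the
    standard definitions, not taken from the study itself.
    """
    dev = -2.0 * log_lik          # deviance
    k, n = n_params, n_obs
    aic = dev + 2 * k
    return {
        "AIC": aic,
        "AIC3": dev + 3 * k,
        "AICC": aic + 2 * k * (k + 1) / (n - k - 1),  # small-sample correction
        "BIC": dev + k * math.log(n),
        "CAIC": dev + k * (math.log(n) + 1),
        "ssBIC": dev + k * math.log((n + 2) / 24),    # sample-size adjusted BIC
    }


# Hypothetical fit results for two competing models (illustrative numbers only).
continuous = information_criteria(log_lik=-5120.4, n_params=42, n_obs=2000)
discrete = information_criteria(log_lik=-5105.9, n_params=55, n_obs=2000)
for name in continuous:
    preferred = "continuous" if continuous[name] < discrete[name] else "discrete"
    print(f"{name}: prefers {preferred}")
```

Because the penalty per parameter grows from AIC to BIC and CAIC, the heavier-penalty indices tend to favor the model with fewer parameters, which is one way indices can disagree on the same pair of fits.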

    Nonparametric diagnostic classification analysis for testlet based tests

    Diagnostic Classification Models (DCMs) are multidimensional confirmatory latent class models that can classify individuals into different classes based on their attribute mastery profiles. While DCMs represent the more prevalent parametric approach to diagnostic classification analysis, the Hamming distance method, a newly developed nonparametric diagnostic classification method, is quite promising in that it does not require fitting a statistical model and is less demanding on sample size. However, both the parametric and nonparametric approaches assume local item independence, which is often violated by testlet-based tests. This study proposed a conditional-correlation based nonparametric approach to assess testlet effects and a set of testlet Hamming distance methods to account for the testlet effects in classification analyses. Simulation studies were conducted to evaluate the proposed methods. In the conditional-correlation approach, the testlet effects were computed as the average item-pair correlations within the same testlet by conditioning on attribute profiles. The inverse of the testlet effect was then used in the testlet Hamming distance method to weight the Hamming distances for that particular testlet. Simulation studies were conducted to evaluate the proposed methods in conditions with varying sample size, testlet effect size, testlet size, balance of testlet size, and balance of testlet effect size. Although the conditional-correlation based approach often underestimated true testlet effect sizes, it was still able to detect the relative size of different testlet effects. The developed testlet Hamming distance methods seem to be an improvement over the estimation methods that ignore testlet effects because they provided slightly higher classification accuracy where large testlet effects were present. In addition, the Hamming distance method and maximum likelihood estimation are robust to local item dependency caused by low to moderate testlet effects. Recommendations for practitioners and study limitations were provided.
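
The weighted Hamming distance idea described above can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: the conjunctive ideal-response rule, the function names `ideal_responses` and `classify`, and the example weights are all hypothetical. Each examinee is assigned the attribute profile whose ideal response pattern has the smallest (optionally weighted) Hamming distance to the observed responses.

```python
from itertools import product

import numpy as np


def ideal_responses(q_matrix):
    """Enumerate all attribute profiles and their ideal item responses.

    Assumes a conjunctive rule: an item is answered correctly only if
    every attribute it requires (per the Q-matrix) is mastered.
    """
    q = np.asarray(q_matrix)
    profiles = np.array(list(product([0, 1], repeat=q.shape[1])))
    eta = (profiles @ q.T == q.sum(axis=1)).astype(int)
    return profiles, eta


def classify(responses, q_matrix, item_weights=None):
    """Assign each examinee the profile with minimal weighted Hamming distance."""
    y = np.asarray(responses)
    profiles, eta = ideal_responses(q_matrix)
    if item_weights is None:
        w = np.ones(y.shape[1])               # plain Hamming distance
    else:
        w = np.asarray(item_weights, dtype=float)
    mismatch = y[:, None, :] != eta[None, :, :]   # (persons, profiles, items)
    dist = (mismatch * w).sum(axis=2)
    return profiles[dist.argmin(axis=1)]          # ties go to the first profile


# Hypothetical 3-item, 2-attribute test; items 2 and 3 form one testlet
# and are down-weighted by the inverse of an assumed testlet effect of 2.0.
Q = [[1, 0], [0, 1], [1, 1]]
weights = [1.0, 1 / 2.0, 1 / 2.0]
print(classify([[1, 0, 0], [1, 1, 1]], Q, item_weights=weights))
```

Down-weighting items in a strongly dependent testlet keeps their correlated mismatches from dominating the distance, which is the motivation the abstract gives for using the inverse testlet effect as a weight.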

    A two-step estimator for multilevel latent class analysis with covariates

    We propose a two-step estimator for multilevel latent class analysis (LCA) with covariates. The measurement model for observed items is estimated in the first step, and in the second step covariates are added to the model, keeping the measurement model parameters fixed. We discuss model identification and derive an Expectation-Maximization algorithm for efficient implementation of the estimator. By means of an extensive simulation study we show that (i) this approach performs similarly to existing stepwise estimators for multilevel LCA but with much reduced computing time, and (ii) it yields approximately unbiased parameter estimates with a negligible loss of efficiency compared to the one-step estimator. The proposal is illustrated with a cross-national analysis of predictors of citizenship norms. Comment: Manuscript version accepted for publication in Psychometrik

    Teacher involvement in the development of confidential assessment materials. Consultation


    From OLS to Multilevel Multidimensional Mixture IRT: A Model Refinement Approach to Investigating Patterns of Relationships in PISA 2012 Data

    Thesis advisor: Henry I. Braun. Secondary analyses of international large-scale assessments (ILSA) commonly characterize relationships between variables of interest using correlations. However, the accuracy of correlation estimates is impaired by artefacts such as measurement error and clustering. Despite advancements in methodology, conventional correlation estimates or statistical models not addressing this problem are still commonly used when analyzing ILSA data. This dissertation examines the impact of both the clustered nature of the data and heterogeneous measurement error on the correlations reported between background data and proficiency scales across countries participating in ILSA. In this regard, the operating characteristics of competing modeling techniques are explored by means of applications to data from PISA 2012. Specifically, the estimates of correlations between math self-efficacy and math achievement across countries are the principal focus of this study. Sequentially employing four different statistical techniques, a step-wise model refinement approach is used. After each step, the changes in the within-country correlation estimates are examined in relation to (i) the heterogeneity of distributions, (ii) the amount of measurement error, (iii) the degree of clustering, and (iv) country-level math performance. The results show that correlation estimates gathered from two-dimensional IRT models are more similar across countries in comparison to conventional and multilevel linear modeling estimates. The strength of the relationship between math proficiency and math self-efficacy is moderated by country mean math proficiency, and this was found to be consistent across all four models even when measurement error and clustering were taken into account. Multilevel multidimensional mixture IRT modeling results support the hypothesis that low-performing groups within countries have a lower correlation between math self-efficacy and math proficiency. A weaker association between math self-efficacy and math proficiency in lower achieving groups is consistently seen across countries. A multilevel mixture IRT modeling approach sheds light on how this pattern emerges from greater randomness in the responses of lower performing groups. The findings from this study demonstrate that advanced modeling techniques not only are more appropriate given the characteristics of the data, but also provide greater insight about the patterns of relationships across countries. Thesis (PhD), Boston College, 2021. Submitted to: Boston College. Lynch School of Education. Discipline: Educational Research, Measurement and Evaluation.

    Bayesian Estimation of Mixture IRT Models using NUTS

    The No-U-Turn Sampler (NUTS) is a relatively new Markov chain Monte Carlo (MCMC) algorithm that avoids the random walk behavior that common MCMC algorithms such as Gibbs sampling or Metropolis-Hastings usually exhibit. Because NUTS can efficiently explore the entire space of the target distribution, the sampler converges to high-dimensional target distributions more quickly than other MCMC algorithms and is hence less computationally expensive. The focus of this study is on applying NUTS to one of the complex IRT models, specifically the two-parameter mixture IRT (Mix2PL) model, and on examining its performance in estimating model parameters when sample size, test length, and number of latent classes are manipulated. The results indicate that overall, NUTS performs well in recovering model parameters. However, the recovery of the class membership of individual persons is not satisfactory for the three-class conditions. Also, the results indicate that WAIC performs better than LOO in recovering the number of latent classes, in terms of the proportion of the time the correct model was selected as the best fitting model. However, when the effective number of parameters was also considered in selecting the best fitting model, both fully Bayesian fit indices perform equally well. In addition, the results suggest that when multiple latent classes exist, neither fully Bayesian fit index (WAIC or LOO) would select the conventional IRT model. On the other hand, when all examinees came from a single unified population, fitting MixIRT models using NUTS causes problems in convergence.
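
The structure of a two-parameter mixture IRT model can be illustrated with a short sketch. The abstract does not state the parameterization, so the logistic form a(theta - b), the function names `mix2pl_prob` and `class_posterior`, and all numeric values below are assumptions for illustration. Each latent class has its own item discriminations and difficulties, and class membership for one person follows from Bayes' rule given the ability value.

```python
import numpy as np


def mix2pl_prob(theta, a, b):
    """Class-specific probability of a correct response under a 2PL form.

    a, b have shape (n_classes, n_items); the a * (theta - b)
    parameterization is an assumption, not taken from the study.
    """
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))


def class_posterior(y, theta, a, b, pi):
    """Posterior class membership for one response vector, given theta."""
    p = mix2pl_prob(theta, a, b)
    lik = np.prod(np.where(y == 1, p, 1.0 - p), axis=1)  # per-class likelihood
    post = pi * lik
    return post / post.sum()


# Hypothetical two-class, three-item example.
a = np.array([[1.0, 1.2, 0.8], [1.5, 0.9, 1.1]])
b = np.array([[-0.5, 0.0, 0.5], [0.5, 1.0, -1.0]])
pi = np.array([0.6, 0.4])          # mixing proportions
y = np.array([1, 0, 1])            # one person's responses
print(class_posterior(y, theta=0.3, a=a, b=b, pi=pi))
```

In a full Bayesian analysis such as the one the abstract describes, the discrete class indicator is typically summed out of the likelihood in this way, since gradient-based samplers like NUTS operate on continuous parameters.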

    A multilevel latent Markov model for the evaluation of nursing homes' performance

    The periodic evaluation of health care services is a primary concern for many institutions. In this work, we focus on nursing home services with the aim of producing a ranking of a set of nursing homes based on their capability to improve - or at least to keep unchanged - the health status of the patients they host. As the overall health status is not directly observable, latent variable models represent a suitable approach. Moreover, given the longitudinal and multilevel structure of the available data, we rely on a multilevel latent Markov model where patients and nursing homes are the first and the second level units, respectively. The model includes individual covariates to account for the patient case-mix, and the impact of nursing home membership is modeled through a pair of correlated random effects affecting the initial distribution and the transition probabilities between different levels of health status. Through the prediction of these random effects we obtain a ranking of the nursing homes. Furthermore, the proposed model is designed to address non-ignorable dropout, which typically occurs in these contexts because some elderly patients die before completing the survey. We apply our model to the Long Term Care Facilities dataset, a longitudinal dataset gathered from Regione Umbria (Italy). Our results are robust to the sensitivity parameter involved (the number of latent states) and show that differences in nursing homes' performances are statistically significant.

    Bayesian estimation of latent trait distributions considering hierarchical structures and partially missing covariate data

    Large-scale studies in social sciences often involve the measurement of latent constructs and seek to investigate their relationship with additional variables in subsequent analyses. Within this context the analyst has to face three problems: First, there is uncertainty arising from the particular indicators that measure the trait of interest. Second, large-scale studies typically exhibit hierarchical structures caused by sampling design or a composite population consisting of clustered observations. Third, uncertainty arises due to the presence of missing values in covariates related to the latent construct. This thesis provides a Bayesian estimation strategy that simultaneously addresses all three issues. I start out with the class of latent regression item response models, which combine the fields of measurement models and structural analysis, and develop a novel algorithm based on the device of data augmentation. Binary and ordered polytomous items can both be included in the analysis. Population heterogeneity is taken into account either through multigroup, finite mixture or random intercept specifications. Sampling from the posterior distribution of parameters is enriched by sampling from the full conditional distributions of missing values in person covariates. Approximations for the distributions of missing values are constructed from classification and regression trees, thus allowing for high flexibility in the incorporation of metric as well as categorical variables and nonlinear relationships. The validity of the proposed strategy is evaluated with respect to statistical accuracy by two simulation studies controlling the missing data generating mechanism. I show that the novel algorithm is capable of recovering all involved parameters in each of the two scenarios and clearly outperforms stochastic regression imputation and complete-case analysis. Two illustrations using data from the National Educational Panel Study on mathematical abilities and eating disorders of ninth grade students demonstrate the empirical usefulness of the method. Finally, I introduce an R package which implements the estimation routines presented in the thesis.