
    Distinguishing continuous and discrete approaches to multilevel mixture IRT models: a model comparison perspective

    The current study introduced a general modeling framework, multilevel mixture IRT (MMIRT), which detects and describes characteristics of population heterogeneity while accommodating the hierarchical data structure. In addition to introducing both continuous and discrete approaches to MMIRT, the main focus of the current study was to distinguish continuous and discrete MMIRT models from a model comparison perspective. A simulation study was conducted to evaluate the impact of class separation, cluster size, proportion of mixture, and between-group ability variance on the model performance of a set of MMIRT models. The behavior of information-based fit criteria in distinguishing between discrete and continuous MMIRT models was also investigated. An empirical analysis was presented to illustrate the application of MMIRT models. Results suggested that class separation and between-group ability variance had a significant impact on MMIRT model performance. Discrete MMIRT models with fewer group-level latent classes performed consistently better on parameter and classification recovery than the continuous MMIRT model and the discrete models with more latent classes at the group level. Despite the poor performance of the continuous MMIRT model, it was favored over the discrete models by most fit indices. The AIC, AIC3, AICC, and the modifications of the AIC and ssBIC were more sensitive to the discreteness in the random effect distribution, compared to the CAIC, BIC, their modifications, and the ssBIC. The latter had a higher tendency to select the continuous MMIRT model as the best-fitting model, regardless of the true distribution of the random effects.
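The information-based fit criteria the abstract compares are all penalised versions of the maximised log-likelihood. As a minimal sketch (not from the paper; `fit_indices` and its arguments are hypothetical names), the standard textbook formulas can be computed as follows, given a model's log-likelihood, parameter count, and sample size:

```python
import math

def fit_indices(loglik, k, n):
    """Standard information criteria for model comparison.

    loglik: maximised log-likelihood, k: number of free parameters,
    n: sample size. All criteria are "smaller is better".
    """
    aic = -2 * loglik + 2 * k
    return {
        "AIC": aic,
        "AIC3": -2 * loglik + 3 * k,                       # penalty of 3 per parameter
        "AICC": aic + (2 * k * (k + 1)) / (n - k - 1),     # small-sample correction
        "BIC": -2 * loglik + k * math.log(n),
        "CAIC": -2 * loglik + k * (math.log(n) + 1),       # consistent AIC
        "ssBIC": -2 * loglik + k * math.log((n + 2) / 24), # sample-size-adjusted BIC
    }

# Example: compare two fitted models; the one with the smaller
# criterion value is selected as the better-fitting model.
ic = fit_indices(loglik=-1234.5, k=10, n=500)
```

Because BIC-type criteria penalise parameters more heavily than AIC-type criteria for realistic sample sizes, they tend to prefer more parsimonious models, which is consistent with the differential sensitivity the study reports.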

    Latent class analysis variable selection

    We propose a method for selecting variables in latent class analysis, which is the most common model-based clustering method for discrete data. The method assesses a variable's usefulness for clustering by comparing two models, given the clustering variables already selected. In one model the variable contributes information about cluster allocation beyond that contained in the already selected variables, and in the other model it does not. A headlong search algorithm is used to explore the model space and select clustering variables. In simulated datasets we found that the method selected the correct clustering variables, and also led to improvements in classification performance and in accuracy of the choice of the number of classes. In a dataset from the International HapMap Project consisting of 639 single nucleotide polymorphisms (SNPs) from 210 members of different groups, our method discovered the same group structure with a much smaller number of SNPs.
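The headlong search described above can be sketched as a greedy loop that accepts the first candidate variable whose evidence for "contributes to clustering" beats "does not". This is a simplified, inclusion-only sketch under my own assumptions (the real algorithm also considers removing variables, and the evidence would come from comparing the two fitted models, e.g. via BIC); `clustering_evidence` is a hypothetical scoring function:

```python
def headlong_search(variables, clustering_evidence, threshold=0.0):
    """Greedy headlong search over candidate clustering variables.

    At each pass, accept the FIRST candidate whose evidence difference
    (model where it informs cluster allocation vs. model where it does
    not, given the variables already selected) exceeds the threshold.
    Stops when no candidate improves the current selection.
    """
    selected, remaining = [], list(variables)
    improved = True
    while improved and remaining:
        improved = False
        for v in list(remaining):
            if clustering_evidence(selected, v) > threshold:
                selected.append(v)
                remaining.remove(v)
                improved = True
                break  # "headlong": take the first improvement found
    return selected

# Toy demonstration with fixed evidence scores: 'a' and 'b' carry
# clustering information, 'c' does not.
scores = {"a": 5.0, "b": 3.0, "c": -1.0}
chosen = headlong_search(["a", "b", "c"], lambda sel, v: scores[v])
```

Taking the first improving candidate rather than the best one is what makes the search "headlong": it trades a globally optimal step for a much cheaper sweep through the model space.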

    Dimensionality reduction of clustered data sets

    We present a novel probabilistic latent variable model to perform linear dimensionality reduction on data sets which contain clusters. We prove that the maximum likelihood solution of the model is an unsupervised generalisation of linear discriminant analysis. This provides a completely new approach to one of the most established and widely used classification algorithms. The performance of the model is then demonstrated on a number of real and artificial data sets.
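For context, the classical (supervised) linear discriminant analysis that the abstract says the model generalises finds projection directions maximising between-class relative to within-class scatter. The following is a minimal Fisher LDA sketch under my own assumptions, not the paper's probabilistic model (`lda_directions` is a hypothetical name):

```python
import numpy as np

def lda_directions(X, y, n_components=1):
    """Fisher LDA: directions maximising between-class scatter
    relative to within-class scatter (generalised eigenproblem)."""
    classes = np.unique(y)
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean).reshape(-1, 1)
        Sb += len(Xc) * (diff @ diff.T)
    # Solve Sw^{-1} Sb v = lambda v; keep top eigenvectors.
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(evals.real)[::-1]
    return evecs.real[:, order[:n_components]]

# Two well-separated clusters in 2-D; the leading direction
# projects them far apart.
X = np.array([[0, 0], [1, 0], [0, 1], [5, 5], [6, 5], [5, 6]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])
W = lda_directions(X, y)
```

The paper's contribution, by contrast, is to recover an unsupervised analogue of this solution as the maximum likelihood estimate of a probabilistic latent variable model, so no class labels `y` are required.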