Statistical analysis of Q-matrix based diagnostic classification models

Authors: Chen Yunxiao, Liu Jingchen, Xu Gongjun, Ying Zhiliang
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2015

Diagnostic classification models (DCMs) have recently gained prominence in educational assessment, psychiatric evaluation, and many other disciplines. Central to the model specification is the so-called Q-matrix that provides a qualitative specification of the item-attribute relationship. In this article, we develop theories on the identifiability for the Q-matrix under the DINA and the DINO models. We further propose an estimation procedure for the Q-matrix through the regularized maximum likelihood. The applicability of this procedure is not limited to the DINA or the DINO model and it can be applied to essentially all Q-matrix based DCMs. Simulation studies show that the proposed method recovers the true Q-matrix with high probability. Furthermore, two case studies are presented. The first case is a dataset on fraction subtraction (educational application) and the second case is a subsample of the National Epidemiological Survey on Alcohol and Related Conditions concerning the social anxiety disorder (psychiatric application).
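To make the role of the Q-matrix concrete, the following minimal sketch shows how a Q-matrix row and a respondent's attribute profile determine the ideal response under the standard DINA (conjunctive) and DINO (disjunctive) formulations. The toy Q-matrix, the attribute profile, and the function names are illustrative assumptions and are not taken from the article; the slip and guess parameters follow the usual textbook parameterization.

```python
from typing import List, Sequence


def dina_ideal(alpha: Sequence[int], q_row: Sequence[int]) -> int:
    """DINA (conjunctive): 1 only if every attribute the item requires is mastered."""
    return int(all(a >= q for a, q in zip(alpha, q_row)))


def dino_ideal(alpha: Sequence[int], q_row: Sequence[int]) -> int:
    """DINO (disjunctive): 1 if at least one required attribute is mastered."""
    return int(any(a == 1 and q == 1 for a, q in zip(alpha, q_row)))


def correct_prob(ideal: int, slip: float, guess: float) -> float:
    """P(correct) is 1 - slip when the ideal response is 1, otherwise guess."""
    return 1.0 - slip if ideal == 1 else guess


# Hypothetical Q-matrix: 3 items (rows) by 2 attributes (columns).
Q: List[List[int]] = [
    [1, 0],  # item 1 requires attribute 1
    [0, 1],  # item 2 requires attribute 2
    [1, 1],  # item 3 involves both attributes
]

# Hypothetical respondent who has mastered attribute 1 only.
alpha = (1, 0)

dina_pattern = [dina_ideal(alpha, row) for row in Q]
dino_pattern = [dino_ideal(alpha, row) for row in Q]

print("DINA ideal responses:", dina_pattern)
print("DINO ideal responses:", dino_pattern)
```

The two models differ only on item 3 here: the conjunctive rule requires both attributes, while the disjunctive rule is satisfied by either one.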

CiteSeerX

LSE Research Online

PubMed Central

Identifiability of Cognitive Diagnosis Models with Polytomous Responses

Authors: Lin Mengqi, Xu Gongjun
Publication date: 07/04/2023

Cognitive Diagnosis Models (CDMs) are a powerful statistical and psychometric tool for researchers and practitioners to learn fine-grained diagnostic information about respondents' latent attributes. There has been a growing interest in the use of CDMs for polytomous response data, as more and more items with multiple response options become widely used. Similar to many latent variable models, the identifiability of CDMs is critical for accurate parameter estimation and valid statistical inference. However, the existing identifiability results are primarily focused on binary response models and have not adequately addressed the identifiability of CDMs with polytomous responses. This paper addresses this gap by presenting sufficient and necessary conditions for the identifiability of the widely used DINA model with polytomous responses, with the aim to provide a comprehensive understanding of the identifiability of CDMs with polytomous responses and to inform future research in this field.

arXiv.org e-Print Archive

Statistical Inference and Experimental Design for Q-matrix Based Cognitive Diagnosis Models

Author: Zhang Stephanie
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2014

There has been growing interest in recent years in using cognitive diagnosis models for diagnostic measurement, i.e., classification according to multiple discrete latent traits. The Q-matrix, an incidence matrix specifying the presence or absence of a relationship between each item in the assessment and each latent attribute, is central to many of these models. Important applications include educational and psychological testing; demand in education, for example, has been driven by recent focus on skills-based evaluation. However, compared to more traditional models coming from classical test theory and item response theory, cognitive diagnosis models are relatively undeveloped and suffer from several issues limiting their applicability. This thesis examines several issues related to statistical inference and experimental design for Q-matrix based cognitive diagnosis models. We begin by considering one of the main statistical issues affecting the practical use of Q-matrix based cognitive diagnosis models, the identifiability issue. In statistical models, identifiability is a prerequisite for most common statistical inferences, including parameter estimation and hypothesis testing. With Q-matrix based cognitive diagnosis models, identifiability also affects the classification of respondents according to their latent traits. We first examine the identifiability of model parameters, presenting necessary and sufficient conditions for identifiability in several settings. Depending on the area of application and the researcher's degree of control over the experiment design, fulfilling these identifiability conditions may be difficult. The second part of this thesis proposes new methods for parameter estimation and respondent classification for use with non-identifiable models. In addition, our framework allows consistent estimation of the severity of the non-identifiability problem, in terms of the proportion of the population affected by it. The implications of this measure for the design of diagnostic assessments are also discussed.

Columbia University Academic Commons

Statistical Analysis of Structured Latent Attribute Models

Author: Gu Yuqi
Publication date: 01/01/2020

In modern psychological and biomedical research with diagnostic purposes, scientists often formulate the key task as inferring the fine-grained latent information under structural constraints. These structural constraints usually come from the domain experts' prior knowledge or insight. The emerging family of Structured Latent Attribute Models (SLAMs) accommodates these modeling needs and has received substantial attention in psychology, education, and epidemiology. SLAMs bring exciting opportunities and unique challenges. In particular, with high-dimensional discrete latent attributes and structural constraints encoded by a structural matrix, one needs to balance the gain in the model's explanatory power and interpretability against the difficulty of understanding and handling the complex model structure. This dissertation studies such a family of structured latent attribute models from theoretical, methodological, and computational perspectives. On the theoretical front, we present identifiability results that advance the theoretical knowledge of how the structural matrix influences the estimability of SLAMs. The new identifiability conditions guide real-world practices of designing diagnostic tests and also lay the foundation for drawing valid statistical conclusions. On the methodology side, we propose a statistically consistent penalized likelihood approach to selecting significant latent patterns in the population in high dimensions. Computationally, we develop scalable algorithms to simultaneously recover both the structural matrix and the dependence structure of the latent attributes in ultrahigh dimensional scenarios. These developments explore an exponentially large model space involving many discrete latent variables, and they address the estimation and computation challenges of high-dimensional SLAMs arising from large-scale scientific measurements. The application of the proposed methodology to the data from international educational assessments reveals meaningful knowledge structures of the student population.

PhD, Statistics, University of Michigan, Horace H. Rackham School of Graduate Studies
https://deepblue.lib.umich.edu/bitstream/2027.42/155196/1/yuqigu_1.pd

Deep Blue Documents at the University of Michigan

Doctor honoris causa Lisa Randall: speech read at the investiture ceremony held in the assembly hall of the Rectorat building on 25 March 2019

Authors: Pumarola Suñé Tomàs, Randall Lisa
Publication date: 01/01/2019

Diposit Digital de Documents de la UAB

Detecting stochastic dominance for poset-valued random variables as an example of linear programming on closure systems

Authors: Augustin Thomas, Jansen Christoph, Schollmeyer Georg
Publication date: 31/08/2017

In this paper we develop a linear programming method for detecting stochastic dominance for random variables with values in a partially ordered set (poset) based on the upset-characterization of stochastic dominance. The proposed detection procedure is based on a descriptively interpretable statistic, namely the maximal probability-difference of an upset. We show how our method is related to the general task of maximizing a linear function on a closure system. Since closure systems are describable via their valid formal implications, we can use here ingredients of formal concept analysis. We also address the question of inference via resampling and via conservative bounds given by the application of Vapnik-Chervonenkis theory, which also allows for an adequate pruning of the envisaged closure system that allows for the regularization of the test statistic (by paying a price of less conceptual rigor). We illustrate the developed methods by applying them to a variety of data examples, concretely to multivariate inequality analysis, item impact and differential item functioning in item response theory and to the analysis of distributional differences in spatial statistics. The power of regularization is illustrated with a data example in the context of cognitive diagnosis models.
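The statistic mentioned in this abstract, the maximal probability difference over upsets, can be illustrated on a tiny poset. The sketch below enumerates all upsets by brute force, which is a substitute for the paper's linear programming approach and is only feasible for very small posets. The poset, the two distributions, the sign convention of the difference, and all function names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations
from typing import Callable, Dict, Iterator, List, Set, Tuple

Point = Tuple[int, int]


def leq(x: Point, y: Point) -> bool:
    """Componentwise partial order on pairs."""
    return all(a <= b for a, b in zip(x, y))


def upsets(elements: List[Point], order: Callable[[Point, Point], bool]) -> Iterator[Set[Point]]:
    """Yield every upward-closed subset (brute force over all subsets)."""
    for r in range(len(elements) + 1):
        for subset in combinations(elements, r):
            s = set(subset)
            # s is an upset if everything above a member is also a member.
            if all(y in s for x in s for y in elements if order(x, y)):
                yield s


def max_upset_difference(
    elements: List[Point],
    order: Callable[[Point, Point], bool],
    p: Dict[Point, Fraction],
    q: Dict[Point, Fraction],
) -> Fraction:
    """Largest value of p(U) - q(U) over all upsets U (assumed sign convention)."""
    return max(
        sum((p[e] for e in u), Fraction(0)) - sum((q[e] for e in u), Fraction(0))
        for u in upsets(elements, order)
    )


# Hypothetical poset: the four points of {0,1} x {0,1}.
elements: List[Point] = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Hypothetical distributions: p is uniform, q puts half its mass on each extreme.
p = {e: Fraction(1, 4) for e in elements}
q = {(0, 0): Fraction(1, 2), (0, 1): Fraction(0), (1, 0): Fraction(0), (1, 1): Fraction(1, 2)}

n_upsets = sum(1 for _ in upsets(elements, leq))
stat = max_upset_difference(elements, leq, p, q)
print("number of upsets:", n_upsets)
print("max upset difference p - q:", stat)
```

Exact fractions are used so the result is not affected by floating-point rounding. Because the empty set is an upset, the statistic is never negative under this convention.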