
    Contribuições ao estudo de dados longitudinais na teoria de resposta ao item (Contributions to the study of longitudinal data in item response theory)

    Advisor: Caio Lucidius Naberezny Azevedo. Doctoral thesis, Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Computação Científica. Abstract: In this thesis we developed families of longitudinal Item Response Theory (IRT) models considering two approaches. The first is based on the Cholesky decomposition of the covariance matrices of interest, related to the latent traits. This modeling can accommodate several dependence structures in a straightforward way; it facilitates the choice of prior distributions for the parameters of the dependence structure, simplifies the implementation of estimation algorithms (particularly under the Bayesian paradigm), allows different (multivariate) distributions to be considered for the latent traits, and makes it easy to include regression and multilevel structures for the latent traits, among other advantages. Additionally, we developed growth curve models for the latent traits. The second approach uses a Gaussian copula function to describe the latent trait dependence structure. Unlike the first approach, the copula approach allows full control over the respective marginal latent trait distributions while still accommodating several dependence structures. We focus on dichotomous responses and explore the use of the normal and skew-normal distributions for the latent traits. We consider subjects followed over several evaluation conditions (time points) and submitted to measurement instruments that share some structure of common items. Such subjects can belong to a single group or to multiple independent groups, and we considered both balanced and unbalanced data, in the sense that inclusion and dropout of subjects are allowed. Estimation algorithms, model fit assessment tools, and model comparison tools were developed under the Bayesian paradigm through hybrid MCMC algorithms, in which the SVE (Single Variable Exchange) and Metropolis-Hastings algorithms are used when the full conditionals are not known. Simulation studies indicate that the parameters are well recovered. Furthermore, two longitudinal psychometric data sets were analyzed to illustrate our methodologies. The first comes from a large-scale longitudinal educational study conducted by the Brazilian federal government. The second was extracted from the Amsterdam Growth and Health Longitudinal Study (AGHLS), which monitors the health and lifestyle of Dutch teenagers. Doctorate in Statistics (Doutor em Estatística). Grants 162562/2014-4 and 142486/2015-9 (CNPq, CAPES).
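
    Not from the thesis: a minimal sketch of the Cholesky-based construction the abstract describes, assuming a hypothetical AR(1) covariance for T = 4 evaluation conditions; all parameter values are illustrative.

```python
import numpy as np

# Hypothetical AR(1)-type covariance for latent traits theta_1..theta_T
# across T = 4 evaluation conditions (time points).
T, rho = 4, 0.7
Sigma = rho ** np.abs(np.subtract.outer(np.arange(T), np.arange(T)))

# Cholesky decomposition Sigma = L @ L.T; the entries of L give an
# unconstrained reparameterization of the dependence structure, which
# simplifies prior specification and MCMC updates.
L = np.linalg.cholesky(Sigma)

# Simulate latent traits for n subjects: theta = z @ L.T with z ~ N(0, I),
# so Cov(theta) = L @ L.T = Sigma.
rng = np.random.default_rng(0)
theta = rng.standard_normal((1000, T)) @ L.T

print(np.round(np.cov(theta, rowvar=False), 2))  # approximately Sigma
```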

    Examination of Parameter Estimation Using Recursive Bayesian Analysis in Simulated Item Response Theory Applications

    Examination of Parameter Estimation Using Recursive Bayesian Analysis in Simulated Item Response Theory Applications, by Robert Hendrick. For the past several years, high-stakes testing has been the predominant indicator used to assess students' academic ability. School systems, teachers, parents, and students depend on the accuracy of the academic ability estimates, designated θs, produced by item response theory (IRT) computer programs. In this study, 3-parameter logistic (3PL) IRT estimates of academic ability were obtained from the BILOG-MG and WinBUGS computer programs, which were employed to compare the use of non-informative and informative priors in θ estimation. The rationale for comparing the output of these two programs is that their underlying statistical theory differs, and there may be a notable difference in the accuracy of θ estimation when an informative prior is used by WinBUGS in analyzing skewed populations. In particular, the θ parameter estimates of BILOG-MG, using traditional IRT analysis with non-informative priors in each situation, and the θ parameter estimates of WinBUGS, using Recursive Bayesian Analysis (RBA) with informative priors, are compared to the true simulated θ values using root mean square errors (RMSEs). To make this comparison, Monte Carlo computer simulation is used across three occasions within three conditions, giving nine comparison situations. For the priors and data generated, results show similar θ estimation accuracy for a normally distributed latent trait (RMSE = 0.35); more accurate θ estimation using RBA compared to traditional analysis (RMSEs of 0.36 compared to 0.76) when latent trait distributions are skewed in a similar direction; and less accurate θ estimation using RBA compared to traditional analysis (RMSEs of 1.48 compared to 0.80) when extremely skewed negative then positive distributions are used in a longitudinal setting. Implications for further research include extensions to other IRT models, developing prior elicitation equations, and applying Bayesian informative prior elicitation in BILOG-MG.
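
    Not from the dissertation: a minimal sketch of the 3PL response function and the RMSE criterion the study uses; examinee and item parameters are illustrative, and theta_hat is a stand-in for estimates that BILOG-MG or WinBUGS would produce.

```python
import numpy as np

rng = np.random.default_rng(1)

def p_3pl(theta, a, b, c):
    """3PL response function: P(correct) = c + (1 - c) / (1 + exp(-a * (theta - b)))."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

# Illustrative simulation: 500 examinees, 20 items.
n, k = 500, 20
theta_true = rng.standard_normal(n)
a = rng.uniform(0.8, 2.0, k)   # discrimination
b = rng.normal(0.0, 1.0, k)    # difficulty
c = np.full(k, 0.2)            # pseudo-guessing

responses = rng.random((n, k)) < p_3pl(theta_true[:, None], a, b, c)

# Accuracy criterion from the study: RMSE between true and estimated thetas.
# theta_hat here is a placeholder for program output, not a real estimator.
theta_hat = theta_true + rng.normal(0.0, 0.35, n)
rmse = np.sqrt(np.mean((theta_hat - theta_true) ** 2))
print(f"proportion correct = {responses.mean():.2f}, RMSE = {rmse:.2f}")
```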

    Estimating Bias in Multilevel Reliability Coefficients: A Monte Carlo Simulation

    Purpose: The purpose of this dissertation was to generate observed scores under complex data conditions often found in the real world and (a) investigate error in terms of internal consistency reliability within the Classical Test Theory framework (Cronbach's α and polychoric ordinal α) and person reliability within the Rasch Rating Scale Model (RSM); (b) inform applied researchers about possible relative bias in reliability coefficients when more complex data structures and underlying distributions are encountered; and (c) provide applied researchers a reference from which to interpret their results. Methods: Using Monte Carlo simulation techniques to generate polytomous response choices in single-level and multilevel models, sample reliability coefficients, standard errors of reliability estimates, and levels of absolute relative bias were examined and compared across a range of data conditions, including normal, mixed, and nonnormal distributions and varying sample sizes. Results: The results support taking the structure of the collected data into account during the analytic phase and provide empirical evidence that if data collected for research depend on a higher-order structure, reliability coefficients in a multilevel model are less biased than those derived from a single-level model. Additionally, results support the idea that polychoric ordinal α at level 1 of a two-level sampling design has slightly less bias than Cronbach's α across all data conditions, and under normal and mixed data distributions for person reliability; however, the small gain in the precision of reliability estimates may not be worth the additional effort of calculating polychoric ordinal α for many clinicians and educators. Recommendations for Applied Researchers: Using Cronbach's α under normal and mixed data conditions and across sample sizes is acceptable and easier to estimate due to its availability in social science software. For extremely non-normal data, the Rasch RSM should be used, since the effort is worth the lower level of bias. The results also show that a variety of data properties jointly affect reliability coefficients, and care should be taken to provide both context and a theoretical framework in which to interpret results. Keywords: reliability, Cronbach's α, polychoric ordinal α, multilevel models, multilevel confirmatory factor analysis, Rasch item response theory, rating scale model.
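
    Not from the dissertation: a minimal sketch of Cronbach's α, the classical coefficient at the center of this study, for an (n respondents × k items) score matrix; the simulated Likert data are illustrative.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha: k / (k - 1) * (1 - sum of item variances / total-score variance)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)

# Illustrative 5-point Likert responses: 200 respondents, 6 items sharing
# one common factor, rounded and clipped to the 1..5 scale.
rng = np.random.default_rng(2)
common = rng.normal(size=(200, 1))
items = np.clip(np.rint(3 + common + rng.normal(size=(200, 6))), 1, 5)
print(f"alpha = {cronbach_alpha(items):.2f}")
```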

    Complex Latent Variable Modeling in Educational Assessment

    Bayesian item response theory models have been widely used in different research fields. They support measuring constructs and modeling relationships between constructs while accounting for complex test situations (e.g., complex sampling designs, missing data, heterogeneous populations). Advantages of this flexible modeling framework, together with powerful simulation-based estimation techniques, are discussed. Furthermore, it is shown how the Bayes factor can be used to test relevant hypotheses in assessment using the College Basic Academic Subjects Examination (CBASE) data.
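
    Not from the paper: a minimal sketch of a Bayes factor for a toy hypothesis test, assuming illustrative binomial data; the CBASE analyses in the paper are far richer, but the ratio-of-marginal-likelihoods logic is the same.

```python
from scipy.stats import betabinom, binom

# Illustrative data: 78 correct responses out of 100.
y, n = 78, 100

# M0: success probability fixed at 0.5.
m0 = binom.pmf(y, n, 0.5)

# M1: success probability given a Beta(1, 1) prior, which yields a
# beta-binomial marginal likelihood.
m1 = betabinom.pmf(y, n, 1, 1)

# Bayes factor BF10 = p(y | M1) / p(y | M0): evidence for M1 over M0.
print(f"BF10 = {m1 / m0:.3g}")
```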

    Bayesian model criticism: prior sensitivity of the posterior predictive checks method

    Use of noninformative priors with the Posterior Predictive Checks (PPC) method requires more attention. Previous research on the PPC has treated noninformative priors as always noninformative in relation to the likelihood, regardless of model-data fit. However, as model-data fit deteriorates and the steepness of the likelihood's curvature diminishes, the prior can become more informative than initially intended. The objective of this dissertation was to investigate whether the specification of the prior distribution has an effect on the conclusions drawn from the PPC method. Findings indicated that the choice of discrepancy measure is an important factor in the overall success of the method, and that some discrepancy measures are affected more than others by prior specification.
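
    Not from the dissertation: a minimal sketch of a posterior predictive check, assuming a toy Poisson model with a conjugate Gamma prior; the hyperparameters a0 and b0 stand in for the prior specifications the dissertation varies, and the discrepancy measure is one illustrative choice among many.

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative observed data: 50 counts, modeled as Poisson(lambda).
y_obs = rng.poisson(4.0, size=50)

def discrepancy(y):
    # Variance-to-mean ratio: one possible discrepancy measure, sensitive
    # to overdispersion; the choice matters, per the findings.
    return y.var() / y.mean()

# Posterior for lambda under a Gamma(a0, b0) prior is
# Gamma(a0 + sum(y), b0 + n); a0 = b0 = 0.001 is a common
# "noninformative" specification.
a0, b0 = 0.001, 0.001
lam = rng.gamma(a0 + y_obs.sum(), 1.0 / (b0 + len(y_obs)), size=2000)

# Posterior predictive p-value: share of replicated data sets whose
# discrepancy meets or exceeds the observed discrepancy.
d_obs = discrepancy(y_obs)
d_rep = np.array([discrepancy(rng.poisson(l_i, size=len(y_obs))) for l_i in lam])
print(f"posterior predictive p-value = {np.mean(d_rep >= d_obs):.2f}")
```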

    Essays on Latent Variable Models and Roll Call Scaling

    This dissertation comprises three essays on latent variable models and Bayesian statistical methods for the study of American legislative institutions and the more general problems of measurement and model comparison. In the first paper, I explore the dimensionality of latent variables in the context of roll call scaling. The dimensionality of ideal points is an aspect of roll call scaling which has received significant attention due to its impact on both substantive and spatial interpretations of estimates. I find that previous evidence for unidimensional ideal points is a product of the Scree procedure. I propose a new varying dimensions model of legislative voting and a corresponding Bayesian nonparametric estimation procedure (BPIRT) that allows for probabilistic inference on the number of dimensions. Using this approach, I show that there is strong evidence for multidimensional ideal points in the U.S. Congress and that using only a single dimension misses much of the disagreement that occurs within parties. I reexamine theories of U.S. legislative voting and find that empirical evidence for these models is conditional on unidimensionality. In the second paper, I expand on the varying dimensions model of legislative voting and explore the role of group dependencies in legislative voting. Assumptions about independence of observations in the scaling model ignore the possibility that members of the voting body have shared incentives to vote as a group and lead to problems in estimating ideal points and corresponding latent dimensions. I propose a new ideal point model, clustered beta process IRT (C-BPIRT), that explicitly allows for group contributions in the underlying spatial model of voting. I derive a corresponding empirical model that uses flexible Bayesian nonparametric priors to estimate group effects in ideal points and the corresponding dimensionality of the ideal points. I apply this model to the 107th U.S. House (2001-2003) and the 88th U.S. House (1963-1965) and show how modeling group dynamics improves the estimation and interpretation of ideal points. Similarly, I show that existing methods of ideal point estimation produce results that are substantively misaligned with historical studies of the U.S. Congress. In the third and final paper, I dive into the more general problem of Bayesian model comparison and marginal likelihood computation. Various methods of computing the marginal likelihood exist, such as importance sampling or variational methods, but they frequently provide inaccurate results. I demonstrate that point estimates for the marginal likelihood achieved using importance sampling are inaccurate in settings where the joint posterior is skewed. I propose a light extension to the variational method that treats the marginal likelihood as a random variable and create a set of intervals on the marginal likelihood which do not share the same inaccuracies. I show that these new intervals, called kappa bounds, provide a computationally efficient and accurate way to estimate the marginal likelihood under arbitrarily complex Bayesian model specifications. I show the superiority of kappa bounds estimates of the marginal likelihood through a series of simulated and real-world data examples, including comparing measurement models that estimate latent variables from ordered discrete survey data. PhD, Political Science, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/163023/1/kamcal_1.pd
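
    Not from the dissertation: a minimal sketch of the importance-sampling estimator of the marginal likelihood in a toy conjugate model where the exact answer is known; it illustrates the quantity the third essay targets, not the kappa-bound method itself.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)

# Toy conjugate model: y ~ N(mu, 1) with prior mu ~ N(0, 1), so the
# marginal likelihood is available exactly as y ~ N(0, 2).
y = 1.5
exact = norm.pdf(y, loc=0.0, scale=np.sqrt(2.0))

# Importance sampling with the prior as proposal:
# m(y) = E_prior[p(y | mu)] ~= average likelihood over prior draws.
draws = rng.normal(0.0, 1.0, size=5000)
estimate = norm.pdf(y, loc=draws, scale=1.0).mean()

print(f"exact = {exact:.4f}, IS estimate = {estimate:.4f}")
# With skewed or high-dimensional posteriors the importance weights
# degenerate and such point estimates become unreliable, which is the
# failure mode motivating interval estimates like the kappa bounds.
```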

    Validating corruption risk measures: a key step to monitoring SDG progress

    The 2030 Agenda recognises corruption as a major obstacle to sustainable development and integrates its reduction among SDG targets, in view of developing peaceful, just and strong institutions. In this paper, we propose a method to assess the validity of corruption indicators within an Item Response Theory framework, which explicitly accounts for the latent and multidimensional nature of corruption. Towards this main aim, a set of fifteen red flag indicators of corruption risk in public procurement is computed on data included in the Italian National Database of Public Contracts. Results show a multidimensional structure composed of sub-groups of red flag indicators that (i) measure distinct corruption risk categories, which differ in nature, type and entity, and are generally non-superimposable; and (ii) mirror distinct dynamics related to specific SDG principles and targets.

    Construct truncation due to suboptimal person and item selection: consequences and potential solutions

    Construct truncation can be defined as the failure to reliably capture variation along the entire continuum of a construct. It can occur due to suboptimal person selection or suboptimal item selection. In this thesis, I used a series of simulation studies, coupled with real data examples, to characterise the consequences of construct truncation for the inferences made in empirical research. The analyses suggested that construct truncation has the potential to result in significant distortions of substantive conclusions. Based on these analyses, I developed recommendations for anticipating the circumstances under which construct truncation is likely to be problematic, identifying it when it occurs, and mitigating its adverse effects on substantive conclusions drawn from affected data.
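
    Not from the thesis: a minimal sketch of construct truncation via suboptimal person selection, assuming a toy bivariate-normal construct/outcome pair; it shows the attenuation of an observed correlation when the lower half of the construct's continuum is never sampled.

```python
import numpy as np

rng = np.random.default_rng(5)

# Latent construct x and an outcome correlated with it at rho = 0.5.
n, rho = 100_000, 0.5
x = rng.standard_normal(n)
outcome = rho * x + np.sqrt(1 - rho ** 2) * rng.standard_normal(n)

# Suboptimal person selection: only subjects above the construct median
# are observed, truncating the lower half of the continuum.
selected = x > np.median(x)

full_r = np.corrcoef(x, outcome)[0, 1]
trunc_r = np.corrcoef(x[selected], outcome[selected])[0, 1]
print(f"full-range r = {full_r:.2f}, truncated r = {trunc_r:.2f}")
# The truncated correlation is markedly attenuated, one of the distortions
# of substantive conclusions the thesis characterises.
```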