115 research outputs found
Technical report on the outcomes of the national test within the State Examination (Esame di Stato) at the end of the first cycle, 2007-2008. Analysis of responses to the mathematics and Italian tests: from the properties of the questions to the evaluation of students.
The report presents the results of the statistical analyses carried out on the responses given by students in the State Examination of lower secondary school, third year, 2007-2008. The data provided by INVALSI concern the responses given to the standardized tests of Italian and mathematics.
The statistical validation of standardized achievement tests: main methodological aspects and two case studies on the assessment of learning in primary school
This work reviews some general methodologies for analysing tests for the assessment of learning, discussing the results obtained in two case studies concerning the tests prepared by the Servizio Nazionale di Valutazione (SNV) of INVALSI for the second year of primary school. In particular, it describes the pre-test analysis process through the joint use of indicators derived from Classical Test Theory and of Item Response Theory models.
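As a rough illustration of the kind of indicators this abstract refers to, the sketch below computes two standard Classical Test Theory statistics (item difficulty as proportion correct, and item discrimination as the item-rest correlation) alongside the Rasch response probability, one of the simplest Item Response Theory models. The function names and the toy response matrix are assumptions made for this example; the abstract does not say which indicators or IRT models the authors actually used.

```python
import math
from statistics import mean, pstdev


def item_difficulty(responses, item):
    """CTT difficulty: proportion of examinees who answered the item correctly."""
    return mean(row[item] for row in responses)


def item_discrimination(responses, item):
    """CTT discrimination: correlation between the item score and the rest
    score (total score with the item itself removed)."""
    x = [row[item] for row in responses]
    rest = [sum(row) - row[item] for row in responses]
    sx, sr = pstdev(x), pstdev(rest)
    if sx == 0 or sr == 0:
        return float("nan")  # undefined when either score has no variance
    mx, mr = mean(x), mean(rest)
    cov = mean((a - mx) * (b - mr) for a, b in zip(x, rest))
    return cov / (sx * sr)


def rasch_probability(theta, b):
    """Rasch model: probability of a correct answer for ability theta
    and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


# Hypothetical scored responses: rows are examinees, columns are items (1 = correct).
toy_responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]

difficulties = [item_difficulty(toy_responses, j) for j in range(3)]
discriminations = [item_discrimination(toy_responses, j) for j in range(3)]
```

In a pre-test analysis, items with extreme difficulty or with low or negative discrimination would typically be flagged for review before fitting an IRT model.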
On the use of MCMC CAT with empirical prior information to improve the efficiency of CAT
In this paper, empirical prior information about the candidate is applied in computerized adaptive testing (CAT). The main objective of CAT is to improve the efficiency of test administration. In this paper, it is shown how the inclusion of background variables both in the initialization and the ability estimation is able to improve the accuracy of ability estimates. In particular, a Gibbs sampler scheme is proposed in the phases of interim and final ability estimation. By using both simulated and real data, it is demonstrated that the method produces more accurate ability estimates, especially for short tests and when reproducing boundary abilities. This implies that operational problems of CAT related to weak measurement precision under particular conditions can be reduced as well. In the empirical example, the methods were applied to CAT for intelligence testing in the area of personnel selection. Other promising applications would be in the medical world, where testing efficiency is of paramount importance as well.
Dealing with multiple criteria in test assembly
It is quite common for tests or exams to be used for more than one purpose. First of all, they are used to measure the ability of the students in a reliable manner. In addition, they can be used for pass/fail decisions or to predict future behavior of the candidate, like future job behavior or academic performance. The question remains how to assemble a test that can be used for all these different purposes, that is, how to assemble a multi-objective test. Moreover, multiple objectives can result not only from different purposes but also from the way test specifications have been implemented. For the WDM-model, for multidimensional IRT, for Cognitive Diagnostic CAT, but also for infeasibility analysis, multiple objective test assembly problems have to be solved.
In this paper, a 2-stage method is presented for dealing with multiple objectives in test assembly. In the normalization stage, all objectives are brought onto a common scale. In the valorization stage, the different objectives are compared and related to each other. The method is applied to a Guidance Test developed at the University of Bologna, and a comparison is made with more traditional single objective test assembly methods. The results clearly demonstrate the importance and relevance of multi-objective test assembly.
Computer adaptive testing with empirical prior information: a Gibbs sampler approach for ability estimation
In this paper, empirical prior information is introduced in computer adaptive testing. Despite its increasing use, the method suffers from weak measurement precision, especially under particular conditions. Therefore, it is shown how the inclusion of background variables both in the initialization and the ability estimation is able to improve the accuracy of ability estimates. In particular, a Gibbs sampler scheme is proposed in the phases of interim and final ability estimation. By using simulated data, it is demonstrated that the method produces more accurate ability estimates, especially for short tests and when reproducing boundary abilities.
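To make the role of the empirical prior concrete, here is a minimal sketch of ability estimation in which the prior mean could come from background variables. It is not the Gibbs sampler scheme proposed in the paper: it swaps in a simpler grid-based posterior mean (EAP) under a Rasch model with a normal prior, and the function names, the grid bounds, and the example values are all assumptions made for illustration.

```python
import math


def rasch_p(theta, b):
    """Rasch probability of a correct response for ability theta, difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def eap_ability(responses, difficulties, prior_mean=0.0, prior_sd=1.0):
    """Posterior mean of ability on a fixed grid (assumed range -6 to 6).

    prior_mean stands in for a prediction from background variables,
    e.g. a regression of ability on covariates (hypothetical here).
    """
    grid = [-6.0 + 0.05 * k for k in range(241)]
    weights = []
    for theta in grid:
        # Log of the normal prior, up to an additive constant.
        log_post = -0.5 * ((theta - prior_mean) / prior_sd) ** 2
        # Add the Rasch log-likelihood of the responses observed so far.
        for u, b in zip(responses, difficulties):
            p = rasch_p(theta, b)
            log_post += math.log(p) if u == 1 else math.log(1.0 - p)
        weights.append(math.exp(log_post))
    total = sum(weights)
    return sum(t * w for t, w in zip(grid, weights)) / total


# Hypothetical short test: three items, two answered correctly.
answers = [1, 1, 0]
item_difficulties = [-1.0, 0.0, 1.0]

estimate_default_prior = eap_ability(answers, item_difficulties)
estimate_informed_prior = eap_ability(answers, item_difficulties, prior_mean=1.0, prior_sd=0.5)
```

With only a few responses the likelihood is flat, so the prior has a visible effect on the estimate; this is the short-test situation in which the abstract reports the largest gains.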
The use of predicted values for item parameters in item response theory models: an application in intelligence tests
In testing, item response theory models are widely used in order to estimate item parameters and individual abilities. However, even unidimensional models require a considerable sample size so that all parameters can be estimated precisely. The introduction of empirical prior information about candidates and items might reduce the number of candidates needed for parameter estimation. Using data for IQ measurement, this work shows how empirical information about items can be used effectively for item calibration and in adaptive testing. First, we propose multivariate regression trees to predict the item parameters based on a set of covariates related to the item solving process. Afterwards, we compare the item parameter estimation when tree fitted values are included in the estimation or when they are ignored. Model estimation is fully Bayesian, and is conducted via Markov chain Monte Carlo methods. The results are two-fold: (a) in item calibration, it is shown that the introduction of prior information is effective with short test lengths and small sample sizes; (b) in adaptive testing, it is demonstrated that the use of the tree fitted values instead of the estimated parameters leads to a moderate increase in the test length, but provides a considerable saving of resources.