Using Connectionist Models to Evaluate Examinees’ Response Patterns to Achievement Tests
The attribute hierarchy method (AHM) applied to assessment engineering is described. It is a psychometric method for classifying examinees’ test item responses into a set of attribute mastery patterns associated with different components in a cognitive model of task performance. Attribute probabilities, computed using a neural network, can be estimated for each examinee, thereby providing specific information about the examinee’s attribute-mastery level. The pattern recognition approach described in this study relies on an explicit cognitive model to produce the expected response patterns, which serve as the input to the neural network. The model also yields the cognitive test specifications, which identify the examinees’ attribute patterns used as output for the neural network. The purpose of the statistical pattern recognition analysis is to estimate the probability that an examinee possesses specific attribute combinations based on the examinee’s observed item response patterns. Two examples using student response data from a sample of algebra items on the SAT illustrate our pattern recognition approach.
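As a minimal sketch of the pattern-recognition step, the Python example below trains a single-layer logistic network that maps expected response patterns to attribute-mastery patterns, then estimates attribute probabilities for an observed response pattern. The three-item, two-attribute cognitive model here is hypothetical (not the SAT data used in the study), and the network is deliberately simpler than the one the method describes.

```python
import math
import random

# Hypothetical cognitive model: 3 items measuring 2 hierarchically ordered
# attributes. Each expected response pattern is paired with the
# attribute-mastery pattern that produces it.
expected_patterns = [
    [0, 0, 0],  # no attributes mastered
    [1, 0, 0],  # attribute 1 only
    [1, 1, 1],  # attributes 1 and 2
]
attribute_patterns = [
    [0, 0],
    [1, 0],
    [1, 1],
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

random.seed(0)
n_items, n_attrs = 3, 2
W = [[random.uniform(-0.5, 0.5) for _ in range(n_items)] for _ in range(n_attrs)]
b = [0.0] * n_attrs

# Train one logistic unit per attribute with gradient descent
# on the cross-entropy loss.
for _ in range(2000):
    for x, y in zip(expected_patterns, attribute_patterns):
        for a in range(n_attrs):
            p = sigmoid(sum(W[a][i] * x[i] for i in range(n_items)) + b[a])
            grad = p - y[a]
            for i in range(n_items):
                W[a][i] -= 0.5 * grad * x[i]
            b[a] -= 0.5 * grad

def attribute_probabilities(response_pattern):
    """Estimated probability that an examinee has mastered each attribute."""
    return [sigmoid(sum(W[a][i] * response_pattern[i] for i in range(n_items)) + b[a])
            for a in range(n_attrs)]

# An examinee who answers only item 1 correctly should show high probability
# of mastering attribute 1 and low probability of mastering attribute 2.
probs = attribute_probabilities([1, 0, 0])
```

The trained unit per attribute plays the role of the neural network's output layer: expected response patterns in, attribute probabilities out.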
Three Applications of Automated Test Assembly within a User-Friendly Modeling Environment
While linear programming is a common tool in business and industry, it has seen few applications in educational assessment, and only a handful of individuals have been actively involved in conducting psychometric research in this area. Perhaps this is due, at least in part, to the complexity of existing software packages. This article presents three applications of linear programming to automate test assembly using an add-in to Microsoft Excel 2007. These increasingly complex examples permit the reader to readily see and manipulate the programming objectives and constraints within a familiar modeling environment. A spreadsheet used in this demonstration is available for downloading. Accessed 12,243 times on https://pareonline.net from June 21, 2009 to December 31, 2019.
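The article solves these problems with linear programming inside Excel; as a stdlib-only sketch of the same objective-and-constraint structure, the following Python example assembles a fixed-length test form from a hypothetical item bank by brute-force enumeration (feasible for toy banks, not a replacement for an LP solver):

```python
from itertools import combinations

# Hypothetical item bank: (item id, content area, item information).
bank = [
    ("i1", "algebra",  0.42), ("i2", "algebra",  0.35),
    ("i3", "geometry", 0.51), ("i4", "geometry", 0.28),
    ("i5", "number",   0.44), ("i6", "number",   0.31),
]

TEST_LENGTH = 4      # constraint: exactly four items on the form
MIN_PER_AREA = 1     # constraint: at least one item per content area

def feasible(form):
    areas = [area for _, area, _ in form]
    return all(areas.count(a) >= MIN_PER_AREA
               for a in {"algebra", "geometry", "number"})

# Objective: maximize total information subject to the constraints.
best = max(
    (form for form in combinations(bank, TEST_LENGTH) if feasible(form)),
    key=lambda form: sum(info for _, _, info in form),
)
best_ids = sorted(item_id for item_id, _, _ in best)
```

An LP formulation would express the same thing with 0/1 decision variables per item, a linear objective over item information, and linear constraints on test length and content coverage.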
Differential Validity and Utility of Successive and Simultaneous Approaches to the Development of Equivalent Achievement Tests in French and English
Described in this article are the first three activities of a research program designed to assess the differential validity and utility of successive and simultaneous approaches to developing equivalent achievement tests in French and English. Two teams of multilingual/multicultural French-English teachers used the simultaneous approach to develop 70 items each for mathematics and social studies at the grade 9 level. The evidence gained from the pilot study suggests that the issue of differential item performance attributable to translation differences is confounded by the presence of socioeconomic differences between the two groups of students. Consequently, the next activities of this research program will be directed toward disentangling these two issues to obtain a clearer view of the efficacy of the simultaneous method in reducing differential group performance and enhancing linguistic and cultural decentering.
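One common index for the differential item performance mentioned above is the Mantel-Haenszel common odds ratio (named here as a standard technique, not necessarily the statistic this study used). A minimal Python sketch, with hypothetical counts for one translated item stratified by total-score level:

```python
# Each stratum (a total-score level) holds a 2x2 table for one item:
# (right_ref, wrong_ref, right_focal, wrong_focal), where "ref" is the
# reference language group and "focal" is the comparison group.
strata = [
    (30, 20, 22, 28),   # low scorers
    (45, 15, 38, 22),   # middle scorers
    (50,  5, 44, 11),   # high scorers
]

# Mantel-Haenszel common odds ratio: sum of (right_ref * wrong_focal / n)
# over strata, divided by the sum of (wrong_ref * right_focal / n).
num = sum(r_ref * w_foc / (r_ref + w_ref + r_foc + w_foc)
          for r_ref, w_ref, r_foc, w_foc in strata)
den = sum(w_ref * r_foc / (r_ref + w_ref + r_foc + w_foc)
          for r_ref, w_ref, r_foc, w_foc in strata)
alpha_mh = num / den   # 1.0 means no DIF after matching on total score
```

Because the statistic conditions on total score, it separates item-level language effects from overall ability differences; disentangling those from socioeconomic differences, as the abstract notes, requires additional design work.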
Multiple-Choice Item Distractor Development Using Topic Modeling Approaches
Writing a high-quality multiple-choice test item is a complex process. Creating plausible but incorrect options for each item poses significant challenges for the content specialist because this task is often undertaken without a systematic method. In the current study, we describe and demonstrate a systematic method for creating plausible but incorrect options, also called distractors, based on students’ misconceptions. These misconceptions are extracted from labeled written responses. One thousand five hundred and fifteen written responses to an existing constructed-response item in Biology from Grade 10 students were used to demonstrate the method. Using latent Dirichlet allocation, a topic modeling procedure commonly used in machine learning and natural language processing, 22 plausible misconceptions were identified in students’ written responses and used to produce a list of plausible distractors. These distractors, in turn, were used as part of new multiple-choice items. Implications for item development are discussed.
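To make the topic-modeling step concrete, here is a minimal collapsed Gibbs sampler for latent Dirichlet allocation in pure Python, run on four toy "written responses" (hypothetical stand-ins for the 1,515 real answers; the study would have used far more data and a full implementation). The top words per topic are the raw material a content specialist would read as candidate misconception themes:

```python
import random
from collections import defaultdict

# Hypothetical tokenized student responses.
docs = [
    "plants breathe oxygen at night".split(),
    "plants use oxygen like animals".split(),
    "sunlight turns into food directly".split(),
    "sunlight becomes food without chlorophyll".split(),
]
K = 2                      # number of topics (misconception clusters)
ALPHA, BETA = 0.1, 0.01    # symmetric Dirichlet priors
vocab = sorted({w for d in docs for w in d})
V = len(vocab)

random.seed(1)
z = [[random.randrange(K) for _ in d] for d in docs]   # topic assignments
ndk = [[0] * K for _ in docs]                          # doc-topic counts
nkw = [defaultdict(int) for _ in range(K)]             # topic-word counts
nk = [0] * K
for d, doc in enumerate(docs):
    for i, w in enumerate(doc):
        t = z[d][i]
        ndk[d][t] += 1; nkw[t][w] += 1; nk[t] += 1

# Collapsed Gibbs sampling: resample each token's topic from
# p(z=k) proportional to (n_dk + alpha) * (n_kw + beta) / (n_k + V*beta).
for _ in range(200):
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            t = z[d][i]
            ndk[d][t] -= 1; nkw[t][w] -= 1; nk[t] -= 1
            weights = [(ndk[d][k] + ALPHA) * (nkw[k][w] + BETA) / (nk[k] + V * BETA)
                       for k in range(K)]
            t = random.choices(range(K), weights)[0]
            z[d][i] = t
            ndk[d][t] += 1; nkw[t][w] += 1; nk[t] += 1

# Top words per topic: candidate misconception themes / distractor seeds.
top_words = [sorted(vocab, key=lambda w: -nkw[k][w])[:3] for k in range(K)]
```

In practice one would use an established LDA implementation; the point of the sketch is the mapping from topics over wrong answers to distractor candidates.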
Validation of the conceptual research utilization scale: an application of the standards for educational and psychological testing in healthcare
Background: There is a lack of acceptable, reliable, and valid survey instruments to measure conceptual research utilization (CRU). In this study, we investigated the psychometric properties of a newly developed scale (the CRU Scale).
Methods: We used the Standards for Educational and Psychological Testing as a validation framework to assess four sources of validity evidence: content, response processes, internal structure, and relations to other variables. A panel of nine international research utilization experts performed a formal content validity assessment. To determine response process validity, we conducted a series of one-on-one scale administration sessions with 10 healthcare aides. Internal structure and relations-to-other-variables validity were examined using CRU Scale response data from a sample of 707 healthcare aides working in 30 urban Canadian nursing homes. Principal components analysis and confirmatory factor analyses were conducted to determine internal structure. Relations to other variables were examined using: (1) bivariate correlations; (2) change in mean values of CRU with increasing levels of other kinds of research utilization; and (3) multivariate linear regression.
Results: Content validity index scores for the five items ranged from 0.55 to 1.00. The principal components analysis predicted a 5-item, 1-factor model. This was inconsistent with the findings from the confirmatory factor analysis, which showed best fit for a 4-item, 1-factor model. Bivariate associations between CRU and other kinds of research utilization were statistically significant (p < 0.01) for the latent CRU Scale score and all five CRU items. The CRU Scale score was also a significant predictor of overall research utilization in multivariate linear regression.
Conclusions: The CRU Scale showed acceptable initial psychometric properties with respect to responses from healthcare aides in nursing homes. Based on our validity, reliability, and acceptability analyses, we recommend using a reduced (four-item) version of the CRU Scale to yield sound assessments of CRU by healthcare aides. Refinement to the wording of one item is also needed. Planned future research will include: latent scale scoring, identification of variables that predict and are outcomes to conceptual research use, and longitudinal work to determine CRU Scale sensitivity to change.
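The item-level content validity index reported above is conventionally computed as the proportion of expert panelists rating an item relevant (3 or 4 on a 4-point scale). A small Python sketch with hypothetical ratings from a nine-member panel (the study's actual ratings are not reproduced here):

```python
# Hypothetical relevance ratings (1-4 scale) for five scale items
# from nine expert panelists.
ratings = {
    "item1": [4, 4, 3, 4, 4, 3, 4, 4, 4],
    "item2": [3, 4, 4, 4, 3, 4, 4, 4, 4],
    "item3": [2, 3, 4, 2, 3, 2, 4, 3, 2],
    "item4": [4, 3, 4, 4, 4, 4, 3, 4, 4],
    "item5": [4, 4, 4, 4, 4, 4, 4, 4, 4],
}

def item_cvi(scores):
    """Item-level CVI: proportion of experts rating the item 3 or 4."""
    return sum(s >= 3 for s in scores) / len(scores)

cvi = {item: round(item_cvi(scores), 2) for item, scores in ratings.items()}
```

An item with a low CVI (like the hypothetical item3 here) is a candidate for rewording or removal, which parallels the reduced four-item scale the authors recommend.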
A Methodology for Multilingual Automatic Item Generation
Testing agencies require large numbers of high-quality items that are produced in a cost-effective and timely manner. Increasingly, these agencies also require items in different languages. In this paper we present a methodology for multilingual automatic item generation (AIG). AIG is the process of using item models to generate test items with the aid of computer technology. We describe a three-step AIG approach where, first, test development specialists identify the content that will be used for item generation. Next, the specialists create item models to specify the content in the assessment task that must be manipulated to produce new items. Finally, elements in the item model are manipulated with computer algorithms to produce new items. Language is added in the item model step to permit multilingual AIG. We illustrate our method by generating 360 English and 360 French medical education items. The importance of item banking in multilingual test development is also discussed.
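The three-step workflow can be sketched in Python with a deliberately tiny, hypothetical medical item model (the template text, elements, and counts here are illustrative, not the 360-item models from the paper): step 2 is the per-language template plus its manipulable elements, and step 3 is the algorithmic manipulation of those elements to generate items.

```python
from itertools import product

# Step 2: item model. One stem template per language, plus the elements
# that will be manipulated to produce new items.
templates = {
    "en": "A patient presents with {symptom}. What is the most likely diagnosis?",
    "fr": "Un patient présente {symptom}. Quel est le diagnostic le plus probable ?",
}
elements = {
    "en": {"symptom": ["chest pain", "shortness of breath"]},
    "fr": {"symptom": ["une douleur thoracique", "un essoufflement"]},
}

def generate_items(lang):
    """Step 3: manipulate the model's elements to produce new items."""
    tmpl, elems = templates[lang], elements[lang]
    return [tmpl.format(**dict(zip(elems, combo)))
            for combo in product(*elems.values())]

items = {lang: generate_items(lang) for lang in templates}
```

With more elements the item counts multiply combinatorially, which is how a single model yields hundreds of parallel items per language; adding a language means adding one template and one element set, as the abstract describes.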