24 research outputs found

Genetic algorithms for automated test assembly

Author: Verschoor Angela Jose
Publication venue: University of Twente
Publication date: 01/01/2007

University of Twente Research Information

A multiple objective test assembly approach for exposure control problems in computerized adaptive testing

Author: Eggen Theo J.H.M.
Veldkamp Bernard P.
Verschoor Angela J.
Publication venue: University of Valencia
Publication date: 01/01/2010

Overexposure and underexposure of items in the bank are serious problems in operational computerized adaptive testing (CAT) systems. These exposure problems might result in item compromise, or point to a waste of investment. The exposure control problem can be viewed as a test assembly problem with multiple objectives. Information in the test has to be maximized, item compromise has to be minimized, and pool usage has to be optimized. In this paper, a multiple objectives method is developed to deal with both types of exposure problems. In this method, exposure control parameters based on observed exposure rates are implemented as weights for the information in the item selection procedure. The method does not need time-consuming simulation studies, and it can be implemented conditional on ability level. The method is compared with the Sympson-Hetter method for exposure control, with the Progressive method, and with alpha-stratified testing. The results show that the method is successful in dealing with both kinds of exposure problems.

Directory of Open Access Journals

University of Twente Research Information

Improvement of Measurement Efficiency in Multistage Tests by Targeted Assignment

Author: Angela J. Verschoor
Stéphanie Berger
Theo J. H. M. Eggen
Urs Moser
Publication venue: Frontiers Media SA
Publication date: 01/01/2019

A good match between item difficulty and student ability ensures efficient measurement and prevents students from becoming discouraged or bored by test items that are too easy or too difficult. Targeted test designs consider ability-related background variables to assign students to matching test forms. However, these designs do not consider that students might significantly differ in ability within the resulting groups. In contrast, multistage test designs consider students' performance during test taking to route them to the most informative modules. Yet, multistage test designs usually include one starting module of moderate difficulty in the first stage, which does not account for differences in ability. In this paper, we investigated whether measurement efficiency can be improved by targeted multistage test designs that consider ability-related background information for a targeted assignment at the beginning of the test and performance during test taking for selecting matching test modules. By means of simulations, we compared the efficiency of the traditional targeted test design, the multistage test (MST) design, and the targeted multistage test (TMST) design for estimating student ability. Furthermore, we analyzed the extent to which the efficiency of the different designs depends on the correlation between the ability-related background variable and the true ability, students' ability level and their categorization into an ability group, and the length of the starting module. The results indicated that TMST designs were generally more efficient for estimating student ability than targeted test designs and MST designs, especially if the ability-related background variable correlated highly with, and thus was a good indicator of, students' true ability. Furthermore, TMST designs were particularly efficient in estimating abilities for low- and high-ability students within a given population. Finally, very long starting modules resulted in less efficient estimation of low and high abilities than shorter starting modules. However, this finding was more prominent for MST than for TMST designs. In conclusion, TMST designs are recommended for assessing students from a wide ability distribution if a reliable ability-related background variable is available.

Directory of Open Access Journals

Extracellular MRP8/14 is a regulator of β2 integrin-dependent neutrophil slow rolling and adhesion

Author: Bieber Stephanie
Bierschenk Susanne
Cao-Ehlker Xiao
Chavakis Triantafyllos
Chung Kyoung-Jin
Eggersmann Tanja K.
Gran Sandra
Heinig Kristina
Immler Roland
Koedel Uwe
Kurz Angela R. M.
Leanderson Tomas
McEver Rodger P.
Moser Markus
Nussbaum Claudia F.
Pruenster Monika
Rohwedder Ina
Roth Johannes
Sperandio Markus
Verschoor Admar
Vestweber Dietmar
Vogl Thomas
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2015

Myeloid-related proteins (MRPs) 8 and 14 are cytosolic proteins secreted from myeloid cells as proinflammatory mediators. Currently, the functional role of circulating extracellular MRP8/14 is unclear. Our present study identifies extracellular MRP8/14 as an autocrine player in the leukocyte adhesion cascade. We show that E-selectin-PSGL-1 interaction during neutrophil rolling triggers Mrp8/14 secretion. Released MRP8/14 in turn activates a TLR4-mediated, Rap1-GTPase-dependent pathway of rapid β2 integrin activation in neutrophils. This extracellular activation loop reduces leukocyte rolling velocity and stimulates adhesion. Thus, we identify Mrp8/14 and TLR4 as important modulators of the leukocyte recruitment cascade during inflammation in vivo.

Lund University Publications

Crossref

Open Access LMU

PubMed Central

Many Labs 2: Investigating Variation in Replicability Across Samples and Settings

Author: Adams Byron G
Adams Reginald B
Alper Sinan
Aveyard Mark
Axt Jordan R
Babalola Mayowa T
Bahník Štěpán
Batra Rishtee
Berkics Mihály
Bernstein Michael J
Berry Daniel R
Bialobrzeska Olga
Binan Evans Dami
Bocian Konrad
Brandt Mark J
Busching Robert
Cai Huajian
Cambier Fanny
Cantarero Katarzyna
Carmichael Cheryl L
Ceric Francisco
Chandler Jesse
Chang Jen-Ho
Chatard Armand
Chen Eva E
Cheong Winnee
Cicero David C
Coen Sharon
Coleman Jennifer A
Collisson Brian
Conway Morgan A
Corker Katherine S
Curran Paul G
Cushman Fiery
Dagona Zubairu K
Dalgar Ilker
Dalla Rosa Anna
Davis William E
de Bruijn Maaike
De Schutter Leander
de Vries Marieke
Devos Thierry
Diego Vega Luis
Dozo Nerisa
Doğulu Canay
Dukes Kristin Nicole
Dunham Yarrow
Durrheim Kevin
Ebersole Charles R
Edlund John E
Eller Anja
English Alexander Scott
Finck Carolyn
Frankowska Natalia
Freyre Miguel-Ángel
Friedman Mike
Galliani Elisa Maria
Gandi Joshua C
Ghoshal Tanuka
Giessner Steffen R
Gill Tripat
Gnambs Timo
González Roberto
Graham Jesse
Grahe Jon E
Grahek Ivan
Green Eva GT
Gómez Ángel
Hai Kakul
Haigh Matthew
Haines Elizabeth L
Hall Michael P
Hasselman Fred
Heffernan Marie E
Hicks Joshua A
Houdek Petr
Huntsinger Jeffrey R
Huynh Ho Phi
Ijzerman Hans
Inbar Yoel
Innes-Ker Åse H
Jiménez-Leal William
John Melissa-Sue
Joy-Gaba Jennifer A
Kamiloğlu Roza G
Kappes Heather Barry
Karabati Serdar
Karick Haruna
Keller Victor N
Kende Anna
Kervyn Nicolas
Klein Richard A
Knežević Goran
Kovacs Carrie
Krueger Lacy E
Kurapov German
Kurtz Jamie
Lakens Daniël
Lazarević Ljiljana B
Lee Nichols Austin
Levitan Carmel A
Lewis Jr. Neil A
Lins Samuel
Lipsey Nikolette P
Losee Joy E
Maassen Esther
Maitner Angela T
Malingumu Winfrida
Mallett Robyn K
Marotta Satia A
Mena-Pacheco Fernando
Međedović Janko
Milfont Taciano L
Morris Wendy L
Murphy Sean C
Myachykov Andriy
Neave Nick
Neijenhuijs Koen
Nelson Anthony J
Neto Félix
Nosek Brian A
Ocampo Aaron
Oikawa Haruka
Oikawa Masanori
Ong Elsie
Orosz Gábor
Osowiecka Malgorzata
O’Donnell Susan L
Packard Grant
Petrović Boban
Pilati Ronaldo
Pinter Brad
Podesta Lysandra
Pogge Gabrielle
Pollmann Monique MH
Pérez-Sánchez Rolando
Rutchick Abraham M
Rédei Anna Cabak
Saavedra Patricio
Saeri Alexander K
Salomon Erika
Schmidt Kathleen
Schönbrodt Felix D
Sekerdej Maciej B
Sirlopú David
Skorinko Jeanine LM
Smith Michael A
Smith-Castro Vanessa
Smolders Karin CHJ
Sobkow Agata
Sowden Walter
Spachtholz Philipp
Srivastava Manini
Steiner Troy G
Stouten Jeroen
Street Chris NH
Sundfelt Oskar K
Szeto Stephanie
Szumowska Ewa
Tang Andrew CW
Tanzer Norbert
Tear Morgan J
Theriault Jordan
Thomae Manuela
Torres David
Traczyk Jakub
Tybur Joshua M
Ujhelyi A
Ujhelyi Adrienn
van Aert Robbie CM
van Assen Marcel ALM
van der Hulst Marije
van Lange Paul AM
van ’t Veer Anna Elisabeth
Vaughn Leigh Ann
Verniers Catherine
Verschoor Mark
Vianello Michelangelo
Voermans Ingrid PJ
Vranka Marek A
Vásquez-Echeverría Alejandro
Vázquez Alexandra
Welch Cheryl
Wichman Aaron L
Williams Lisa A
Wood Michael
Woodzicka Julie A
Wronska Marta K
Young Liane
Zelenski John M
Zhijia Zeng
Publication venue: SAGE Publications
Publication date: 01/01/2018

We conducted preregistered replications of 28 classic and contemporary published findings, with protocols that were peer reviewed in advance, to examine variation in effect magnitudes across samples and settings. Each protocol was administered to approximately half of 125 samples that comprised 15,305 participants from 36 countries and territories. Using the conventional criterion of statistical significance (p < .05), we found that 15 (54%) of the replications provided evidence of a statistically significant effect in the same direction as the original finding. With a strict significance criterion (p < .0001), 14 (50%) of the replications still provided such evidence, a reflection of the extremely high-powered design. Seven (25%) of the replications yielded effect sizes larger than the original ones, and 21 (75%) yielded effect sizes smaller than the original ones. The median comparable Cohen’s ds were 0.60 for the original findings and 0.15 for the replications. The effect sizes were small (< 0.20) in 16 of the replications (57%), and 9 effects (32%) were in the direction opposite the direction of the original effect. Across settings, the Q statistic indicated significant heterogeneity in 11 (39%) of the replication effects, and most of those were among the findings with the largest overall effect sizes; only 1 effect that was near zero in the aggregate showed significant heterogeneity according to this measure. Only 1 effect had a tau value greater than .20, an indication of moderate heterogeneity. Eight others had tau values near or slightly above .10, an indication of slight heterogeneity. Moderation tests indicated that very little heterogeneity was attributable to the order in which the tasks were performed or whether the tasks were administered in lab versus online. Exploratory comparisons revealed little heterogeneity between Western, educated, industrialized, rich, and democratic (WEIRD) cultures and less WEIRD cultures (i.e., cultures with relatively high and low WEIRDness scores, respectively). Cumulatively, variability in the observed effect sizes was attributable more to the effect being studied than to the sample or setting in which it was studied.

Flexibility at the Price of Volatility: Concurrent Calibration in Multistage Tests in Practice Using a 2PL Model

Author: Berger Stéphanie
Helbling Laura Alexandra
Verschoor Angela
Publication venue: Frontiers Media SA
Publication date: 28/05/2021

Multistage test (MST) designs promise efficient student ability estimates, an indispensable asset for individual diagnostics in high-stakes educational assessments. In high-stakes testing, annually changing test forms are required because publicly known test items impair accurate student ability estimation, and items with poor model fit must be continually replaced to guarantee test quality. This requires a large and continually refreshed item pool as the basis for high-stakes MST. In practice, the calibration of newly developed items to feed annually changing tests is highly resource intensive. Piloting based on a representative sample of students is often not feasible, given that, for schools, participation in actual high-stakes assessments already requires considerable organizational effort. Hence, under practical constraints, the calibration of newly developed items may take place on the go in the form of a concurrent calibration in MST designs. Based on a simulation approach, this paper focuses on the performance of Rasch vs. 2PL modeling in retrieving item parameters when items are, for practical reasons, non-optimally placed in multistage tests. Overall, the results suggest that the 2PL model performs worse than the Rasch model in retrieving item parameters when there is non-optimal item assembly in the MST, especially in retrieving parameters at the margins. The higher flexibility of 2PL modeling, where item discrimination is allowed to vary, seems to come at the cost of increased volatility in parameter estimation. Although the overall bias may be modest, single items can be affected by severe biases when using a 2PL model for item calibration in the context of non-optimal item placement.

ZORA

On-the-fly calibration in computerized adaptive testing

Author: Berger Stéphanie
Kleintjes Frans
Moser Urs
Verschoor Angela
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2019

Research on Computerized Adaptive Testing (CAT) has been rooted in a long tradition. Yet, current operational requirements for CATs make the production a relatively expensive and time-consuming process. Item pools need a large number of items, each calibrated with a large degree of accuracy. Using on-the-fly calibration might be the answer to reduce the operational demands for the production of a CAT. As calibration is to take place in real time, a fast and simple calibration method is needed. Three methods will be considered: Elo chess ratings, Joint Maximum Likelihood (JML), and Marginal Maximum Likelihood (MML). MML is the most time-consuming method, but the only known method to give unbiased parameter estimates when calibrating CAT data. JML gives biased estimates although it is faster than MML, while the updating of Elo ratings is known to be even less time-consuming. Although JML would meet operational requirements for running a CAT regarding computational performance, its bias and its inability to estimate parameters for perfect or zero scores make it unsuitable as a strategy for on-the-fly calibration. In this chapter, we propose a combination of Elo rating and JML as a strategy that meets the operational requirements for running a CAT. The Elo rating is used in the very beginning of the administration to ensure that the JML estimation procedure is converging for all answer patterns. Modelling the bias in JML with the help of a relatively small, but representative set of calibrated items is proposed to eliminate the bias.

ZORA

Infeasibility in automatic test assembly models: a comparison study of different methods

Author: Huitzing Hiddo A.
Veldkamp Bernard P.
Verschoor A.J.
Verschoor Angela J.
Publication venue: Wiley
Publication date: 01/01/2005

Several techniques exist to automatically put together a test meeting a number of specifications. In an item bank, the items are stored with their characteristics. A test is constructed by selecting a set of items that fulfills the specifications set by the test assembler. Test assembly problems are often formulated in terms of a model consisting of restrictions and an objective to be maximized or minimized. A problem arises when it is impossible to construct a test from the item pool that meets all specifications, that is, when the model is not feasible. Several methods exist to handle these infeasibility problems. In this article, test assembly models resulting from two practical testing programs were reconstructed to be infeasible. These models were analyzed using methods that forced a solution (Goal Programming, Multiple-Goal Programming, Greedy Heuristic), that analyzed the causes (Relaxed and Ordered Deletion Algorithm (RODA), Integer Randomized Deletion Algorithm (IRDA), Set Covering (SC), and Item Sampling), or that analyzed the causes and used this information to force a solution (Irreducible Infeasible Set-Solver). Specialized methods such as the IRDA and the Irreducible Infeasible Set-Solver performed best. Recommendations about the use of different methods are given.

University of Twente Research Information