You are what you click: Using machine learning to model trace data for psychometric measurement
Assessment trace data, such as mouse positions and their timing, offer interesting and provocative reflections of individual differences yet are currently underutilized by testing professionals. In this article, we present a 10-step procedure to maximize the probability that a trace data modeling project will be successful: 1) grounding the project in psychometric theory, 2) building technical infrastructure to collect trace data, 3) designing a useful developmental validation study, 4) using a holdout validation approach with collected data, 5) using exploratory analysis to conduct meaningful feature engineering, 6) identifying useful machine learning algorithms to predict a thoughtfully chosen criterion, 7) engineering a machine learning model with meaningful internal cross-validation and hyperparameter selection, 8) conducting model diagnostics to assess whether the resulting model is overfitted, underfitted, or within acceptable tolerance, 9) testing the success of the final model in meeting conceptual, technical, and psychometric goals, and 10) if the model is deemed successful, engineering its trace data predictions into decision-making systems. We present this framework within the broader view of psychometrics, exploring the challenges of developing psychometrically valid models using such complex data, whose trait signals are much weaker than those assessment developers have typically attempted to model.
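Steps 4 through 8 of the procedure above can be sketched in code. This is a minimal illustration, not the authors' pipeline: the simulated trace-data features, the ridge model, and the alpha grid are all assumptions chosen to make the workflow concrete.

```python
# Illustrative sketch of steps 4-8: holdout validation, internal k-fold
# cross-validation for hyperparameter selection, and overfit diagnostics.
# The features, criterion, and ridge model are stand-in assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for engineered trace-data features (step 5), e.g. cursor speed,
# hover time, click latency; y is a thoughtfully chosen criterion (step 6).
X = rng.normal(size=(400, 3))
y = X @ np.array([0.6, -0.4, 0.2]) + rng.normal(scale=0.5, size=400)

def ridge_fit(X, y, alpha):
    """Closed-form ridge regression (one simple ML algorithm choice)."""
    XtX = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(XtX, X.T @ y)

def r2(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1 - ss_res / ss_tot

# Step 4: hold out a test set before any modeling decisions are made.
idx = rng.permutation(len(y))
train, test = idx[:300], idx[300:]

# Step 7: internal k-fold cross-validation on the training set only,
# used here to select the ridge penalty (the model's hyperparameter).
def cv_score(alpha, k=5):
    folds = np.array_split(train, k)
    scores = []
    for i in range(k):
        val = folds[i]
        fit = np.concatenate(folds[:i] + folds[i + 1:])
        w = ridge_fit(X[fit], y[fit], alpha)
        scores.append(r2(y[val], X[val] @ w))
    return np.mean(scores)

best_alpha = max([0.01, 0.1, 1.0, 10.0], key=cv_score)
w = ridge_fit(X[train], y[train], best_alpha)

# Step 8: diagnostics -- a large train/test gap suggests overfitting,
# while low scores on both suggest underfitting.
train_r2 = r2(y[train], X[train] @ w)
test_r2 = r2(y[test], X[test] @ w)
print(f"alpha={best_alpha}, train R2={train_r2:.2f}, test R2={test_r2:.2f}")
```

The key design point mirrored here is the separation of concerns: the holdout set (step 4) is never consulted until the diagnostic stage (step 8), so hyperparameter selection (step 7) relies only on internal cross-validation within the training data.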
Theory-driven Game-based Assessment of General Cognitive Ability: Design Theory, Measurement, Prediction of Performance, and Test Fairness
Games, which can be defined as an externally structured, goal-directed type of play, are increasingly used in high-stakes testing contexts to measure targeted constructs for use in the selection and promotion of employees. Despite this increasing popularity, little is known about how theory-driven game-based assessments (GBAs), those designed to reflect a targeted construct, should be designed, or about their potential to achieve the simultaneous goals of positive reactions and high-quality psychometric measurement. In the present research, we develop a theory of GBA design by integrating game design and development theory from human-computer interaction with psychometric theory. Next, we test the measurement characteristics, prediction of performance, fairness, and reactions of a GBA designed according to this theory to measure latent general intelligence (g). Using an academic sample with GPA data (N=633), we demonstrate convergence between latent GBA performance and g (latent correlation = .97). Adding an organizational sample with supervisory ratings of job performance (N=49), we show GBA prediction of both GPA (r=.16) and supervisory ratings (r=.29). We also show incremental prediction of GPA by unit-weighted composites of the g test battery beyond that of the g-GBA battery, but not the reverse. We further show similar adverse impact for both the traditional test battery and the GBA, but no differential prediction of criteria. Reactions were more positive across all measures for the g-GBA compared to the traditional test battery. Overall, results support GBA design theory as a promising foundation from which to build high-quality theory-driven GBAs.
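The incremental-prediction comparison described in this abstract amounts to comparing nested regressions: fit the criterion on one predictor set, then add the other set and check the gain in R². The sketch below illustrates only that logic; all data are simulated, and the variable names (g_battery, gba, gpa) are illustrative assumptions, not the study's data.

```python
# Hedged sketch of an incremental-prediction (incremental validity) check:
# compare R^2 of a model using one predictor against a nested model that
# adds a second predictor. Simulated data; not the study's actual values.
import numpy as np

rng = np.random.default_rng(1)
n = 633  # matches the academic sample size reported above

g_battery = rng.normal(size=n)                          # traditional battery composite
gba = 0.9 * g_battery + rng.normal(scale=0.3, size=n)   # GBA converging with g
gpa = 0.4 * g_battery + rng.normal(scale=0.9, size=n)   # criterion

def r2(predictors, y):
    """In-sample R^2 from OLS with an intercept."""
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

r2_gba = r2([gba], gpa)                   # GBA alone
r2_both = r2([gba, g_battery], gpa)       # GBA plus traditional battery
delta = r2_both - r2_gba                  # incremental R^2 of the battery
print(f"GBA alone: {r2_gba:.3f}, +battery: {r2_both:.3f}, delta: {delta:.3f}")
```

A nonzero delta in one direction but not the other (battery beyond GBA, but not GBA beyond battery) is the asymmetry the abstract reports.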