
Disentangling correlation between speed and ability at the subject level and between intensity and difficulty at the item level from psycholinguistic data: a joint modeling approach

Abstract

In psycholinguistic experiments, multiple subjects are presented with multiple test items. Clark (1973) argued that it may be incorrect to average reaction times from such experiments over items for each subject and over subjects for each item, use these means in ANOVA models (yielding the so-called F1 and F2 statistics), and draw inference from the two statistics separately. Nevertheless, the vast majority of psycholinguistic results published over the last decades have relied on such techniques. Baayen et al. (2008) explained in detail a mixed-effects modeling approach with crossed random effects for subjects and items. In addition to reaction times, the psycholinguistic literature often reports accuracy as well. Accuracy is then typically summarized in simple frequency tables of the binary outcomes (i.e., correct or incorrect response) measured for each subject-item combination. To improve on this and to allow estimation of covariate effects on accuracy, Jaeger (2008) introduced into the psycholinguistic literature a model for the probability of a correct answer. More specifically, he proposed a mixed logistic regression model that allows for crossed random subject and item effects, along the lines of Baayen et al. (2008). Unfortunately, reaction times and accuracy are most often described separately, without any concern being raised about their correlation. It is important to better understand the correlation, if any, between reaction times and accuracy. The natural next step is therefore to consider a joint model for reaction time and accuracy. Joint modeling of these two outcomes is most easily performed in a hierarchical framework. Van der Linden (2007) proposed such a framework, consisting of an item-response theory model, a model for the response time distribution, and a higher-level structure accounting for the dependencies between the item and subject parameters in these models.
His hierarchical framework is very flexible in that any item-response or response time model can be substituted. Building on Van der Linden's work, we first provide a framework that combines the models introduced in the psycholinguistic literature by Baayen et al. (2008) and Jaeger (2008), treats subjects and items as random, and allows for correlation between reaction time and accuracy. The main advantage of this framework is its ability to disentangle correlation driven by subjects from correlation driven by items. Estimation of the model parameters in the joint model and model checking are performed in a Bayesian approach with Markov chain Monte Carlo (MCMC). The performance of the proposed methodology is illustrated with a real-data example.
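
To make the structure described above concrete, the following is a minimal sketch of how a joint model of this kind could be written down. It is an illustration under stated assumptions, not necessarily the authors' exact specification: the symbols ($\theta_j$, $\tau_j$, $b_i$, $\beta_i$, $\boldsymbol{\alpha}$, $\boldsymbol{\gamma}$, $\Sigma_S$, $\Sigma_I$), the log transform of the reaction time, the sign conventions, and the normality assumptions are all chosen here for illustration, loosely following the hierarchical framework of Van der Linden (2007) and the crossed random effects models of Baayen et al. (2008) and Jaeger (2008).

```latex
% Illustrative notation (assumed, not taken from the abstract):
%   j indexes subjects, i indexes items
%   Y_{ij} = 1 if subject j answers item i correctly, 0 otherwise
%   T_{ij} = reaction time of subject j on item i
%   x_{ij} = covariate vector; \alpha, \gamma = fixed-effect coefficients

% Accuracy: mixed logistic regression with crossed random effects
\operatorname{logit} \Pr(Y_{ij} = 1)
  = \mathbf{x}_{ij}^{\top}\boldsymbol{\alpha} + \theta_j - b_i

% Reaction time: linear mixed model on the log scale
\log T_{ij}
  = \mathbf{x}_{ij}^{\top}\boldsymbol{\gamma} + \beta_i - \tau_j + \varepsilon_{ij},
  \qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)

% Subject level: ability \theta_j and speed \tau_j may be correlated
\begin{pmatrix} \theta_j \\ \tau_j \end{pmatrix}
  \sim \mathcal{N}\!\left( \mathbf{0},\;
  \Sigma_S =
  \begin{pmatrix}
    \sigma_\theta^2 & \rho_S \sigma_\theta \sigma_\tau \\
    \rho_S \sigma_\theta \sigma_\tau & \sigma_\tau^2
  \end{pmatrix} \right)

% Item level: difficulty b_i and time intensity \beta_i may be correlated
\begin{pmatrix} b_i \\ \beta_i \end{pmatrix}
  \sim \mathcal{N}\!\left( \mathbf{0},\;
  \Sigma_I =
  \begin{pmatrix}
    \sigma_b^2 & \rho_I \sigma_b \sigma_\beta \\
    \rho_I \sigma_b \sigma_\beta & \sigma_\beta^2
  \end{pmatrix} \right)
```

In this sketch, $\rho_S$ captures the correlation between speed and ability across subjects and $\rho_I$ the correlation between time intensity and difficulty across items, so the two sources of dependence between reaction time and accuracy are represented by separate parameters.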
