
University of Strathclyde Institutional Repository

Aberdeen University Research

Exploring differential item functioning in the Western Ontario and McMaster Universities osteoarthritis index (WOMAC)

Author: AJ Dallmeijer
AJ Perkins
AM Davis
BD Zumbo
Beth Pollard
C Lewis
CA McHorney
DA Rothenfluh
Diane Dixon
DJ Cooke
F Wolfe
H Swaminathan
J Eachus
JA Teresi
JH Kellgren
LD Roorda
Marie Johnston
N Bellamy
NW Scott
P Juni
P Kersten
PK Crane
RH Osborne
RK Hambleton
RO Anderson
SP McKenna
T Brockow
T Covic
T Nijsten
U Lorenzo-Seva
WF Velicer
WJ Taylor
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

Background: The Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) is a widely used patient-reported outcome in osteoarthritis. An important, but frequently overlooked, aspect of validating health outcome measures is to establish if items exhibit differential item functioning (DIF). That is, if respondents have the same underlying level of an attribute, does the item give the same score in different subgroups or is it biased towards one subgroup or another? The aim of the study was to explore DIF in the Likert format WOMAC for the first time in a UK osteoarthritis population with respect to demographic, social, clinical and psychological factors. Methods: The sample comprised a community sample of 763 people with osteoarthritis who participated in the Somerset and Avon Survey of Health. The WOMAC was explored for DIF by gender, age, social deprivation, social class, employment status, distress, body mass index and clinical factors. Ordinal regression models were used to identify DIF items. Results: After adjusting for age, two items were identified for the physical functioning subscale as having DIF with age identified as the DIF factor for 2 items, gender for 1 item and body mass index for 1 item. For the WOMAC pain subscale, for people with hip osteoarthritis one item was identified with age-related DIF. The impact of the DIF items rarely had a significant effect on the conclusions of group comparisons. Conclusions: Overall, the WOMAC performed well with only a small number of DIF items identified. However, as DIF items were identified for the WOMAC physical functioning subscale, it would be advisable to analyse data taking into account the possible impact of the DIF items when weight, gender or, especially, age effects are the focus of interest in UK-based osteoarthritis studies. 
Similarly, for the WOMAC pain subscale in people with hip osteoarthritis, it would be worthwhile to analyse data taking into account the possible impact of the DIF item when age comparisons are of primary interest.
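
The DIF question posed in this abstract (do respondents with the same underlying level score differently on an item depending on their subgroup?) can be illustrated with a small simulation. The sketch below is not the ordinal regression procedure the study used; it swaps in a simpler standardization check that compares mean item scores between groups within strata of the rest score. The thresholds, sample size, group labels and size of the DIF shift are all hypothetical values chosen for illustration.

```python
import math
import random
from collections import defaultdict

THRESHOLDS = [-1.5, -0.5, 0.5, 1.5]  # hypothetical cut points giving scores 0-4

def likert_score(theta, shift, rng):
    """Draw a 0-4 score from a cumulative-logit model; shift < 0 lowers scores."""
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    latent = theta + shift + math.log(u / (1.0 - u))  # logistic noise
    return sum(latent > t for t in THRESHOLDS)

def simulate(n_people, n_items, dif_item, dif_shift, seed):
    """Simulate two groups with the same trait distribution and one DIF item."""
    rng = random.Random(seed)
    rows = []
    for person in range(n_people):
        group = person % 2            # 0 = reference, 1 = focal (hypothetical labels)
        theta = rng.gauss(0.0, 1.0)   # underlying attribute level
        scores = [
            likert_score(theta, dif_shift if (i == dif_item and group == 1) else 0.0, rng)
            for i in range(n_items)
        ]
        rows.append((group, scores))
    return rows

def standardized_difference(rows, item):
    """Focal-minus-reference mean item score, averaged over rest-score strata."""
    strata = defaultdict(lambda: ([], []))
    for group, scores in rows:
        rest = sum(scores) - scores[item]  # matching variable excludes the studied item
        strata[rest][group].append(scores[item])
    num = den = 0.0
    for ref, focal in strata.values():
        if ref and focal:                  # a stratum needs both groups to be compared
            diff = sum(focal) / len(focal) - sum(ref) / len(ref)
            num += len(focal) * diff       # weight each stratum by focal-group size
            den += len(focal)
    return num / den

rows = simulate(n_people=4000, n_items=6, dif_item=0, dif_shift=-1.0, seed=7)
smd = [standardized_difference(rows, i) for i in range(6)]
print([round(x, 2) for x in smd])
```

In this simulated setting only item 0 is built to behave differently for the focal group, so its standardized difference should stand out from the others. A regression-based analysis such as the one described in the abstract would additionally provide formal tests and adjust for several factors at once.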

University of Strathclyde Institutional Repository

Directory of Open Access Journals

Observing response processes with eye tracking in international large-scale assessments: evidence from the OECD PIAAC assessment

Author: A Glenberg
Andrew P. Bayliss
BD Zumbo
Bryan Maddox
EF Risko
EJ Paulson
F Goldhammer
Francesca Borgonovi
G Doherty-Sneddon
J Beatty
Kennedy
M Lai
OJ Solheim
Paul E. Engelhardt
Piers Fleming
R Bixler
RH Tai
S D’Mello
S Liversedge
S. Gareth Edwards
TL Varao-Sousa
Y Hu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 29/05/2018

This paper reports on a pilot study that used eye tracking techniques to make detailed observations of item response processes in the OECD Programme for the International Assessment of Adult Competencies (PIAAC). The lab-based study also recorded physiological responses using measures of pupil diameter and electrodermal activity. The study tested 14 adult respondents as they individually completed the PIAAC computer-based assessment. The eye tracking observations help to fill an ‘explanatory gap’ by providing data on variation in item response processes that are not captured by other sources of process data such as think-aloud protocols or computer-generated log files. The data on fixations and saccades provided detailed information on test item response strategies, enabling profiling of respondent engagement and response processes associated with successful performance. Much of that activity does not include the use of the keyboard and mouse, and involves ‘off-screen’ use of pen and paper (and calculator) that are not captured by assessment log files. In conclusion, this paper points toward an important application of eye tracking in large-scale assessments. This includes insights into response processes in new domains such as adaptive problem-solving that aim to identify individuals’ ability to select and combine resources from the digital and physical environment.

University of East Anglia digital repository

University of Queensland eSpace

Comparison of the sensitivity of the UKCAT and A levels to sociodemographic characteristics: a national study

Author: A Smithers
AI Rothman
B McKinstry
B Wu
BD Zumbo
CD Kreiter
D James
H Coates
I Haq
I McManus
J Adams
J Guthke
J Mathers
J Tobin
J Yates
JF Beckmann
JM Davidson
John C McLachlan
JP Rushton
Lisa Webster
M Komaromy
NG Dewhurst
PA Tiffin
Paul A Tiffin
RG Miller Jr
S Schwartz
Sandra Nicholson
SR Wright
WC McGaghie
Publication venue: BioMed Central
Publication date: 08/01/2014

Background: The UK Clinical Aptitude Test (UKCAT) was introduced to facilitate widening participation in medical and dental education in the UK by providing universities with a continuous variable to aid selection; one that might be less sensitive to the sociodemographic background of candidates compared to traditional measures of educational attainment. Initial research suggested that males, candidates from more advantaged socioeconomic backgrounds and those who attended independent or grammar schools performed better on the test. The introduction of the A* grade at A level permits more detailed analysis of the relationship between UKCAT scores, secondary educational attainment and sociodemographic variables. Thus, our aim was to further assess whether the UKCAT is likely to add incremental value over A level (predicted or actual) attainment in the selection process. Methods: Data relating to UKCAT and A level performance from 8,180 candidates applying to medicine in 2009 who had complete information relating to six key sociodemographic variables were analysed. A series of regression analyses were conducted in order to evaluate the ability of sociodemographic status to predict performance on two outcome measures: A level ‘best of three’ tariff score; and the UKCAT scores. Results: In this sample A level attainment was independently and positively predicted by four sociodemographic variables (independent/grammar schooling, White ethnicity, age and professional social class background). These variables also independently and positively predicted UKCAT scores. There was a suggestion that UKCAT scores were less sensitive to educational background compared to A level attainment. In contrast to A level attainment, UKCAT score was independently and positively predicted by having English as a first language and male sex. 
Conclusions: Our findings are consistent with a previous report; most of the sociodemographic factors that predict A level attainment also predict UKCAT performance. However, compared to A levels, males and those speaking English as a first language perform better on UKCAT. Our findings suggest that UKCAT scores may be more influenced by sex and less sensitive to school type compared to A levels. These factors must be considered by institutions utilising the UKCAT as a component of the medical and dental school selection process.

Durham Research Online

CLoK

Queen Mary Research Online

Development and preliminary validation of a questionnaire to measure satisfaction with home care in Greece: an exploratory factor analysis of polychoric correlations

Author: A Hendriks
A Merkouris
A Netten
A Roscino
A Scott
A Wilson
AB Costello
AE Norris
Anna Nicolaou
Arsenis Kostarelis
B Rapkin
BD Zumbo
BL Westra
CA Woodward
CO Long
DA Forbes
DE Mylod
Dimitris Niakas
DL Streiner
E Wouters
EJ Porter
FP Holgado
G Samuelsson
G Willis
H Jayasekara
J Capitman
J Francis
J Labarere
JB Bjorner
JC Nunnally
JE Ware
JH Steiger
K Jones
KG Jöreskog
KH Dansky
KO McGraw
L Woodruff
LJ Cronbach
LR Fabrigar
M Bear
M Westaway
M Yen
MA Pett
Maria Tsitouridou
Ministry of Economy & Finance and Ministry of Health & Welfare
Ministry of Employment & Social Protection
Ministry of Health and Social Solidarity
Ministry of the Interior Public Administration and Decentralisation Ministry of Employment & Social Protection and Ministry of Health & Welfare
P Kline
PG Edebalk
PJ Reeder
R Laferriere
R Muenz
S Ferketich
S Nedjat
SL Fiebelkorn
SM Geron
V Raftopoulos
Vassilis H Aletras
VH Aletras
VW Barrett
WJ Krowinski
World Health Organization
Z Birhanu
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: The primary aim of this study was to develop and psychometrically test a Greek-language instrument for measuring satisfaction with home care. The first empirical evidence about the level of satisfaction with these services in Greece is also provided. Methods: The questionnaire resulted from literature search, on-site observation and cognitive interviews. It was applied in 2006 to a sample of 201 enrollees of five home care programs in the city of Thessaloniki and contains 31 items that measure satisfaction with individual service attributes and are expressed on a 5-point Likert scale. The latter has usually been treated in practice as an interval scale, although it is in principle ordinal. We thus treated the variable as an ordinal one, but also employed the traditional approach in order to compare the findings. Our analysis was therefore based on ordinal measures such as the polychoric correlation, Kendall's Tau b coefficient and ordinal Cronbach's alpha. Exploratory factor analysis was followed by an assessment of internal consistency reliability, test-retest reliability, construct validity and sensitivity. Results: Analyses with ordinal and interval scale measures produced in essence very similar results and identified four multi-item scales. Three of these were found to be reliable and valid: socioeconomic change, staff skills and attitudes and service appropriateness. A fourth dimension (service planning) had lower internal consistency reliability and yet very satisfactory test-retest reliability, construct validity and floor and ceiling effects. The global satisfaction scale created was also quite reliable. Overall, participants were satisfied, yet not very satisfied, with home care services. More room for improvement seems to exist for the socio-economic and planning aspects of care and less for staff skills and attitudes and appropriateness of provided services. 
Conclusions: The methods developed seem to be a promising tool for the measurement of home care satisfaction in Greece.

Directory of Open Access Journals

Computerized adaptive testing of population psychological distress : simulation-based evaluation of GHQ-30

Author: BD Zumbo
BF Green
C Cooper
C Geiser
D Hooper
D Magis
DM Dimitrov
DP Goldberg
Ed Leeuw
FA Huppert
H Fliege
J Blair
J Hartig
J Walker
JR Böhnke
L Hu
M Hankins
M Romppel
MR Robling
OB Walter
RD Gibbons
RP Chalmers
RP McDonald
S Choi
S Pohl
S Ye
SP Reise
SW Choi
W-C Wang
WH Emons
WR Dillon
Y Rosseel
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

PURPOSE: Goldberg's General Health Questionnaire (GHQ) items are frequently used to assess psychological distress but no study to date has investigated the GHQ-30's potential for adaptive administration. In computerized adaptive testing (CAT) items are matched optimally to the targeted distress level of respondents instead of relying on fixed-length versions of instruments. We therefore calibrate GHQ-30 items and report a simulation study exploring the potential of this instrument for adaptive administration in a longitudinal setting. METHODS: GHQ-30 responses of 3445 participants with 2 completed assessments (baseline, 7-year follow-up) in the UK Health and Lifestyle Survey were calibrated using item response theory. Our simulation study evaluated the efficiency of CAT administration of the items, cross-sectionally and longitudinally, with different estimators, item selection methods, and measurement precision criteria. RESULTS: To yield accurate distress measurements (marginal reliability at least 0.90) nearly all GHQ-30 items need to be administered to most survey respondents in general population samples. When lower accuracy is permissible (marginal reliability of 0.80), adaptive administration saves approximately two-thirds of the items. For longitudinal applications, change scores based on the complete set of GHQ-30 items correlate highly with change scores from adaptive administrations. CONCLUSIONS: The rationale for CAT-GHQ-30 is only supported when the required marginal reliability is lower than 0.90, which is most likely to be the case in cross-sectional and longitudinal studies assessing mean changes in populations. Precise measurement of psychological distress at the individual level can be achieved, but requires the deployment of all 30 items.
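
The adaptive administration described in this abstract can be sketched as a loop that picks the most informative remaining item at the current distress estimate and stops once a precision target is met. The sketch below is an illustration under stated assumptions, not the study's simulation: it uses a two-parameter logistic model with made-up item parameters rather than calibrated GHQ-30 values, an expected a posteriori (EAP) estimator with a standard normal prior, maximum-information item selection, and the common approximation that a reliability target r corresponds to a standard error of sqrt(1 - r) when the latent variance is 1.

```python
import math
import random

def p_endorse(theta, a, b):
    """2PL probability of endorsing an item with discrimination a and location b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)

GRID = [-4.0 + 0.1 * k for k in range(81)]  # quadrature points for theta

def eap(responses, bank):
    """Posterior mean and SD of theta on the grid, with a standard normal prior."""
    weights = []
    for t in GRID:
        w = math.exp(-0.5 * t * t)
        for item, x in responses.items():
            p = p_endorse(t, *bank[item])
            w *= p if x == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)

def run_cat(bank, answers, target_reliability):
    """Administer items until the posterior SD reaches the target or the bank is used up."""
    se_stop = math.sqrt(1.0 - target_reliability)  # assumes unit latent variance
    responses = {}
    theta, se = eap(responses, bank)
    while se > se_stop and len(responses) < len(bank):
        remaining = [i for i in range(len(bank)) if i not in responses]
        nxt = max(remaining, key=lambda i: item_information(theta, *bank[i]))
        responses[nxt] = answers[nxt]
        theta, se = eap(responses, bank)
    return theta, se, len(responses)

# Hypothetical 30-item bank and one simulated respondent (not GHQ-30 calibrations).
rng = random.Random(1)
bank = [(rng.uniform(1.0, 2.5), rng.uniform(-0.5, 2.5)) for _ in range(30)]
true_theta = 0.5
answers = [1 if rng.random() < p_endorse(true_theta, a, b) else 0 for a, b in bank]

theta80, se80, n80 = run_cat(bank, answers, 0.80)
theta90, se90, n90 = run_cat(bank, answers, 0.90)
print(n80, n90)
```

Because the respondent's answers are fixed in advance, both runs follow the same item sequence and the looser 0.80 target can only stop at the same point or earlier than the 0.90 target. How many items are saved depends entirely on the item parameters, so the counts printed here say nothing about the GHQ-30 itself.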