138 research outputs found

    Exploring differential item functioning in the SF-36 by demographic, clinical, psychological and social factors in an osteoarthritis population

    The SF-36 is a very commonly used generic measure of health outcome in osteoarthritis (OA). An important, but frequently overlooked, aspect of validating health outcome measures is to establish if items work in the same way across subgroups of a population. That is, if respondents have the same 'true' level of outcome, does the item give the same score in different subgroups or is it biased towards one subgroup or another? Differential item functioning (DIF) can identify items that may be biased for one group or another and has been applied to measuring patient reported outcomes. Items may show DIF for different conditions and between cultures; however, the SF-36 has not been specifically examined in an osteoarthritis population nor in a UK population. Hence, the aim of the study was to apply the DIF method to the SF-36 for a UK OA population. The sample comprised a community sample of 763 people with OA who participated in the Somerset and Avon Survey of Health. The SF-36 was explored for DIF with respect to demographic, social, clinical and psychological factors. Well-developed ordinal regression models were used to identify DIF items. Results: DIF items were found by age (6 items), employment status (6 items), social class (2 items), mood (2 items), hip versus knee (2 items), social deprivation (1 item) and body mass index (1 item). Although the impact of the DIF items rarely had a significant effect on the conclusions of group comparisons, in most cases there was a significant change in effect size. Overall, the SF-36 performed well with only a small number of DIF items identified, a reassuring finding in view of the frequent use of the SF-36 in OA. Nevertheless, where DIF items were identified it would be advisable to analyse data taking account of DIF items, especially when age effects are the focus of interest.

    Exploring differential item functioning in the Western Ontario and McMaster Universities osteoarthritis index (WOMAC)

    Background: The Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) is a widely used patient reported outcome in osteoarthritis. An important, but frequently overlooked, aspect of validating health outcome measures is to establish if items exhibit differential item functioning (DIF). That is, if respondents have the same underlying level of an attribute, does the item give the same score in different subgroups or is it biased towards one subgroup or another? The aim of the study was to explore DIF in the Likert format WOMAC for the first time in a UK osteoarthritis population with respect to demographic, social, clinical and psychological factors. Methods: The sample comprised a community sample of 763 people with osteoarthritis who participated in the Somerset and Avon Survey of Health. The WOMAC was explored for DIF by gender, age, social deprivation, social class, employment status, distress, body mass index and clinical factors. Ordinal regression models were used to identify DIF items. Results: After adjusting for age, two items were identified for the physical functioning subscale as having DIF with age identified as the DIF factor for 2 items, gender for 1 item and body mass index for 1 item. For the WOMAC pain subscale, for people with hip osteoarthritis one item was identified with age-related DIF. The impact of the DIF items rarely had a significant effect on the conclusions of group comparisons. Conclusions: Overall, the WOMAC performed well with only a small number of DIF items identified. However, as DIF items were identified for the WOMAC physical functioning subscale it would be advisable to analyse data taking into account the possible impact of the DIF items when weight, gender or, especially, age effects are the focus of interest in UK-based osteoarthritis studies. Similarly for the WOMAC pain subscale in people with hip osteoarthritis it would be worthwhile to analyse data taking into account the possible impact of the DIF item when age comparisons are of primary interest.

    Observing response processes with eye tracking in international large-scale assessments: evidence from the OECD PIAAC assessment

    This paper reports on a pilot study that used eye tracking techniques to make detailed observations of item response processes in the OECD Programme for the International Assessment of Adult Competencies (PIAAC). The lab-based study also recorded physiological responses using measures of pupil diameter and electrodermal activity. The study tested 14 adult respondents as they individually completed the PIAAC computer-based assessment. The eye tracking observations help to fill an ‘explanatory gap’ by providing data on variation in item response processes that are not captured by other sources of process data such as think-aloud protocols or computer-generated log files. The data on fixations and saccades provided detailed information on test item response strategies, enabling profiling of respondent engagement and response processes associated with successful performance. Much of that activity does not include the use of the keyboard and mouse, and involves ‘off-screen’ use of pen and paper (and calculator) that are not captured by assessment log files. In conclusion, this paper points toward an important application of eye tracking in large-scale assessments. This includes insights into response processes in new domains such as adaptive problem-solving that aim to identify individuals’ ability to select and combine resources from the digital and physical environment.

    Comparison of the sensitivity of the UKCAT and A levels to sociodemographic characteristics: a national study

    Background: The UK Clinical Aptitude Test (UKCAT) was introduced to facilitate widening participation in medical and dental education in the UK by providing universities with a continuous variable to aid selection; one that might be less sensitive to the sociodemographic background of candidates compared to traditional measures of educational attainment. Initial research suggested that males, candidates from more advantaged socioeconomic backgrounds and those who attended independent or grammar schools performed better on the test. The introduction of the A* grade at A level permits more detailed analysis of the relationship between UKCAT scores, secondary educational attainment and sociodemographic variables. Thus, our aim was to further assess whether the UKCAT is likely to add incremental value over A level (predicted or actual) attainment in the selection process. Methods: Data relating to UKCAT and A level performance from 8,180 candidates applying to medicine in 2009 who had complete information relating to six key sociodemographic variables were analysed. A series of regression analyses were conducted in order to evaluate the ability of sociodemographic status to predict performance on two outcome measures: A level ‘best of three’ tariff score; and the UKCAT scores. Results: In this sample A level attainment was independently and positively predicted by four sociodemographic variables (independent/grammar schooling, White ethnicity, age and professional social class background). These variables also independently and positively predicted UKCAT scores. There was a suggestion that UKCAT scores were less sensitive to educational background compared to A level attainment. In contrast to A level attainment, UKCAT score was independently and positively predicted by having English as a first language and male sex. Conclusions: Our findings are consistent with a previous report; most of the sociodemographic factors that predict A level attainment also predict UKCAT performance. However, compared to A levels, males and those speaking English as a first language perform better on UKCAT. Our findings suggest that UKCAT scores may be more influenced by sex and less sensitive to school type compared to A levels. These factors must be considered by institutions utilising the UKCAT as a component of the medical and dental school selection process.

    Development and preliminary validation of a questionnaire to measure satisfaction with home care in Greece: an exploratory factor analysis of polychoric correlations

    Background: The primary aim of this study was to develop and psychometrically test a Greek-language instrument for measuring satisfaction with home care. The first empirical evidence about the level of satisfaction with these services in Greece is also provided. Methods: The questionnaire resulted from literature search, on-site observation and cognitive interviews. It was applied in 2006 to a sample of 201 enrollees of five home care programs in the city of Thessaloniki and contains 31 items that measure satisfaction with individual service attributes and are expressed on a 5-point Likert scale. The latter has been usually considered in practice as an interval scale, although it is in principle ordinal. We thus treated the variable as an ordinal one, but also employed the traditional approach in order to compare the findings. Our analysis was therefore based on ordinal measures such as the polychoric correlation, Kendall's Tau b coefficient and ordinal Cronbach's alpha. Exploratory factor analysis was followed by an assessment of internal consistency reliability, test-retest reliability, construct validity and sensitivity. Results: Analyses with ordinal and interval scale measures produced in essence very similar results and identified four multi-item scales. Three of these were found to be reliable and valid: socioeconomic change, staff skills and attitudes and service appropriateness. A fourth dimension, service planning, had lower internal consistency reliability and yet very satisfactory test-retest reliability, construct validity and floor and ceiling effects. The global satisfaction scale created was also quite reliable. Overall, participants were satisfied, yet not very satisfied, with home care services. More room for improvement seems to exist for the socio-economic and planning aspects of care and less for staff skills and attitudes and appropriateness of provided services. Conclusions: The methods developed seem to be a promising tool for the measurement of home care satisfaction in Greece.
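
    As an illustration of the ordinal Cronbach's alpha mentioned in this abstract, the sketch below applies the standardized alpha formula to a correlation matrix. It assumes the polychoric correlations have already been estimated elsewhere; the function name and the example matrix are hypothetical and are not taken from the study.

```python
def alpha_from_correlations(corr):
    """Standardized Cronbach's alpha from a k x k correlation matrix.

    Passing a matrix of polychoric correlations gives one common
    form of ordinal alpha. Estimating the polychoric correlations
    is outside the scope of this sketch.
    """
    k = len(corr)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Mean of the off-diagonal correlations (average inter-item correlation).
    off_diagonal = [corr[i][j] for i in range(k) for j in range(k) if i != j]
    mean_r = sum(off_diagonal) / len(off_diagonal)
    # Standardized alpha: k * r_bar / (1 + (k - 1) * r_bar).
    return k * mean_r / (1.0 + (k - 1) * mean_r)


# Hypothetical 3-item correlation matrix, for illustration only.
EXAMPLE_CORR = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
example_alpha = alpha_from_correlations(EXAMPLE_CORR)
```

    Running the same function on Pearson correlations of the raw item scores would give the interval-scale counterpart, which is the kind of side-by-side comparison the abstract describes.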

    Computerized adaptive testing of population psychological distress: simulation-based evaluation of GHQ-30

    PURPOSE: Goldberg's General Health Questionnaire (GHQ) items are frequently used to assess psychological distress but no study to date has investigated the GHQ-30's potential for adaptive administration. In computerized adaptive testing (CAT) items are matched optimally to the targeted distress level of respondents instead of relying on fixed-length versions of instruments. We therefore calibrate GHQ-30 items and report a simulation study exploring the potential of this instrument for adaptive administration in a longitudinal setting. METHODS: GHQ-30 responses of 3445 participants with 2 completed assessments (baseline, 7-year follow-up) in the UK Health and Lifestyle Survey were calibrated using item response theory. Our simulation study evaluated the efficiency of CAT administration of the items, cross-sectionally and longitudinally, with different estimators, item selection methods, and measurement precision criteria. RESULTS: To yield accurate distress measurements (marginal reliability at least 0.90) nearly all GHQ-30 items need to be administered to most survey respondents in general population samples. When lower accuracy is permissible (marginal reliability of 0.80), adaptive administration saves approximately two-thirds of the items. For longitudinal applications, change scores based on the complete set of GHQ-30 items correlate highly with change scores from adaptive administrations. CONCLUSIONS: The rationale for CAT-GHQ-30 is only supported when the required marginal reliability is lower than 0.90, which is most likely to be the case in cross-sectional and longitudinal studies assessing mean changes in populations. Precise measurement of psychological distress at the individual level can be achieved, but requires the deployment of all 30 items.
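
    The idea of matching items to a respondent's distress level can be sketched with maximum-information item selection. This is a minimal illustration under stated assumptions, not the study's procedure: it assumes a two-parameter logistic (2PL) model and a latent trait with unit variance, and the item parameters and function names are hypothetical rather than GHQ-30 estimates.

```python
import math


def prob_endorse(theta, a, b):
    """2PL probability of endorsing an item at distress level theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at theta: a^2 * P * (1 - P)."""
    p = prob_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


def select_next_item(theta, items, administered):
    """Index of the not-yet-administered item most informative at theta."""
    candidates = [i for i in range(len(items)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *items[i]))


def reliability_from_information(total_information):
    """Approximate reliability at one trait level, assuming unit latent
    variance: 1 - SE^2, where SE^2 = 1 / total information."""
    return 1.0 - 1.0 / total_information


# Hypothetical (discrimination a, difficulty b) pairs, for illustration only.
ITEMS = [(1.5, -1.0), (1.2, 0.0), (2.0, 0.1), (1.8, 1.0), (0.9, 2.0)]

first_item = select_next_item(0.0, ITEMS, administered=set())
second_item = select_next_item(0.0, ITEMS, administered={first_item})
```

    Under these assumptions a reliability of 0.90 corresponds to accumulated information of 10 and a reliability of 0.80 to information of 5, which gives some intuition for why the stricter criterion requires far more items.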