Re-evaluating a vision-related quality of life questionnaire with item response theory (IRT) and differential item functioning (DIF) analyses
Background: For the Low Vision Quality Of Life questionnaire (LVQOL) it is unknown whether the psychometric properties are satisfactory when an item response theory (IRT) perspective is considered. This study evaluates some essential psychometric properties of the LVQOL questionnaire in an IRT model, and investigates differential item functioning (DIF). Methods: Cross-sectional data were used from an observational study among visually impaired patients (n = 296). Calibration was performed for every dimension of the LVQOL in the graded response model. Item goodness-of-fit was assessed with the S-X2 test. DIF was assessed on relevant background variables (i.e. age, gender, visual acuity, eye condition, rehabilitation type and administration type) with likelihood-ratio tests for DIF. The magnitude of DIF was interpreted by assessing the largest difference in expected scores between subgroups. Measurement precision was assessed by presenting test information curves; reliability with the index of subject separation. Results: All items of the LVQOL dimensions fitted the model. There was significant DIF on several items. For two items the maximum difference between expected scores exceeded one point, and DIF was found on multiple relevant background variables. Item 1 ‘Vision in general’ from the “Adjustment” dimension and item 24 ‘Using tools’ from the “Reading and fine work” dimension were removed. Test information was highest for the “Reading and fine work” dimension. Indices for subject separation ranged from 0.83 to 0.94. Conclusions: The items of the LVQOL showed satisfactory item fit to the graded response model; however, two items were removed because of DIF. The adapted LVQOL with 21 items is DIF-free and therefore seems highly appropriate for use in heterogeneous populations of visually impaired patients.
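As an illustrative sketch (not the study's code), the DIF-magnitude criterion used above — the largest difference in expected item scores between subgroups under the graded response model — can be computed from item parameters; the discrimination and threshold values below are hypothetical:

```python
import numpy as np

def grm_expected_score(theta, a, b):
    """Expected item score under the graded response model.

    theta: latent trait values; a: discrimination; b: ascending
    category thresholds. For a 0..K scored item, the expected score
    equals the sum over thresholds of P(X >= k | theta).
    """
    theta = np.asarray(theta, dtype=float)
    b = np.asarray(b, dtype=float)
    # Cumulative category probabilities P(X >= k) for k = 1..K
    p_star = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b[None, :])))
    return p_star.sum(axis=1)

# Hypothetical parameters for a reference and a focal subgroup
theta = np.linspace(-4.0, 4.0, 161)
ref = grm_expected_score(theta, a=1.8, b=[-1.5, 0.0, 1.2])
foc = grm_expected_score(theta, a=1.8, b=[-1.0, 0.5, 1.7])

# DIF magnitude: largest expected-score gap across the trait range
max_diff = np.max(np.abs(ref - foc))
```

Under the study's criterion, an item would be flagged for removal when this maximum gap exceeds one score point on multiple relevant background variables.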
Unpacking the impact of chronic pain as measured by the impact stratification score
Background: In 2014, the National Institutes of Health Pain Consortium's research task force on research standards for chronic low back pain (CLBP) proposed a measure that could be used to stratify patients by the impact CLBP has on their lives, namely the Impact Stratification Score (ISS). This study examines the dimensionality of the ISS and support for its single total score, and evaluates its overall psychometric properties. Methods: The sample of 1677 chiropractic patients being treated for CLBP and chronic neck pain had an average age of 49 and was 71% female and 90% White. Study participants completed the PROMIS-29 v2.1 profile survey that contains the 9 ISS items. The ISS was evaluated using item-total correlations, Cronbach's alpha, factor analysis (i.e., correlated factors and bifactor models), and item response theory (IRT). Reliability indices and item properties were evaluated from the bifactor and IRT models, respectively. Results: Item-total correlations were high (0.64-0.84), with a Cronbach's alpha of 0.93. Eigenvalues suggested the possibility of two factors corresponding to physical function and pain interference/intensity. Bifactor model results indicated that the data were essentially unidimensional, primarily reflecting one general construct (i.e., impact), and that after accounting for 'impact' very little reliable variance remained in the two group factors. General impact scores were reliable (omegaH = .73). IRT models showed that the items were strong indicators of impact, provided information across a wide range of the impact continuum, and offer the possibility of a shorter 8-item ISS. Finally, it appears that different aspects of pain interference occur prior to losses in physical function. Conclusions: This study presents evidence that the ISS is sufficiently unidimensional, covers a range of chronic pain impact, and is a reliable measure. Insights are obtained into the sequence of chronic pain impacts on patients' lives.
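The classical-test-theory statistics reported above (corrected item-total correlations and Cronbach's alpha) can be sketched with plain numpy; this is a generic illustration on simulated data, not the study's analysis:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_persons, n_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)

def corrected_item_total(items):
    """Correlation of each item with the sum of the remaining items."""
    items = np.asarray(items, dtype=float)
    total = items.sum(axis=1)
    return np.array([
        np.corrcoef(items[:, j], total - items[:, j])[0, 1]
        for j in range((items.shape[1]))
    ])

# Simulated 5-item scale: one common factor plus item-level noise
rng = np.random.default_rng(0)
common = rng.normal(size=200)
items = np.column_stack(
    [common + 0.5 * rng.normal(size=200) for _ in range(5)]
)
alpha = cronbach_alpha(items)
item_total = corrected_item_total(items)
```

High item-total correlations with a high alpha, as in the abstract (0.64-0.84 and 0.93), are consistent with a scale dominated by one general construct.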
Analyses of Cross-Sectional Data to Link the PEG With the Patient-Reported Outcomes Measurement Information System (PROMIS) Global Physical Health Scale.
It is challenging to synthesize findings across studies of pain impact. This study develops a link to estimate the Patient-Reported Outcomes Measurement Information System (PROMIS) global health measure from the 3-item Pain intensity, interference with Enjoyment of life, interference with General activity (PEG) scale. The PROMIS and PEG items were administered to 795 adults (average age = 51; 54% female, 79% White). We estimated correlations among the PEG and PROMIS items and conducted factor analysis to identify the best subset of PROMIS items for linking to the PEG. An item response theory graded response model was estimated to link the PEG with the 4-item PROMIS global physical health scale. A categorical single-factor model and a bifactor model provided support for a single dimension for the PEG and PROMIS global physical health items. The product-moment correlation between the PROMIS global physical health scale score estimated from the PEG and the actual global physical health score was .74. The mean difference between the estimated and observed global physical health scores was less than a T-score point. This study makes it possible to estimate average global physical health for group-level comparisons in research that includes the PEG. PERSPECTIVE: This article describes an empirical link of the PEG to the PROMIS global physical health scale that makes it possible to estimate average global physical health in studies that include the PEG. This link can facilitate comparisons among studies that have administered the PEG but not the PROMIS global health scale, or vice versa.
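The two agreement checks reported for the link — the product-moment correlation and the mean estimated-minus-observed difference on the T-score metric — are simple to compute; this is a generic sketch, not the study's code:

```python
import numpy as np

def link_agreement(estimated, observed):
    """Group-level agreement checks for a linked score.

    Returns the product-moment correlation and the mean difference
    (estimated minus observed); on the PROMIS T-score metric a mean
    difference under 1 point supports group-level use of the link.
    """
    estimated = np.asarray(estimated, dtype=float)
    observed = np.asarray(observed, dtype=float)
    r = np.corrcoef(estimated, observed)[0, 1]
    mean_diff = estimated.mean() - observed.mean()
    return r, mean_diff
```

For example, `link_agreement(est_t, obs_t)` on two arrays of T-scores returns the pair of summary statistics quoted in the abstract (.74 and under one T-score point for this study's data).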
Mapping of the PROMIS global health measure to the PROPr in the United States
Abstract:
Background:
The Patient-Reported Outcomes Measurement Information System (PROMIS®) global health items (global-10) yield physical and mental health scale scores; the PROMIS-Preference (PROPr) scoring system, estimated from PROMIS domain scores (e.g., the PROMIS-29 + 2), produces a single score anchored by 0 (dead or as bad as being dead) and 1 (full health). A link between the PROMIS global-10 and the PROPr is needed.
Methods:
The PROMIS-29 + 2 and the PROMIS global-10 were administered to 4102 adults in the Ipsos KnowledgePanel in 2022. The median age was 52 (range 18–94), 50% were female, 70% were non-Hispanic White, and 64% were married or living with a partner. The highest level of education completed for 26% of the sample was a high school degree or general education diploma and 44% worked full-time. We estimated correlations of the PROPr with the PROMIS global health items and the global physical and mental health scales. We examined the adjusted R2 and estimated correlations between predicted and observed PROPr scores.
Results:
Product-moment correlations between the PROMIS global health items and the PROPr ranged from 0.50 to 0.63. The PROMIS global physical health and mental health scale scores correlated 0.74 and 0.60, respectively, with the PROPr. The adjusted R2 in the regression of the PROPr on the PROMIS global health items was 64%. The equated PROPr preference scores correlated (product-moment) 0.80 (n = 4043; p < 0.0001) with the observed PROPr preference scores, and the intra-class correlation (two-way random effects model) was 0.80. The normalized mean absolute error (NMAE) was 0.45 (SD = 0.43). The adjusted R2 in the OLS regression of the PROPr on the PROMIS global health scales was 59%. The equated PROPr preference scores correlated (product-moment) 0.77 (n = 4046; p < 0.0001) with the observed PROPr preference scores, and the intra-class correlation was 0.77. The NMAE was 0.49 (SD = 0.45).
Conclusions:
Regression equations provide a reasonably accurate estimate of the PROPr preference-based score from the PROMIS global health items or scales for group-level comparisons. These estimates facilitate cost-effectiveness research and meta-analyses. The estimated PROPr scores are not accurate enough for individual-level applications. Future evaluations of the prediction equations are needed.
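The mapping workflow above (OLS prediction equation, then evaluation by correlation, intra-class correlation, and NMAE) can be sketched in numpy. This is an illustration under stated assumptions, not the published equations: the ICC shown is ICC(2,1) (two-way random effects, absolute agreement), and the NMAE is assumed to normalize absolute error by the SD of observed scores, which the abstract does not spell out:

```python
import numpy as np

def ols_crosswalk(x, y):
    """Fit y ~ x by ordinary least squares; returns (intercept, slopes...)."""
    design = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

def icc_2_1(a, b):
    """ICC(2,1): two-way random effects, absolute agreement, 2 measurements."""
    data = np.column_stack([a, b]).astype(float)
    n, k = data.shape
    row_means = data.mean(axis=1)
    col_means = data.mean(axis=0)
    grand = data.mean()
    msr = k * np.sum((row_means - grand) ** 2) / (n - 1)
    msc = n * np.sum((col_means - grand) ** 2) / (k - 1)
    resid = data - row_means[:, None] - col_means[None, :] + grand
    mse = np.sum(resid ** 2) / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

def nmae(estimated, observed):
    """Mean absolute error, normalized (here, by SD of observed scores)."""
    err = np.abs(np.asarray(estimated, float) - np.asarray(observed, float))
    return (err / np.std(observed, ddof=1)).mean()
```

A crosswalk fit with `ols_crosswalk` on the global health items would then be judged by `icc_2_1` and `nmae` against the observed PROPr, as in the abstract's evaluation.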
How Well Do Seven Self-Report Measures Represent Underlying Back Pain Impact?
Background: The extent to which different measures of back pain impact represent an underlying common factor has implications for decisions about which one to use in studies of pain management and for estimating one score from others. Aims: To determine if different self-report back pain impact measures represent an underlying pain latent variable, and to estimate their associations with it. Method: Seven pain impact measures completed by Amazon Mechanical Turk adults are used to estimate internal consistency reliability and associations: the Oswestry Disability Index (ODI), Roland-Morris Disability Questionnaire (RMDQ), short form of the Örebro Musculoskeletal Pain Questionnaire (OMPQ), Subgroups for Targeted Treatment (STarT) Back Tool, Graded Chronic Pain Scale (GCPS) disability score, PEG (Pain intensity, interference with Enjoyment of life, interference with General activity), and Impact Stratification Score (ISS). Results: The sample of 1,874 adults with back pain had an average age of 41 and 52% were female. Sixteen percent were Hispanic, 7% non-Hispanic Black, 5% non-Hispanic Asian, and 71% non-Hispanic White. Internal consistency reliability estimates ranged from 0.710 (OMPQ) to 0.923 (GCPS). Correlations among the measures ranged from 0.609 (RMDQ with OMPQ) to 0.812 (PEG with GCPS). Standardized factor loadings on the pain latent variable ranged from 0.782 (RMDQ) to 0.870 (ISS). Conclusions: Scores of each measure can be estimated from the others for use in research.
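One simple way to estimate one measure's score from another, when both load on a common factor, is linear equating: rescale the source measure's z-scores to the target measure's mean and SD. This is a minimal sketch of that idea (the study itself may have used a different linking method):

```python
import numpy as np

def linear_equate(scores, target_mean, target_sd):
    """Linearly equate scores onto a target measure's scale.

    Standardizes the source scores, then rescales them so the
    equated scores have the target mean and SD.
    """
    scores = np.asarray(scores, dtype=float)
    z = (scores - scores.mean()) / scores.std(ddof=1)
    return target_mean + target_sd * z
```

For example, `linear_equate(odi_scores, rmdq_mean, rmdq_sd)` would place hypothetical ODI scores on the RMDQ scale; the strength of the factor loadings above is what determines how trustworthy such a conversion is.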
Classifying patients with non-specific chronic low back pain using the impact stratification score in an online convenience sample
Background: In 2014, the National Institutes of Health Pain Consortium’s research task force (RTF) on research standards for chronic low back pain (CLBP) proposed the Impact Stratification Score (ISS) as a patient-reported outcome measure that could stratify patients by the impact CLBP has on their lives. This work compares three newly developed ISS-based classifications to the RTF’s original to provide an optimal recommendation. Methods: The online sample included 1226 individuals from Amazon’s Mechanical Turk who indicated having non-specific CLBP; average age was 40, 49% were female, and 67% were White. Participants completed the PROMIS-29 v2.1 profile survey, which contains the 9 ISS items, as well as the Roland-Morris Disability Questionnaire (RMDQ) and Graded Chronic Pain Scale (GCPS). Other items included high-impact chronic pain; not working due to health problems; overall health; and number of healthcare visits for back pain in the past 6 months. Three new classifications were created using quartiles (Classification 2), latent profile analysis (Classification 3), and one modeled after the GCPS (Classification 4). Classifications were subsequently compared to the RTF-proposed classification (Classification 1) on several concurrent and prognostic criteria. Results: Classification 1 had three CLBP severity groups, Classification 2 had four, Classification 3 three, and Classification 4 four. All novel classifications improved upon the original. Classification 2 performed best at minimizing the classification of those with negative outcomes into the lowest severity groups at baseline (e.g., 11% with RMDQ ≥ 7) and at 6 months (e.g., 8.2% had fair/poor health). Classification 4 performed best at maximizing classification of those with negative outcomes into the most severe group concurrently (e.g., 100% had GCPS grade ≥ 2) and at 6 months (e.g., 100% with RMDQ ≥ 7). Conclusions: We developed three ISS-based classification schemes and tested them against several outcomes. All three improved upon the original scheme. While appearing more optimal than other classifications in the lowest severity groups, Classification 2 presents some considerations and limitations. Given that Classification 4 was an improvement at the lowest end of severity and was the best at the highest end, it is our tentative recommendation that this approach be adopted to classify individuals with non-specific CLBP.
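Of the schemes compared, the quartile-based one (Classification 2) is the most mechanical to reproduce: cut the ISS distribution at its sample quartiles and assign four severity groups. A minimal sketch, assuming simple quartile cut-points (the study's exact cut-points may differ):

```python
import numpy as np

def quartile_classes(iss):
    """Assign 4 severity groups by sample quartiles of the ISS.

    Returns integer group labels 1 (least severe) through 4 (most
    severe), with cut-points at the 25th, 50th, and 75th percentiles.
    """
    iss = np.asarray(iss, dtype=float)
    cuts = np.quantile(iss, [0.25, 0.50, 0.75])
    return np.digitize(iss, cuts) + 1
```

By construction this yields roughly equal-sized groups in the calibration sample, which is one of the trade-offs (relative to clinically anchored cut-points like the GCPS-modeled scheme) that the comparison above weighs.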
Effects of Excluding Those Who Report Having “Syndomitis” or “Chekalism” on Data Quality: Longitudinal Health Survey of a Sample From Amazon’s Mechanical Turk
Background: Researchers have implemented multiple approaches to increase data quality from existing web-based panels such as Amazon's Mechanical Turk (MTurk). Objective: This study extends prior work by examining improvements in data quality, and effects on mean estimates of health status, from excluding respondents who endorse 1 or both of 2 fake health conditions ("Syndomitis" and "Chekalism"). Methods: Survey data were collected in 2021, at baseline and 3 months later, from MTurk study participants who were aged 18 years or older, had an internet protocol address in the United States, and had completed a minimum of 500 previous MTurk "human intelligence tasks." We included questions about demographic characteristics, health conditions (including the 2 fake conditions), and the Patient-Reported Outcomes Measurement Information System (PROMIS)-29+2 (version 2.1) preference-based score survey. The 3-month follow-up survey was only administered to those who reported having back pain and did not endorse a fake condition at baseline. Results: In total, 15% (996/6832) of the sample endorsed at least 1 of the 2 fake conditions at baseline. Those who endorsed a fake condition at baseline were more likely to identify as male, identify as non-White, and be younger, and they reported more health conditions and took longer to complete the survey than those who did not endorse a fake condition. They also had substantially lower internal consistency reliability on the PROMIS-29+2 scales than those who did not endorse a fake condition: physical function (0.69 vs 0.89), pain interference (0.80 vs 0.94), fatigue (0.80 vs 0.92), depression (0.78 vs 0.92), anxiety (0.78 vs 0.90), sleep disturbance (-0.27 vs 0.84), ability to participate in social roles and activities (0.77 vs 0.92), and cognitive function (0.65 vs 0.77). The lack of reliability of the sleep disturbance scale for those endorsing a fake condition was because it includes both positively and negatively worded items. Those who reported a fake condition also reported significantly worse self-reported health scores (except for sleep disturbance) than those who did not. Excluding those who endorsed a fake condition improved the overall mean PROMIS-29+2 (version 2.1) T-scores by 1-2 points and the PROMIS preference-based score by 0.04. Among follow-up respondents, none of whom had endorsed a fake condition at baseline, 6% (n=59) endorsed at least 1 of the fake conditions on the 3-month survey, and they had lower PROMIS-29+2 internal consistency reliability and worse mean scores on the 3-month survey than those who did not report having a fake condition. Based on these results, we estimate that 25% (1708/6832) of the MTurk respondents provided careless or dishonest responses. Conclusions: This study provides evidence that asking about fake health conditions can help screen out respondents who may be dishonest or careless. We recommend this approach be used routinely in MTurk samples.
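The screening step itself is a simple exclusion rule: flag any respondent who endorses at least one fake condition and drop them before analysis. A minimal sketch, assuming a 0/1 condition-endorsement matrix with known columns for the fake conditions (column indices here are hypothetical):

```python
import numpy as np

def screen_fake_conditions(responses, fake_cols):
    """Drop respondents endorsing any listed fake condition.

    responses: (n_respondents, n_conditions) 0/1 endorsement matrix.
    fake_cols: column indices of the fake conditions (e.g., the
    "Syndomitis" and "Chekalism" items).
    Returns the retained rows and the fraction of respondents flagged.
    """
    responses = np.asarray(responses)
    flagged = responses[:, fake_cols].astype(bool).any(axis=1)
    return responses[~flagged], flagged.mean()
```

Re-computing scale reliabilities and mean T-scores on the retained rows is what produces the before/after contrasts reported above (e.g., the 1-2 point T-score shift after exclusion).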