72 research outputs found

    Psychometric Evaluation of Joint-Specific Patient-Reported Outcome Measures Before and After Total Knee Replacement: A Dissertation

    Background: Patient reports of pain and function are used to inform the need for and timing of total knee replacement (TKR) and evaluate TKR outcomes. This dissertation compared measurement properties of commonly used patient surveys in TKR and explored ways to develop more efficient knee-specific function measures. Methods: 1,179 FORCE-TJR patients (mean age=66.1, 61% female) completed questionnaires before and 6 months after TKR. Patient surveys included the knee-specific Knee injury and Osteoarthritis Outcome Score (KOOS) and Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) and generic SF-36 Health Survey. Tests of KOOS and WOMAC measurement properties included evaluations of scaling assumptions and reliability. Item response theory methods were used to calibrate 22 KOOS function items in one item bank; simulated computerized adaptive tests (CAT) then were used to evaluate shorter function scores customized for each patient. Validity and responsiveness of measures varying in attributes (knee-specific versus generic, longer versus shorter, CAT versus fixed-length) were compared. Results: KOOS and WOMAC scales generally met tests of scaling assumptions, although many pain items were equally strong measures of pain and physical function. Internal consistency reliability of KOOS and WOMAC scales exceeded minimum levels of 0.70 recommended for group-level comparisons across sociodemographic and clinical subgroups. Function items could be calibrated in one item bank. CAT simulations indicated that reliable knee-specific function scores could be estimated for most patients with a 55-86% reduction in respondent burden, but one-third could not achieve a reliable (≥ 0.95) CAT score post-TKR because the item bank did not include enough items measuring high function levels. KOOS and WOMAC scales were valid and responsive. Short function scales and CATs were as valid and responsive as longer KOOS and WOMAC function scales. 
The KOOS Quality of Life (QOL) scale and SF-36 Physical Component Summary discriminated best among groups evaluating themselves as improved, same or worse at 6 months. Conclusions: Results support use of the KOOS and WOMAC in TKR. Improved knee-specific function measures require new items that measure higher function levels. TKR outcomes should be evaluated with a knee-specific quality of life scale such as KOOS QOL, as well as knee-specific measures of pain and function and generic health measures.

    Cultural Issues in Using the SF-36 Health Survey in Asia: Results from Taiwan

    BACKGROUND: The feasibility of using the SF-36 in non-Western cultures is important for researchers seeking to understand cultural influences upon health status perceptions. This paper reports on the performance of the Taiwan version of the SF-36, including the implications of cultural influences. METHODS: A total of 1191 volunteer subjects from the general population answered the translated SF-36 Taiwan version, which was developed following IQOLA project protocols. RESULTS: Results from tests of scaling assumptions and reliability generally were satisfactory. Convergent validity, as assessed by comparing the SF-36 to a mental health-oriented inventory, was acceptable. Results of principal components analysis were similar to US results for many scales. However, differences were seen for the Vitality scale, which was a stronger measure of mental health than physical health in Taiwan. Results are compared to those from other Asian studies and the U.S. CONCLUSION: The results raise important questions regarding cultural influences in international studies of health status assessment. Further research into the conceptualization and components of mental health in Asian countries is warranted.

    Standardizing disease-specific quality of life measures across multiple chronic conditions: development and initial evaluation of the QOL Disease Impact Scale (QDIS(R))

    BACKGROUND: To document the development and evaluation of the Quality of life Disease Impact Scale (QDIS(R)), a measure that standardizes item content and scoring across chronic conditions and provides a summary, norm-based QOL impact score for each disease. METHODS: A bank of 49 disease impact items was constructed from previously used descriptions of health impact to represent ten frequently measured quality of life (QOL) content areas and operational definitions successfully utilized in generic QOL surveys. In contrast to health in general, all items were administered with attribution to a specific disease (osteoarthritis, rheumatoid arthritis, angina, myocardial infarction, congestive heart failure, chronic kidney disease (CKD), diabetes, asthma, or COPD). Responses from 5418 adults were analyzed as five disease groups: arthritis, cardiovascular, CKD, diabetes, and respiratory. Unidimensionality, item parameter and scale-level invariance, reliability, validity and responsiveness to change during 9-month follow-up were evaluated by disease group and for all groups combined using multi-group confirmatory factor analysis (MGCFA), item response theory (IRT) and analysis of variance methods. QDIS was normed in an independent chronically ill US population sample (N = 4120). RESULTS: MGCFA confirmed a 1-factor model, justifying a summary score estimated using equal parameters for each item across disease groups. In support of standardized IRT-based scoring, correlations were very high between disease-specific and standardized IRT item slopes (r = 0.88-0.96), thresholds (r = 0.93-0.99) and person-level scores (r ≥ 0.99). Internal consistency, test-retest and person-level IRT reliability were consistently satisfactory across groups. 
In support of interpreting QDIS as a disease-specific measure, in comparison with generic measures, QDIS consistently discriminated markedly better across disease severity levels, correlated higher with other disease-specific measures in cross-sectional tests, and was more responsive in comparisons of groups with better, same or worse evaluations of disease-specific outcomes at the 9-month follow-up. CONCLUSIONS: Standardization of content and scoring across diseases was shown to be justified psychometrically and enabled the first summary measure of disease-specific QOL impact normed in the chronically ill population. This disease-specific approach substantially improves discriminant validity and responsiveness over generic measures and provides a basis for better understanding the relative QOL impact of multiple chronic conditions in research and clinical practice.

    Evaluation of smoking-specific and generic quality of life measures in current and former smokers in Germany and the United States

    BACKGROUND: Health-related quality of life (QOL) surveys include generic measures that enable comparisons across conditions and measures that focus more specifically on one disease or condition. We evaluated the psychometric properties of German- and English-language versions of survey scales representing both types of measures in samples of current and former smokers. METHODS: TQOLITv1 integrates new measures of smoking-specific symptoms and QOL impact attributed to smoking with generic SF-36 Health Survey measures. For purposes of evaluation, cross-sectional data were analyzed for two independent samples. Disease-free (otherwise healthy) adults ages 23-55 used a tablet to complete surveys in a clinical trial in Germany (125 current and 54 former smokers). Online general population surveys were completed in the US by otherwise healthy current and former smokers (N = 149 and 110, respectively). Evaluations included psychometric tests of assumptions underlying scale construction and scoring, score distributions, and reliability. Tests of validity included cross-sectional correlations and analyses of variance based on a conceptual framework and hypotheses for groups differing in self-reported smoking behavior (current versus former smoker, cigarettes per day (CPD)) and severity of smoking symptoms in both samples and, in the German trial only, clinical parameters of biomarkers of exposure. RESULTS: Tests of scaling assumptions and internal consistency reliability (alpha = 0.71-0.79) of the smoking-specific measures were satisfactory, although ceiling effects attenuated correlations for former smokers in both samples. Correlational evidence supporting validity of smoking-specific symptom and impact measures included their substantial inter-correlation and higher correlations (than generic measures) with smoking behavior (favoring former over current groups) and CPD in both samples. 
In the German trial, both smoking-specific measures correlated significantly (p < 0.05) with all four biomarkers. QOL impact attributed to smoking correlated with the SF-36 mental but not physical summary measures in both samples. CONCLUSIONS: German- and English-language TQOLITv1 surveys have comparable and satisfactory psychometric properties. Cross-sectional tests, including correlations with four biomarkers, support the validity of the new smoking-specific measures for use in studies of otherwise healthy smokers. Smoking-specific measures consistently performed better than generic QOL measures in all tests of validity.
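
The internal consistency reliability reported above as "alpha" is conventionally Cronbach's alpha; that the authors used exactly this coefficient is an assumption here. As a minimal sketch of how such a value is computed, the function below applies the standard formula, alpha = k/(k-1) × (1 − Σ item variances / variance of the summed score). The function name `cronbach_alpha` and the small response matrix are hypothetical illustrations, not data from the study.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a scale.

    item_scores: list of respondents, each a list of k item scores.
    """
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in item_scores]) for j in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses from five people to a three-item scale.
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 5],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

Values at or above 0.70 are the threshold usually cited for group-level comparisons, which is the benchmark the abstracts in this listing refer to.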

    Psychometric evaluation of the SF-36 health survey in Medicare managed care

    Data quality and scoring assumptions for the SF-36 Health Survey were evaluated among the elderly and disabled, using 1998 Cohort I baseline Medicare HOS data (n=177,714). Missing data rates were low, and scoring assumptions were met. Internal consistency reliability was 0.83 to 0.93 for the eight scales and 0.94 and 0.89, respectively, for the physical (PCS) and mental (MCS) component summary measures. Results declined with increased risk factors (e.g., older age, more chronic conditions), but were well above accepted standards for all subgroups. These findings support using standard algorithms for scoring the SF-36 in the HOS and subgroup analyses of HOS data.

    Confirmatory factor analysis of the thyroid-related quality of life questionnaire ThyPRO

    Background and aim. Thyroid diseases are prevalent and chronic. With treatment, quality of life is restored in most, but not all patients. Construct validity of the thyroid-related quality of life questionnaire, ThyPRO, has been established by multi-trait scaling, but not evaluated with more elaborate methods. The purpose of the present study was to evaluate dimensionality of the ThyPRO scales and to attempt to understand possible item misfit through structural equation modeling for categorical data. Methods. The current 84-item version of ThyPRO consists of 13 scales, covering domains of physical (4 scales) and mental (2 scales) symptoms, function and well-being (3 scales) and participation/social function (4 scales). The data were collected from a cross-sectional sample of 907 thyroid patients. One-factor confirmatory models were fitted to each scale, and evaluated by model fit statistics (comparative fit index > 0.95, root mean square error of approximation < 0.08), magnitude of factor loadings, model residual correlations and modification indices (MI). Indications of multi-dimensionality were tested in bi-factor models. Possible item misfit was evaluated in a combined, investigational model. Results. Each ThyPRO scale was adequately represented by a unidimensional model after minor revisions. Eleven items were identified in the unidimensional models as potentially misfitting and were investigated further by multidimensional modeling. Conclusion. Elaborate psychometric modeling supported the construct validity of the ThyPRO. However, 11 potentially misfitting items and 18 items with local dependence to other items are candidates for removal in future item reduction processes.

    Varying the item format improved the range of measurement in patient-reported outcome measures assessing physical function

    BACKGROUND: Physical function (PF) is a core patient-reported outcome domain in clinical trials in rheumatic diseases. Frequently used PF measures have ceiling effects, leading to large sample size requirements and low sensitivity to change. In most of these instruments, the response category that indicates the highest PF level is the statement that one is able to perform a given physical activity without any limitations or difficulty. This study investigates whether using an item format with an extended response scale, allowing respondents to state that the performance of an activity is easy or very easy, increases the range of precise measurement of self-reported PF. METHODS: Three five-item PF short forms were constructed from the Patient-Reported Outcomes Measurement Information System (PROMIS(R)) wave 1 data. All forms included the same physical activities but varied in item stem and response scale: format A ("Are you able to ..."; "without any difficulty" / "unable to do"); format B ("Does your health now limit you ..."; "not at all" / "cannot do"); format C ("How difficult is it for you to ..."; "very easy" / "impossible"). Each short-form item was answered by 2217-2835 subjects. We evaluated unidimensionality and estimated a graded response model for the 15 short-form items and remaining 119 items of the PROMIS PF bank to compare item and test information for the short forms along the PF continuum. We then used simulated data for five groups with different PF levels to illustrate differences in scoring precision between the short forms using different item formats. RESULTS: Sufficient unidimensionality of all short-form items and the original PF item bank was supported. Compared to formats A and B, format C increased the range of reliable measurement by about 0.5 standard deviations on the positive side of the PF continuum of the sample, provided more item information, and was more useful in distinguishing known groups with above-average functioning. 
CONCLUSIONS: Using an item format with an extended response scale is an efficient option to increase the measurement range of self-reported physical function without changing the content of the measure or affecting the latent construct of the instrument.

    Survivors of an Acute Coronary Syndrome with Lower Patient Activation Are More Likely to Experience Declines in Health-Related Quality of Life

    Background: Patient activation comprises the knowledge, skills, and confidence for self-care, and may lead to better health outcomes. Objectives: We examined the relationship between patient activation and changes in health-related quality of life (HRQOL) following hospitalization for an acute coronary syndrome (ACS). Methods: We studied patients from 6 medical centers in central Massachusetts and Georgia who had been hospitalized for an ACS between 2011 and 2013. At 1 month after hospital discharge, patients completed the 6-item Patient Activation Measure and were categorized into 4 levels of activation. Multinomial logistic regression analyses compared activation level with clinically meaningful changes (≥ 3.0 points generic, ≥ 10.0 points disease-specific) in generic physical (SF-36 PCS), generic mental (SF-36 MCS), and disease-specific (Seattle Angina Questionnaire, SAQ) HRQOL from 1 to 3 and 1 to 6 months after hospitalization, adjusting for potential sociodemographic and clinical confounders. Results: Patients (n=1,042) were on average 62 years old, 34% female, and 87% non-Hispanic white. Overall, 10% were in the lowest level of activation. Patients with the lowest activation had 1.95 (95% CI: 1.05, 3.62) and 2.18 (95% CI: 1.17, 4.05) times the odds of experiencing clinically significant declines in MCS and SAQ QOL scores, respectively, between 1 and 6 months than the most activated patients. Patient activation level was not associated with meaningful changes in PCS scores. Conclusions: Hospital survivors of an ACS with lower activation may be more likely to experience declines in mental and disease-specific HRQOL than more activated patients, identifying a group at risk of poor outcomes.

    Better assessment of physical function: item improvement is neglected but essential

    INTRODUCTION: Physical function is a key component of patient-reported outcome (PRO) assessment in rheumatology. Modern psychometric methods, such as Item Response Theory (IRT) and Computerized Adaptive Testing, can materially improve measurement precision at the item level. We present the qualitative and quantitative item-evaluation process for developing the Patient Reported Outcomes Measurement Information System (PROMIS) Physical Function item bank. METHODS: The process was stepwise: we searched extensively to identify extant Physical Function items and then classified and selectively reduced the item pool. We evaluated retained items for content, clarity, relevance and comprehension, reading level, and translation ease by experts and patient surveys, focus groups, and cognitive interviews. We then assessed items by using classic test theory and IRT, used confirmatory factor analyses to estimate item parameters, and graded response modeling for parameter estimation. We retained the 20 Legacy (original) Health Assessment Questionnaire Disability Index (HAQ-DI) and the 10 SF-36's PF-10 items for comparison. Subjects were from rheumatoid arthritis, osteoarthritis, and healthy aging cohorts (n = 1,100) and a national Internet sample of 21,133 subjects. RESULTS: We identified 1,860 items. After qualitative and quantitative evaluation, 124 newly developed PROMIS items composed the PROMIS item bank, which included revised Legacy items with good fit that met IRT model assumptions. Results showed that the clearest and best-understood items were simple, in the present tense, and straightforward. Basic tasks (like dressing) were more relevant and important versus complex ones (like dancing). Revised HAQ-DI and PF-10 items with five response options had higher item-information content than did comparable original Legacy items with fewer response options. 
IRT analyses showed that the Physical Function domain satisfied general criteria for unidimensionality with one-, two-, three-, and four-factor models having comparable model fits. Correlations between factors in the test data sets were > 0.90. CONCLUSIONS: Item improvement must underlie attempts to improve outcome assessment. The clear, personally important and relevant, ability-framed items in the PROMIS Physical Function item bank perform well in PRO assessment. They will benefit from further study and application in a wider variety of rheumatic diseases in diverse clinical groups, including those at the extremes of physical functioning, and in different administration modes.