334 research outputs found

    Reproducibility of goniometric measurement of the knee in the in-hospital phase following total knee arthroplasty

    Background: The objective of the present study was to assess interobserver reproducibility (in terms of reliability and agreement) of active and passive measurements of knee range of motion (RoM) using a long-arm goniometer, performed by trained physical therapists in a clinical setting in total knee arthroplasty patients, within the first four days after surgery. Methods: Test-retest analysis. Setting: University hospital departments of orthopaedics and physical therapy. Participants: Two experienced physical therapists assessed 30 patients, three days after total knee arthroplasty. Main outcome measure: RoM measurement using a long-arm (50 cm) goniometer. Agreement was calculated as the mean difference between observers ± 95% CI of this mean difference. The intraclass correlation coefficient (ICC) was calculated as a measure of reliability, based on two-way random effects analysis of variance. Results: The lowest level of agreement was found for measurement of passive flexion with the patient in supine position (mean difference 1.4°; limits of agreement -16.2° to 19° for the difference between the two observers). The highest levels of agreement were found for measurement of passive flexion with the patient in sitting position and for measurement of passive extension (mean difference 2.7°; limits of agreement -6.7° to 12.1°, and mean difference 2.2°; limits of agreement -6.2° to 10.6°, respectively). The ability to differentiate between subjects ranged from 0.62 for measurement of passive extension to 0.89 for measurements of active flexion (ICC values). Conclusion: Interobserver agreement for flexion as well as extension was only fair. When two different observers assess the same patients in the acute phase after total knee arthroplasty using a long-arm goniometer, differences in RoM of less than eight degrees cannot be distinguished from measurement error. Reliability was found to be acceptable for comparison on group level, but poor for individual comparisons over time.
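
The limits of agreement reported in this abstract can be illustrated with a minimal sketch. It assumes the common Bland-Altman convention (mean difference ± 1.96 × SD of the paired differences); the abstract does not state the exact formula used, and the readings below are hypothetical, not study data.

```python
from statistics import mean, stdev

def limits_of_agreement(obs_a, obs_b, z=1.96):
    """Return (mean difference, lower limit, upper limit) for paired readings.

    Assumes Bland-Altman style limits: mean difference +/- z * SD of differences.
    """
    diffs = [a - b for a, b in zip(obs_a, obs_b)]
    d = mean(diffs)
    sd = stdev(diffs)  # sample SD of the paired differences
    return d, d - z * sd, d + z * sd

# Hypothetical flexion readings (degrees) from two observers on four patients.
observer_1 = [90, 85, 100, 95]
observer_2 = [88, 86, 95, 93]
mean_diff, lower, upper = limits_of_agreement(observer_1, observer_2)
print(f"mean difference {mean_diff:.1f}, limits {lower:.1f} to {upper:.1f}")
```

Under the same assumption, the reported limits for passive flexion in sitting position (2.7° ± 9.4°) would correspond to an SD of the differences of roughly 4.8°.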

    Rating the methodological quality in systematic reviews of studies on measurement properties: a scoring system for the COSMIN checklist

    Background: The COSMIN checklist is a standardized tool for assessing the methodological quality of studies on measurement properties. It contains 9 boxes, each dealing with one measurement property, with 5-18 items per box about design aspects and statistical methods. Our aim was to develop a scoring system for the COSMIN checklist to calculate quality scores per measurement property when using the checklist in systematic reviews of measurement properties. Methods: The scoring system was developed based on discussions among experts and testing of the scoring system on 46 articles from a systematic review. Four response options were defined for each COSMIN item (excellent, good, fair, and poor). A quality score per measurement property is obtained by taking the lowest rating of any item in a box ("worst score counts"). Results: Specific criteria for excellent, good, fair, and poor quality for each COSMIN item are described. In defining the criteria, the "worst score counts" algorithm was taken into consideration. This means that only fatal flaws were defined as poor quality. The scores of the 46 articles show how the scoring system can be used to provide an overview of the methodological quality of studies included in a systematic review of measurement properties. Conclusions: Based on experience in testing this scoring system on 46 articles, the COSMIN checklist with the proposed scoring system seems to be a useful tool for assessing the methodological quality of studies included in systematic reviews of measurement properties. © The Author(s) 2011
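
The "worst score counts" rule described in this abstract can be sketched as follows. The four rating labels come from the abstract; the function name and the example ratings are hypothetical.

```python
# Ratings ordered from worst to best, as named in the abstract.
RATING_ORDER = ["poor", "fair", "good", "excellent"]

def box_score(item_ratings):
    """Quality score for one COSMIN box: the lowest rating of any item in it."""
    return min(item_ratings, key=RATING_ORDER.index)

# Hypothetical item ratings for one box of one article.
print(box_score(["excellent", "good", "fair", "excellent"]))
```

Because a single low item determines the box score, the abstract notes that only fatal flaws were defined as "poor".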

    A multivariate hierarchical Bayesian approach to measuring agreement in repeated measurement method comparison studies

    Background. Assessing agreement in method comparison studies depends on two fundamentally important components: validity (the between-method agreement) and reproducibility (the within-method agreement). The Bland-Altman limits of agreement technique is one of the favoured approaches in the medical literature for assessing between-method validity. However, few researchers have adopted this approach for the assessment of both validity and reproducibility. This may be partly due to the lack of flexible, easily implemented and readily available statistical machinery to analyse repeated measurement method comparison data. Methods. Adopting the Bland-Altman framework, but using Bayesian methods, we present this statistical machinery. Two multivariate hierarchical Bayesian models are advocated, one which assumes that the underlying values for subjects remain static (exchangeable replicates) and one which assumes that the underlying values can change between repeated measurements (non-exchangeable replicates). Results. We illustrate the salient advantages of these models using two separate datasets that have been previously analysed and presented: (i) assuming static underlying values, analysed using both multivariate hierarchical Bayesian models, and (ii) assuming each subject's underlying value is a continually changing quantity, analysed using the non-exchangeable replicate multivariate hierarchical Bayesian model. Conclusion. These easily implemented models allow for full parameter uncertainty and simultaneous method comparison, handle unbalanced or missing data, and provide estimates and credible regions for all the parameters of interest. Computer code for the analyses is also presented, provided in the freely available software package WinBUGS.

    The COSMIN checklist for assessing the methodological quality of studies on measurement properties of health status measurement instruments: an international Delphi study

    BACKGROUND: The aim of the COSMIN study (COnsensus-based Standards for the selection of health status Measurement INstruments) was to develop a consensus-based checklist to evaluate the methodological quality of studies on measurement properties. We present the COSMIN checklist and the agreement of the panel on the items of the checklist. METHODS: A four-round Delphi study was performed with international experts (psychologists, epidemiologists, statisticians and clinicians). Of the 91 invited experts, 57 agreed to participate (63%). Panel members were asked to rate their (dis)agreement with each proposal on a five-point scale. Consensus was considered to be reached when at least 67% of the panel members indicated 'agree' or 'strongly agree'. RESULTS: Consensus was reached on the inclusion of the following measurement properties: internal consistency, reliability, measurement error, content validity (including face validity), construct validity (including structural validity, hypotheses testing and cross-cultural validity), criterion validity, responsiveness, and interpretability. The latter was not considered a measurement property. The panel also reached consensus on how these properties should be assessed. CONCLUSIONS: The resulting COSMIN checklist could be useful when selecting a measurement instrument, peer-reviewing a manuscript, designing or reporting a study on measurement properties, or for educational purposes. This study was financially supported by the EMGO Institute for Health and Care Research, VU University Medical Center, Amsterdam, and the Anna Foundation, Leiden, The Netherlands.

    Evaluation of the measurement properties of the Manchester foot pain and disability index

    BACKGROUND: The Manchester Foot Pain and Disability Index (MFPDI, 19 items) was developed to measure functional limitations, pain and appearance for patients with foot pain and is frequently used in both observational studies and randomised controlled trials. A Dutch version of the MFPDI was developed. The aims of this study were to evaluate all the measurement properties of the Dutch version of the MFPDI and to evaluate its comparability to the original version. METHOD: The MFPDI was translated into Dutch using a forward/backward translation process. The dimensionality was evaluated using exploratory and confirmatory factor analysis. Measurement properties were evaluated per subscale according to the COSMIN taxonomy, consisting of reliability (internal consistency, test-retest reliability and measurement error), validity (structural validity, content validity and cross-cultural validity comparing the Dutch version to the English version), responsiveness and interpretation. RESULTS: The questionnaire consists of three scales, measuring foot function, foot pain and perception. The reliability of the foot function scale is acceptable (Cronbach’s α > 0.7, ICC = 0.7, SEM = 2.2 on a 0-18 scale). The construct validity of the function and pain scales was confirmed, and only the pain scale contains one item with differential item functioning (DIF). The responsiveness of the function and pain scales is moderate when compared to anchor questions. CONCLUSION: Results using the Dutch MFPDI version can be compared to results using the original version. The foot function subscale (items 1-9) is a reliable and valid subscale. This study indicates that the use of the MFPDI as a longitudinal instrument might be problematic for measuring change in musculoskeletal foot pain due to moderate responsiveness.

    The COSMIN checklist for evaluating the methodological quality of studies on measurement properties: A clarification of its content

    Background: The COSMIN checklist (COnsensus-based Standards for the selection of health status Measurement INstruments) was developed in an international Delphi study to evaluate the methodological quality of studies on measurement properties of health-related patient reported outcomes (HR-PROs). In this paper, we explain our choices for the design requirements and preferred statistical methods for which no evidence is available in the literature or on which the Delphi panel members had substantial discussion. Methods: The issues described in this paper are a reflection of the Delphi process in which 43 panel members participated. Results: The topics discussed are internal consistency (relevance for reflective and formative models, and distinction with unidimensionality), content validity (judging relevance and comprehensiveness), hypotheses testing as an aspect of construct validity (specificity of hypotheses), criterion validity (relevance for PROs), and responsiveness (concept and relation to validity, and (in)appropriate measures). Conclusions: We expect that this paper will contribute to a better understanding of the rationale behind the items, thereby enhancing the acceptance and use of the COSMIN checklist.

    The size of the treatment effect: do patients and proxies agree?

    Background: This study examined whether MS patients and proxy respondents agreed on treatment-induced change in disease impact. This may be of interest in situations when patients suffer from limitations that interfere with reliable self-assessment, such as cognitive impairment. Methods: MS patients and proxies completed the Multiple Sclerosis Impact Scale (MSIS-29) before and after intravenous steroid treatment. Analyses focused on patient-proxy agreement between MSIS-29 change scores. Transition ratings were used to measure the patient's judgement of change and whether this change was reflected in the MSIS-29 change of patients and proxies. Receiver operating characteristic (ROC) analyses were also performed to examine the diagnostic properties of the MSIS-29 when completed by patients and proxies. Results: 42 patients and proxy respondents completed the MSIS-29 at baseline and follow-up. Patient-proxy differences between change scores on the physical and psychological MSIS-29 subscales were quite small, although large variability was found. The direction of mean change was in concordance with the transition ratings of the patients. Results of the ROC analyses of the MSIS-29 were similar when completed by patients (physical scale: AUC = 0.79, 95% CI: 0.65 - 0.93 and 0.66, 95% CI: 0.48 - 0.84 for the psychological scale) and proxies (physical scale: 0.80, 95% CI: 0.72 - 0.96 and 0.71, 95% CI: 0.56 - 0.87 for the psychological scale). Conclusion: Although the results need to be further explored in larger samples, they do point towards possible use of proxy respondents to assess patient-perceived treatment change at the group level.

    Reproducibility and responsiveness of the Symptom Severity Scale and the hand and finger function subscale of the Dutch arthritis impact measurement scales (Dutch-AIMS2-HFF) in primary care patients with wrist or hand problems

    BACKGROUND: To determine the clinimetric properties of two questionnaires assessing symptoms (Symptom Severity Scale) and physical functioning (hand and finger function subscale of the AIMS2) in a Dutch primary care population. METHODS: The first 84 participants in a 1-year follow-up study on the diagnosis and prognosis of hand and wrist problems completed the Symptom Severity Scale and the hand and finger function subscale of the Dutch-AIMS2 twice within 1 to 2 weeks. The data were used to assess test-retest reliability (ICC) and smallest detectable change (SDC, based on the standard error of measurement (SEM)). To assess responsiveness, changes in scores between baseline and the 3 month follow-up were related to an external criterion to estimate the minimal important change (MIC). We calculated the group size needed to detect the MIC beyond measurement error. RESULTS: The ICC for the Symptom Severity Scale was 0.68 (95% CI: 0.54–0.78). The SDC was 1.00 at individual level and 0.11 at group level, both on a 5-point scale. The MIC was 0.23, exceeding the SDC at group level. The group size required to detect a MIC beyond measurement error was 19 for the Symptom Severity Scale. The ICC for the hand and finger function subscale of the Dutch-AIMS2 was 0.62 (95% CI: 0.47–0.74). The SDC was 3.80 at individual level and 0.42 at group level, both on an 11-point scale. The MIC was 0.31, which was less than the SDC at group level. The group size required to detect a MIC beyond measurement error was 150. CONCLUSION: In our heterogeneous primary care population the Symptom Severity Scale was found to be a suitable instrument to assess the severity of symptoms, whereas the hand and finger function subscale of the Dutch-AIMS2 was less suitable for the measurement of physical functioning in patients with hand and wrist problems
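
The relations among SEM, SDC, MIC and group size in this abstract can be checked with a short sketch. It assumes the commonly used formulas SDC_individual = 1.96 × √2 × SEM, SDC_group = SDC_individual / √n, and required group size n = (SDC_individual / MIC)²; the abstract itself does not spell these out, and the function names are hypothetical.

```python
import math

def sdc_individual(sem, z=1.96):
    """Smallest detectable change for one patient (assumed: z * sqrt(2) * SEM)."""
    return z * math.sqrt(2) * sem

def sdc_group(sdc_ind, n):
    """Smallest detectable change for the mean of a group of n patients."""
    return sdc_ind / math.sqrt(n)

def group_size_for_mic(sdc_ind, mic):
    """Group size at which the group-level SDC equals the MIC."""
    return (sdc_ind / mic) ** 2

# Values reported in the abstract (n = 84 participants).
print(sdc_group(1.00, 84), group_size_for_mic(1.00, 0.23))  # Symptom Severity Scale
print(sdc_group(3.80, 84), group_size_for_mic(3.80, 0.31))  # Dutch-AIMS2-HFF
```

Under these assumptions the sketch gives roughly 0.11 and 18.9 for the Symptom Severity Scale and roughly 0.41 and 150.3 for the Dutch-AIMS2-HFF, close to the reported 0.11, 19, 0.42 and 150; small gaps are expected because the abstract's inputs are rounded.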

    Factorial validity and internal consistency of the PRAFAB questionnaire in women with stress urinary incontinence

    Background: To investigate the factor structure, dimensionality and construct validity of the (5-item) PRAFAB questionnaire score in women with stress urinary incontinence (stress UI). Methods: A cross-validation study design was used in a cohort of 279 patients who were randomly divided into Sample A or B. Sample A was used for preliminary exploratory factor analyses with promax rotation. Sample B provided an independent sample for confirming the proposed factor structure and item retention. Internal consistency, item-total and subscale correlations were determined to assess the dimensionality. Construct validity was assessed by comparing factor-based scale means by clinical characteristics based on known relationships. Results: Factor analyses resulted in a two-factor structure, with two subscales: items related to 'leakage severity' (protection, amount and frequency) and items related to its 'perceived symptom impact', that is, the consequences of stress UI on the patient's life (adjustment and body (or self) image). The patterns of the factor loadings were nearly identical for both study samples. The two constructed subscales demonstrated adequate internal consistency, with Cronbach's alphas of 0.78 and 0.84, respectively. Scale scores differed by clinical characteristics in line with expectations, supporting the construct validity of the scales. Conclusion: The findings suggest a two-factorial structure of the PRAFAB questionnaire. Furthermore, the results confirmed the internal consistency and construct validity demonstrated in our previous study. The best description of the factorial structure of the PRAFAB questionnaire was given by a two-factor solution, measuring the stress UI leakage severity items and the perceived symptom impact items. Future research will be necessary to replicate these findings in different settings, in other types of UI, and in non-white women and men.
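
Cronbach's alpha, the internal-consistency statistic reported for the two subscales, can be sketched from its standard definition, α = k/(k−1) × (1 − Σ item variances / variance of the total score). The scores below are hypothetical and are not PRAFAB data.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per item, all over the same respondents."""
    k = len(items)
    # Total score per respondent, summing across items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical three-item subscale answered by four respondents.
subscale = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [1, 3, 3, 5],
]
print(round(cronbach_alpha(subscale), 2))
```

The function needs at least two items and a non-constant total score; otherwise the expression is undefined.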

    Inter-rater agreement and reliability of the COSMIN (COnsensus-based Standards for the selection of health status Measurement Instruments) Checklist

    Background: The COSMIN checklist is a tool for evaluating the methodological quality of studies on measurement properties of health-related patient-reported outcomes. The aim of this study is to determine the inter-rater agreement and reliability of each item score of the COSMIN checklist (n = 114). Methods: 75 articles evaluating measurement properties were randomly selected from the bibliographic database compiled by the Patient-Reported Outcome Measurement Group, Oxford, UK. Raters were asked to assess the methodological quality of three articles, using the COSMIN checklist. In a one-way design, percentage agreement and intraclass kappa coefficients or quadratic-weighted kappa coefficients were calculated for each item. Results: 88 raters participated. Of the 75 selected articles, 26 articles were rated by four to six participants, and 49 by two or three participants. Overall, percentage agreement was appropriate (68% of items were above 80% agreement), and the kappa coefficients for the COSMIN items were low (61% were below 0.40, 6% were above 0.75). Reasons for low inter-rater agreement were the need for subjective judgement and raters being accustomed to different standards, terminology and definitions. Conclusions: Results indicated that raters often choose the same response option, but that it is difficult at item level to distinguish between articles. When using the COSMIN checklist in a systematic review, we recommend getting some training and experience, having it completed by two independent raters, and reaching consensus on one final rating. Instructions for using the checklist have been improved.
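
The combination of high percentage agreement and low kappa reported here can be illustrated with a small sketch. It uses Cohen's kappa for two raters as a simpler stand-in for the intraclass and quadratic-weighted kappa coefficients named in the abstract, and the ratings are hypothetical.

```python
from collections import Counter

def percent_agreement(r1, r2):
    """Share of articles on which two raters chose the same response option."""
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)

def cohen_kappa(r1, r2):
    """Cohen's kappa: agreement corrected for agreement expected by chance."""
    n = len(r1)
    observed = percent_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    expected = sum(c1[k] * c2[k] for k in set(c1) | set(c2)) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of one item on ten articles; almost all are "yes".
rater_1 = ["yes"] * 9 + ["no"]
rater_2 = ["yes"] * 8 + ["no", "yes"]
print(percent_agreement(rater_1, rater_2), cohen_kappa(rater_1, rater_2))
```

In this constructed case the raters agree on 8 of 10 articles, yet kappa is slightly below zero: when nearly every article receives the same option, chance agreement is already high, so the item barely distinguishes between articles.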