
    A Reliability-Generalization Study of Journal Peer Reviews: A Multilevel Meta-Analysis of Inter-Rater Reliability and Its Determinants

    Background: This paper presents the first meta-analysis of the inter-rater reliability (IRR) of journal peer reviews. IRR is defined as the extent to which two or more independent reviews of the same scientific document agree. Methodology/Principal Findings: Altogether, 70 reliability coefficients (Cohen's Kappa, intra-class correlation [ICC], and Pearson product-moment correlation [r]) from 48 studies were taken into account in the meta-analysis. The studies were based on a total of 19,443 manuscripts; on average, each study had a sample size of 311 manuscripts (minimum: 28, maximum: 1,983). The results of the meta-analysis confirmed the findings of the narrative literature reviews published to date: the level of IRR (mean ICC/r² = .34, mean Cohen's Kappa = .17) was low. To explain the study-to-study variation of the IRR coefficients, meta-regression analyses were calculated using seven covariates. Two covariates emerged in the meta-regression analyses as statistically significant for achieving approximate homogeneity of the intra-class correlations: first, the more manuscripts a study is based on, the smaller the reported IRR coefficients; second, studies that reported information on the rating system used by reviewers were associated with smaller IRR coefficients than studies that did not. Conclusions/Significance: Studies that report a high level of IRR are to be considered less credible than those with a low level of IRR.
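    As an illustration of the kind of IRR coefficient this meta-analysis pools, the following minimal Python sketch computes Cohen's Kappa for two reviewers' accept/reject decisions on the same set of manuscripts. The reviewer names and decision data below are invented for illustration and are not taken from the paper.

        # Minimal sketch: Cohen's Kappa, the chance-corrected agreement
        # statistic the meta-analysis reports (mean Kappa = .17).
        from collections import Counter

        def cohens_kappa(rater_a, rater_b):
            """Chance-corrected agreement between two raters on categorical labels."""
            assert len(rater_a) == len(rater_b)
            n = len(rater_a)
            # Observed proportion of agreement.
            p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
            # Expected agreement by chance, from each rater's marginal frequencies.
            freq_a, freq_b = Counter(rater_a), Counter(rater_b)
            labels = set(rater_a) | set(rater_b)
            p_e = sum((freq_a[l] / n) * (freq_b[l] / n) for l in labels)
            return (p_o - p_e) / (1 - p_e)

        # Hypothetical decisions on ten manuscripts (1 = accept, 0 = reject).
        reviewer_1 = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
        reviewer_2 = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]
        print(f"kappa = {cohens_kappa(reviewer_1, reviewer_2):.2f}")  # kappa = 0.40

    Here the two reviewers agree on 7 of 10 manuscripts (p_o = .70), but half that agreement is expected by chance (p_e = .50), so Kappa comes out at .40; values near the paper's pooled mean of .17 indicate agreement only slightly above chance.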

    Validity of a structured method of selecting abstracts for a plastic surgical scientific meeting

    In 1999, the European Association of Plastic Surgeons adopted a structured method to assess and select the abstracts submitted for its yearly scientific meeting. The two criteria used to evaluate whether such a selection method is accurate are reliability and validity. The authors previously established this method to be reliable on the basis of a prospective evaluation of the selection process used for the 2000 meeting of the European Association of Plastic Surgeons. It is more difficult to assess the validity of this method because there is no objective standard of the quality of a scientific abstract against which the accuracy of selection can be assessed. This study statistically evaluated the accuracy of using the meeting participants' assessment of presentations made during the meeting as the criterion standard for abstract selection, on the basis of data obtained from the 2002 selection process. The authors evaluated the interobserver repeatability, among five meeting participants, of selecting the best presentations; the validity of the method of abstract selection after this criterion standard had been established; and the validity of reviewers' ratings of abstracts as indicators of the scientific value of the actual presentations. The authors conclude that the assessment of platform presentations at a plastic surgical meeting is reliable. Accepting this assessment as the criterion standard, however, they could not prove the validity of their selection method or the validity of the reviewers' rating of abstracts as an indicator of the scientific value of the actual presentations.
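    To make the validity check described above concrete, here is a minimal Python sketch that correlates reviewers' abstract scores with meeting participants' ratings of the corresponding presentations using the Pearson product-moment correlation. All scores and variable names below are hypothetical placeholders, not the study's data, and this is only one plausible way to operationalize the comparison.

        # Sketch of a criterion-standard validity check: how strongly do
        # abstract scores predict ratings of the delivered presentations?
        import math

        def pearson_r(x, y):
            """Pearson product-moment correlation between two score lists."""
            n = len(x)
            mx, my = sum(x) / n, sum(y) / n
            cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
            sx = math.sqrt(sum((a - mx) ** 2 for a in x))
            sy = math.sqrt(sum((b - my) ** 2 for b in y))
            return cov / (sx * sy)

        # Hypothetical mean scores for eight abstracts and their presentations.
        abstract_scores     = [6.2, 7.1, 5.0, 8.3, 6.8, 7.5, 4.9, 6.0]
        presentation_scores = [5.8, 6.0, 6.5, 7.9, 5.5, 7.0, 6.1, 5.2]
        print(f"r = {pearson_r(abstract_scores, presentation_scores):.2f}")

    A correlation near zero under this setup would mirror the authors' finding: even with a reliable criterion standard, abstract ratings may fail to predict the scientific value of the actual presentations.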