Sources of Measurement Error in an ECG Examination: Implications for Performance-Based Assessments

By David J. Solomon Ph.D. and Gary Ferenchick MD


Objective: To assess the sources of measurement error in an electrocardiogram (ECG) interpretation examination given in a third-year internal medicine clerkship.

Design: Three successive generalizability studies were conducted. 1) Multiple faculty rated student responses to a previously administered exam. 2) The rating criteria were revised and study 1 was repeated. 3) The examination was converted into an extended matching format that included multiple cases with the same underlying cardiac problem.

Results: The discrepancies among raters (main effects and interactions) were dwarfed by the error associated with case specificity. The largest source of disagreement among raters was the rating of student errors of commission rather than student errors of omission. Revisions to the rating criteria may have increased inter-rater reliability slightly; however, because of case specificity, they had little impact on the overall reliability of the exam. The third study indicated that most of the variability in student performance across cases lay between cases of the same type of cardiac problem rather than between different types of cardiac problems.

Conclusions: Case specificity was the overwhelming source of measurement error. The variation among cases came mainly from discrepancies in performance between examples of the same cardiac problem rather than from differences in performance across different types of cardiac problems. This suggests that a large number of cases must be included even if the goal is to assess performance on only a few types of cardiac problems.
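
The conclusion about the number of cases can be illustrated with a decision-study style projection. The sketch below is not taken from the paper: it assumes a design in which students are crossed with cases nested within problem types, uses the standard relative-error formula for that design, and plugs in made-up variance components (the names `var_p`, `var_pt`, `var_pc_t` and all numeric values are hypothetical) chosen only to mimic a situation where case-within-type variance dominates.

```python
# Hypothetical D-study projection for a students x (cases : problem types) design.
# All variance components below are illustrative assumptions, not results from the study.

def g_coefficient(var_p, var_pt, var_pc_t, n_types, n_cases_per_type):
    """Generalizability coefficient for relative decisions.

    var_p     : variance among students (the signal)
    var_pt    : student x problem-type interaction variance
    var_pc_t  : student x case-within-type variance, confounded with residual error
    """
    if n_types < 1 or n_cases_per_type < 1:
        raise ValueError("n_types and n_cases_per_type must be at least 1")
    # Relative error shrinks as interaction variance is averaged over more
    # problem types and over more cases within each type.
    relative_error = var_pt / n_types + var_pc_t / (n_types * n_cases_per_type)
    return var_p / (var_p + relative_error)


# Assumed components: case-within-type variance is much larger than type variance.
HYPOTHETICAL_COMPONENTS = {"var_p": 0.05, "var_pt": 0.02, "var_pc_t": 0.40}


def project_reliability(n_types, case_counts, components=HYPOTHETICAL_COMPONENTS):
    """Return (cases per type, projected coefficient) pairs for a fixed number of types."""
    return [
        (n, g_coefficient(n_types=n_types, n_cases_per_type=n, **components))
        for n in case_counts
    ]


if __name__ == "__main__":
    for n_cases, g in project_reliability(n_types=3, case_counts=[1, 2, 5, 10, 20]):
        print(f"3 problem types, {n_cases:>2} cases each: G = {g:.2f}")
```

Under these assumed components, holding the number of problem types fixed and adding cases within each type raises the projected coefficient, which is the pattern the conclusion describes. Real projections would require the variance components estimated in the studies themselves.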

Topics: Behavioral Analysis
Year: 2003