
    Standard setting: Comparison of two methods

    BACKGROUND: The outcome of assessments is determined by the standard-setting method used. There is a wide range of standard-setting methods, and the two used most extensively in undergraduate medical education in the UK are the norm-reference and the criterion-reference methods. The aims of the study were to compare these two standard-setting methods for a multiple-choice question examination and to estimate the test-retest and inter-rater reliability of the modified Angoff method. METHODS: The norm-reference method of standard setting (mean minus 1 SD) was applied to the 'raw' scores of 78 fourth-year medical students on a multiple-choice question (MCQ) examination. Two panels of raters also set the standard using the modified Angoff method for the same multiple-choice question paper on two occasions (6 months apart). We compared the pass/fail rates derived from the norm-reference and Angoff methods and also assessed the test-retest and inter-rater reliability of the modified Angoff method. RESULTS: The pass rate with the norm-reference method was 85% (66/78) and that with the Angoff method was 100% (78/78). The percentage agreement between the Angoff and norm-reference methods was 78% (95% CI 69%–87%). The modified Angoff method had an inter-rater reliability of 0.81–0.82 and a test-retest reliability of 0.59–0.74. CONCLUSION: There were significant differences in the outcomes of these two standard-setting methods, as shown by the difference in the proportion of candidates that passed and failed the assessment. The modified Angoff method was found to have good inter-rater reliability and moderate test-retest reliability.
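    The norm-reference rule described above amounts to a simple calculation: the cut score is the cohort mean minus one standard deviation, and candidates at or above it pass. The sketch below illustrates that rule only; the function names, the use of the sample standard deviation, and the example scores are assumptions made for illustration, not the study's data or code.

    ```python
    import statistics

    def norm_reference_cut_score(raw_scores):
        """Cut score under the 'mean minus 1 SD' norm-reference rule."""
        return statistics.mean(raw_scores) - statistics.stdev(raw_scores)

    def pass_rate(raw_scores, cut_score):
        """Proportion of candidates scoring at or above the cut score."""
        return sum(score >= cut_score for score in raw_scores) / len(raw_scores)

    # Illustrative use with made-up percentage scores (not the study's data)
    scores = [62, 70, 75, 58, 81, 66, 73, 69, 77, 64]
    cut = norm_reference_cut_score(scores)
    print(f"cut score = {cut:.1f}, pass rate = {pass_rate(scores, cut):.0%}")
    ```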

    Changes in standard of candidates taking the MRCP(UK) Part 1 examination, 1985 to 2002: Analysis of marker questions

    The maintenance of standards is a problem for postgraduate medical examinations, particularly if they use norm-referencing as the sole method of standard setting. In each of its diets, the MRCP(UK) Part 1 Examination includes a number of marker questions, which are unchanged from their use in a previous diet. This paper describes two complementary studies of marker questions for 52 diets of the MRCP(UK) Part 1 Examination over the years 1985 to 2001, to assess whether standards have changed.

    The reporting of statistics in medical educational studies: an observational study

    BACKGROUND: There is confusion in the medical literature as to whether statistics should be reported in survey studies that query an entire population, as is often done in educational studies. Our objective was to determine how often statistical tests have been reported in such articles in two prominent journals that publish these types of studies. METHODS: For this observational study, we used electronic searching to identify all survey studies published in Academic Medicine and the Journal of General Internal Medicine in which an entire population was studied. We tallied whether inferential statistics were used and whether p-values were reported. RESULTS: Eighty-four articles were found: 62 in Academic Medicine and 22 in the Journal of General Internal Medicine. Overall, 38 (45%) of the articles reported or stated that they calculated statistics: 35% in Academic Medicine and 73% in the Journal of General Internal Medicine. CONCLUSION: Educational enumeration surveys frequently report statistical tests. Until a better case can be made for doing so, a simple rule can be proffered to researchers. When studying an entire population (e.g., all program directors, all deans, and all medical schools) for factual information, do not perform statistical tests. Reporting percentages is sufficient and proper.

    In-training assessment using direct observation of single-patient encounters: a literature review

    We reviewed the literature on instruments for work-based assessment in single clinical encounters, such as the mini-clinical evaluation exercise (mini-CEX), and examined differences between these instruments in characteristics, feasibility, reliability, validity and educational effect. A PubMed search of the literature published before 8 January 2009 yielded 39 articles dealing with 18 different assessment instruments. One researcher extracted data on the characteristics of the instruments and two researchers extracted data on feasibility, reliability, validity and educational effect. Instruments are predominantly formative. Feasibility is generally deemed good; assessor training is rarely provided but is considered crucial for successful implementation. Acceptable reliability can be achieved with 10 encounters. The validity of many instruments has not been investigated, but the validity of the mini-CEX and the ‘clinical evaluation exercise’ is supported by strong and significant correlations with other valid assessment instruments. The evidence from the few studies on educational effects is not very convincing. The reports on clinical assessment instruments for single work-based encounters are generally positive, but supporting evidence is sparse. Feasibility of the instruments seems to be good and reliability requires a minimum of 10 encounters, but no clear conclusions emerge on other aspects. Studies on assessor and learner training, and studies examining effects beyond ‘happiness data’, are badly needed.

    The reliability of in-training assessment when performance improvement is taken into account

    During in-training assessment, students are frequently assessed over a longer period of time, and it can therefore be expected that their performance will improve. We studied whether there is a measurable performance improvement when students are assessed over an extended period of time, and how this improvement affects the reliability of the overall judgement. In-training assessment results were obtained from 104 students on rotation at our university hospital or at one of the six affiliated hospitals. Generalisability theory was used in combination with multilevel analysis to obtain reliability coefficients and to estimate the number of assessments needed for a reliable overall judgement, both including and excluding performance improvement. Students’ clinical performance ratings improved significantly, from a mean of 7.6 at the start to a mean of 7.8 at the end of their clerkship. When performance improvement was taken into account, reliability coefficients were higher, and the number of assessments needed to achieve a reliability of 0.80 or higher decreased from 17 to 11. Therefore, performance improvement should be considered when studying the reliability of in-training assessment.
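    To make the reliability target concrete: in a single-facet generalisability (decision-study) design, the reliability of a mean over n assessments is σ²_person / (σ²_person + σ²_residual / n), so the number of assessments needed for a target reliability follows directly from the variance components. The sketch below shows that standard calculation with hypothetical variance components; the study's own analysis combined generalisability theory with multilevel modelling and additionally modelled performance improvement, so this is only an illustration of the underlying idea, not the paper's method.

    ```python
    import math

    def assessments_needed(var_person, var_residual, target=0.80):
        """Smallest n with var_person / (var_person + var_residual / n) >= target."""
        # Rearranging the single-facet decision-study formula for n
        n = (target * var_residual) / (var_person * (1 - target))
        return math.ceil(n)

    # Hypothetical variance components, not taken from the study
    print(assessments_needed(var_person=0.10, var_residual=0.45))  # -> 18
    ```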

    Competency-based evaluation tools for integrative medicine training in family medicine residency: a pilot study

    BACKGROUND: As more integrative medicine content is incorporated into conventional family medicine teaching, the need for effective evaluation strategies grows. Through the Integrative Family Medicine (IFM) program, a six-site pilot program of a four-year residency training model combining integrative medicine and family medicine training, we developed and tested a set of competency-based evaluation tools to assess residents' skills in integrative medicine history-taking and treatment planning. This paper presents the results from the implementation of direct observation and treatment plan evaluation tools, as well as the results of two Objective Structured Clinical Examinations (OSCEs) developed for the program. METHODS: The direct observation (DO) and treatment plan (TP) evaluation tools developed for the IFM program were implemented by faculty at each of the six sites during the PGY-4 year (n = 11 for DO and n = 8 for TP). OSCE I was first implemented in 2005 (n = 6), revised, and then implemented with a second class of IFM participants in 2006 (n = 7). OSCE II was implemented in fall 2005 with only one class of IFM participants (n = 6). Data from the initial implementation of these tools are described using descriptive statistics. RESULTS: Results from the implementation of these tools at the IFM sites suggest that we need more emphasis in our curriculum on incorporating spirituality into history-taking and treatment planning, and more training for IFM residents on effective assessment of readiness for change and on strategies for delivering integrative medicine treatment recommendations. Focusing our OSCE assessment more narrowly on integrative medicine history-taking skills was much more effective in delineating strengths and weaknesses in our residents' performance than using the OSCE for both integrative and more basic communication competencies. CONCLUSION: As these tools are refined further, they will be of value both in improving our teaching in the IFM program and as competency-based evaluation resources for the expanding number of family medicine residency programs incorporating integrative medicine into their curricula. The next stages of work on these instruments will involve establishing inter-rater reliability and defining more clearly the specific behaviors that we believe establish competency in the integrative medicine skills defined for the program.

    Relationship Between Peer Assessment During Medical School, Dean’s Letter Rankings, and Ratings by Internship Directors

    BACKGROUND: It is not known to what extent the dean’s letter (medical student performance evaluation [MSPE]) reflects peer-assessed work habits (WH) skills and/or interpersonal attributes (IA) of students. OBJECTIVE: To compare peer ratings of WH and IA of second- and third-year medical students with later MSPE rankings and with ratings by internship program directors. DESIGN AND PARTICIPANTS: Participants were 281 medical students from the classes of 2004, 2005, and 2006 at a private medical school in the northeastern United States who had participated in peer assessment exercises in the second and third years of medical school. For students from the class of 2004, we also compared peer assessment data against later evaluations obtained from internship program directors. RESULTS: Peer-assessed WH were predictive of later MSPE groups in both the second (F = 44.90, P < .001) and third (F = 29.54, P < .001) years of medical school. Interpersonal attributes were not related to MSPE rankings in either year. MSPE rankings for a majority of students were predictable from peer-assessed WH scores. Internship directors’ ratings were significantly related to second- and third-year peer-assessed WH scores (r = .32 [P = .15] and r = .43 [P = .004], respectively), but not to peer-assessed IA. CONCLUSIONS: Peer assessment of WH, as early as the second year of medical school, can predict later MSPE rankings and internship performance. Although peer-assessed IA can be measured reliably, they are unrelated to either outcome.

    Instruments to measure the ability to self-reflect: A systematic review of evidence from workplace and educational settings including health care

    Introduction: Self-reflection has become recognised as a core skill in dental education, and the ability to self-reflect is also valued and measured within several other professions. This review appraises the evidence for instruments available to measure the self-reflective ability of adults studying or working within any setting, not just health care. Materials and Methods: A systematic review was conducted of 20 electronic databases (including Medline, ERIC, CINAHL and Business Source Complete) from 1975 to 2017, supplemented by citation searches. Data were extracted from each study, and the studies were graded against quality indicators by at least two independent reviewers using a coding sheet. Reviewers completed a utility analysis of the assessment instruments described within the included studies, appraising their reported reliability, validity, educational impact, acceptability and cost. Results: A total of 131 studies met the inclusion criteria. Eighteen were judged to provide higher-quality evidence for the review, and three broad types of instrument were identified, namely: rubrics (or scoring guides), self-reported scales and observed behaviour. Conclusions: Three types of instrument were identified to assess the ability to self-reflect. It was not possible to recommend a single most effective instrument because the criteria necessary for a full utility analysis were under-reported for each. The use of more than one instrument may therefore be appropriate, depending on acceptability to the faculty, assessor and student, and on cost. Future research should report on the utility of assessment instruments and provide guidance on what constitutes thresholds of acceptable or unacceptable ability to self-reflect, and how this should be managed.