Pubmed for Handhelds
Title: Simulation-based examinations in physician assistant education: A comparison of two standard-setting methods.
Author: Carlson J, Tomkowiak J, Knott P.
Journal: J Physician Assist Educ; 2010; 21(2):7-14.
PubMed ID: 21141047.

Abstract:
PURPOSE: This study explored the reliability of two simple standard-setting methods used to set passing standards for a standardized patient (SP) exam in physician assistant (PA) education.
METHODS: Fifty-four second-year PA students participated in a multistation SP-based clinical skills exam. Cut scores were set using two methods: a panel of PA faculty set cut scores using the Angoff method, and a modified Borderline Group method set cut scores from SP global ratings verified by faculty review. Inter-rater reliability between judges was evaluated using the kappa coefficient (k) for the Angoff method and the intraclass correlation coefficient (ICC) for the Borderline Group method.
RESULTS: The Borderline Group method set an overall cut score for the exam of 76% (95% CI +/- 5), and the Angoff method set a cut score of 62% (95% CI +/- 9). Both methods demonstrated sufficient inter-rater reliability (k > 0.60, ICC > 0.70; both significant at p < 0.05), although one case (preop history and physical) showed poor inter-rater reliability between judges using the Borderline Group method.
DISCUSSION: The Borderline Group method offered a slightly more reliable cut score than the standard set by the Angoff method, but it was more challenging to implement, and one case showed poor inter-rater reliability with it. Using SPs to complete global borderline ratings offers one way to make the Borderline Group method more feasible, but it requires a high degree of initial rater calibration and periodic measurement of inter-rater reliability between faculty and SPs.
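The two standard-setting methods compared in the study reduce to simple computations: under the Angoff method, the cut score is the mean of judges' per-item estimates of how likely a borderline examinee is to succeed; under the Borderline Group method, it is the mean exam score of examinees given a "borderline" global rating. The sketch below illustrates both calculations. All function names and data are hypothetical illustrations, not values or procedures taken from the study.

```python
# Minimal sketch of the two standard-setting methods named in the abstract.
# All numbers and names here are hypothetical, not data from the study.

from statistics import mean

def angoff_cut_score(judge_estimates):
    """Angoff method: each judge estimates, for every exam item, the
    probability that a borderline (minimally competent) examinee would
    succeed on it. The cut score is the mean of those estimates."""
    # judge_estimates: one inner list per judge, holding per-item
    # probabilities in [0, 1].
    per_judge = [mean(est) for est in judge_estimates]
    return mean(per_judge) * 100  # express as a percentage

def borderline_group_cut_score(scores, global_ratings):
    """Borderline Group method: raters (in the study, SPs verified by
    faculty) assign each examinee a global rating; the cut score is the
    mean exam score of examinees rated 'borderline'."""
    borderline = [s for s, r in zip(scores, global_ratings) if r == "borderline"]
    return mean(borderline)

# Hypothetical data: three Angoff judges rating a five-item station.
judges = [
    [0.70, 0.55, 0.60, 0.65, 0.50],
    [0.75, 0.60, 0.55, 0.70, 0.45],
    [0.65, 0.50, 0.65, 0.60, 0.55],
]

# Hypothetical examinee exam scores (%) and SP global ratings.
exam_scores = [82, 74, 77, 68, 90, 75]
sp_ratings  = ["pass", "borderline", "borderline", "fail", "pass", "borderline"]

print(f"Angoff cut score:           {angoff_cut_score(judges):.1f}%")
print(f"Borderline Group cut score: {borderline_group_cut_score(exam_scores, sp_ratings):.1f}%")
```

The sketch also suggests why the abstract reports the Borderline Group method as harder to implement: it requires collecting calibrated global ratings for every examinee, whereas the Angoff method needs only a one-time panel of judge estimates.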