Addressing cognitive bias in medical language models
There is increasing interest in the application of large language models (LLMs)
to the medical field, in part because of their impressive performance on
medical exam questions. While promising, exam questions do not reflect the
complexity of real patient-doctor interactions. In reality, physicians'
decisions are shaped by many complex factors, such as patient compliance,
personal experience, ethical beliefs, and cognitive bias. As a step toward
understanding this, we hypothesize that when LLMs are confronted with clinical
questions containing cognitive biases, they will yield significantly less
accurate responses than they do to the same questions presented without such
biases. In this study, we developed BiasMedQA, a benchmark for evaluating
cognitive biases in LLMs applied to medical tasks. Using BiasMedQA, we evaluated
six LLMs, namely GPT-4, Mixtral-8x7B, GPT-3.5, PaLM-2, Llama 2 70B-chat, and
the medically specialized PMC Llama 13B. We tested these models on 1,273
questions from the US Medical Licensing Exam (USMLE) Steps 1, 2, and 3,
modified to replicate common clinically relevant cognitive biases. Our analysis
revealed varying effects of these biases on the LLMs, with GPT-4 standing out
for its resilience to bias, in contrast to Llama 2 70B-chat and PMC Llama 13B,
which were disproportionately affected. Our findings
highlight the critical need for bias mitigation in the development of medical
LLMs, pointing towards safer and more reliable applications in healthcare.
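
To make the evaluation protocol concrete, the comparison can be pictured as a paired accuracy measurement: each model answers the same multiple-choice questions with and without a bias-inducing preamble. The sketch below is a minimal hypothetical illustration, not the released BiasMedQA code; the bias templates, the question schema (`stem`, `options`, `answer`, `distractor`), and the `query_model` wrapper are all assumptions made for the example.

```python
# Hypothetical bias-inducing preambles, loosely modeled on the kinds of
# clinical cognitive biases the abstract describes. Not the benchmark's
# actual templates.
BIAS_TEMPLATES = {
    "none": "",
    "recency": "You recently saw a similar patient whose diagnosis was {distractor}. ",
    "confirmation": "A senior colleague is convinced the diagnosis is {distractor}. ",
}

def build_prompt(question: dict, bias: str = "none") -> str:
    """Optionally prepend a bias-inducing sentence to a USMLE-style question."""
    preamble = BIAS_TEMPLATES[bias].format(distractor=question.get("distractor", ""))
    options = "\n".join(f"{k}) {v}" for k, v in sorted(question["options"].items()))
    return (f"{preamble}{question['stem']}\n{options}\n"
            "Respond with the letter of the best answer.")

def accuracy(query_model, questions: list, bias: str = "none") -> float:
    """Fraction of questions answered correctly under a given bias condition.

    query_model is an assumed wrapper around an LLM API call that takes a
    prompt string and returns the model's text reply.
    """
    hits = 0
    for q in questions:
        reply = query_model(build_prompt(q, bias))
        hits += reply.strip().upper().startswith(q["answer"])
    return hits / len(questions)
```

Under this framing, a model's susceptibility to a given bias would be the paired accuracy drop, e.g. `accuracy(model, qs, "none") - accuracy(model, qs, "recency")`, computed over the same question set so the only difference is the injected preamble.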
NHANES compiled data
NHANES data is publicly available. The collated input data for the manuscript 'Predictive modeling of lean body mass, appendicular lean mass, and appendicular skeletal muscle mass using machine learning techniques: a comprehensive analysis utilizing NHANES data and the Look AHEAD study' is available here.