8 research outputs found

    Beyond the Hype: Assessing the Performance, Trustworthiness, and Clinical Suitability of GPT3.5

    The use of large language models (LLMs) in healthcare is gaining popularity, but their practicality and safety in clinical settings have not been thoroughly assessed. In high-stakes environments like medical settings, trust and safety are critical issues for LLMs. To address these concerns, we present an approach to evaluate the performance and trustworthiness of a GPT3.5 model for medical image protocol assignment. We compare it with a fine-tuned BERT model and a radiologist. In addition, we have a radiologist review the GPT3.5 output to evaluate its decision-making process. Our evaluation dataset consists of 4,700 physician entries across 11 imaging protocol classes spanning the entire head. Our findings suggest that GPT3.5 performs worse than both BERT and a radiologist. However, GPT3.5 outperforms BERT in explaining its decisions, in detecting relevant word indicators, and in model calibration. Furthermore, by analyzing the explanations GPT3.5 gives for its misclassifications, we reveal systematic errors that need to be resolved to enhance its safety and suitability for clinical use.

    Autonomous sweat extraction and analysis applied to cystic fibrosis and glucose monitoring using a fully integrated wearable platform

    Perspiration-based wearable biosensors facilitate continuous monitoring of individuals’ health states with real-time and molecular-level insight. The inherent inaccessibility of sweat in sedentary individuals in large volumes (≥10 µL) for on-demand and in situ analysis has limited our ability to capitalize on this noninvasive and rich source of information. A wearable and miniaturized iontophoresis interface is an excellent solution to overcome this barrier. The iontophoresis process involves delivery of stimulating agonists to the sweat glands with the aid of an electrical current. The challenge remains in devising an iontophoresis interface that can extract a sufficient amount of sweat for robust sensing, without electrode corrosion and without burning or causing discomfort in subjects. Here, we overcame this challenge by realizing an electrochemically enhanced iontophoresis interface, integrated in a wearable sweat analysis platform. This interface can be programmed to induce sweat with various secretion profiles for real-time analysis, a capability which can be exploited to advance our knowledge of sweat gland physiology and the secretion process. To demonstrate the clinical value of our platform, human subject studies were performed in the context of cystic fibrosis diagnosis and a preliminary investigation of the blood/sweat glucose correlation. With our platform, we detected the elevated sweat electrolyte content of cystic fibrosis patients compared with that of healthy control subjects. Furthermore, our results indicate that oral glucose consumption in the fasting state is followed by increased glucose levels in both sweat and blood. Our solution opens the possibility for a broad range of noninvasive diagnostic and general population health monitoring applications.

    Exploring the performance and explainability of fine-tuned BERT models for neuroradiology protocol assignment

    Background: Deep learning has demonstrated significant advancements across various domains. However, its implementation in specialized areas, such as medical settings, is still approached with caution. In these high-stakes environments, understanding the model's decision-making process is critical. This study assesses the performance of different pre-trained Bidirectional Encoder Representations from Transformers (BERT) models and examines their decision-making within the context of medical image protocol assignment. Methods: Four different pre-trained BERT models (BERT, BioBERT, ClinicalBERT, RoBERTa) were fine-tuned for the medical image protocol classification task. Word importance was measured by attributing the classification output to every word using a gradient-based method. Subsequently, a trained radiologist reviewed the resulting word importance scores to assess the model’s decision-making process relative to human reasoning. Results: The BERT model came close to human performance on our test set. The BERT model successfully identified relevant words indicative of the target protocol. Analysis of important words in misclassifications revealed potential systematic errors in the model. Conclusions: The BERT model shows promise in medical image protocol assignment by reaching near-human-level performance and identifying key words effectively. The detection of systematic errors paves the way for further refinements to enhance its safety and utility in clinical settings.