10 research outputs found

    The Triage Capability of Laypersons: Retrospective Exploratory Analysis

    Background: Although medical decision-making may be thought of as a task involving health professionals, many decisions, including critical health-related decisions, are made by laypersons alone. Specifically, as the first step to most care episodes, it is the patient who determines whether and where to seek health care (triage). Overcautious self-assessments (ie, overtriaging) may lead to overutilization of health care facilities and overcrowded emergency departments, whereas imprudent decisions (ie, undertriaging) constitute a risk to the patient's health. Recently, patient-facing decision support systems, commonly known as symptom checkers, have been developed to assist laypersons in these decisions. Objective: The purpose of this study is to identify factors influencing laypersons' ability to self-triage and their risk averseness in self-triage decisions. Methods: We analyzed publicly available data on 91 laypersons appraising 45 short fictitious patient descriptions (case vignettes; N=4095 appraisals). Using signal detection theory and descriptive and inferential statistics, we explored whether the type of medical decision laypersons face, their confidence in their decision, and sociodemographic factors influence their triage accuracy and the type of errors they make. We distinguished between 2 decisions: whether emergency care was required (decision 1) and whether self-care was sufficient (decision 2). Results: The accuracy of detecting emergencies (decision 1) was higher (mean 82.2%, SD 5.9%) than that of deciding whether any type of medical care is required (decision 2, mean 75.9%, SD 5.25%; t(90)=8.4; [...] t(89)=3.7; P<.001; d=0.39). Conclusions: Our study suggests that laypersons are overcautious in deciding whether they require medical care at all, but they miss identifying a considerable portion of emergencies. Our results further indicate that women are more risk averse than men in both types of decisions. 
Layperson participants made most triage errors when they were certain of their own appraisal. Thus, they might not follow or even seek advice (eg, from symptom checkers) in most instances where advice would be useful.
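
The abstract above mentions signal detection theory without naming the measures used. A minimal sketch, assuming the standard measures d' (discriminability) and c (response criterion) and hypothetical counts for a single participant, shows how sensitivity, specificity, and a rater's bias can be separated for decision 1 (emergency versus no emergency). The function name and the counts are illustrative and not taken from the study.

```python
from statistics import NormalDist


def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Return sensitivity, specificity, d' and criterion c for one rater."""
    # Log-linear correction (add 0.5 per cell) avoids infinite z-scores
    # when a raw rate is exactly 0 or 1.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return {
        "sensitivity": hits / (hits + misses),
        "specificity": correct_rejections / (false_alarms + correct_rejections),
        "d_prime": z(hit_rate) - z(fa_rate),
        "criterion": -0.5 * (z(hit_rate) + z(fa_rate)),
    }


# Hypothetical counts: 15 emergency vignettes and 30 non-emergency vignettes.
result = sdt_measures(hits=10, misses=5, false_alarms=3, correct_rejections=27)
print(result)
```

In this sketch a positive criterion means the rater leans toward answering "no emergency" (undertriage), and a negative criterion means the rater leans toward overtriage.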

    Interactive Versus Static Decision Support Tools for COVID-19: Randomized Controlled Trial

    Background: During the COVID-19 pandemic, medical laypersons with symptoms indicative of a COVID-19 infection commonly sought guidance on whether and where to find medical care. Numerous web-based decision support tools (DSTs) have been developed, both by public and commercial stakeholders, to assist their decision making. Though most of the DSTs’ underlying algorithms are similar and simple decision trees, their mode of presentation differs: some DSTs present a static flowchart, while others are designed as a conversational agent, guiding the user through the decision tree’s nodes step-by-step in an interactive manner. Objective: This study aims to investigate whether interactive DSTs provide greater decision support than noninteractive (ie, static) flowcharts. Methods: We developed mock interfaces for 2 DSTs (1 static, 1 interactive), mimicking patient-facing, freely available DSTs for COVID-19-related self-assessment. Their underlying algorithm was identical and based on the Centers for Disease Control and Prevention’s guidelines. We recruited adult US residents online in November 2020. Participants appraised the appropriate social and care-seeking behavior for 7 fictitious descriptions of patients (case vignettes). Participants in the experimental groups received either the static or the interactive mock DST as support, while the control group appraised the case vignettes unsupported. We determined participants’ accuracy, decision certainty (after deciding), and mental effort to measure the quality of decision support. Participants’ ratings of the DSTs’ usefulness, ease of use, trust, and future intention to use the tools served as measures to analyze differences in participants’ perception of the tools. We used ANOVAs and t tests to assess statistical significance. Results: Our survey yielded 196 responses. 
The mean number of correct assessments was higher in the intervention groups (interactive DST group: mean 11.71, SD 2.37; static DST group: mean 11.45, SD 2.48) than in the control group (mean 10.17, SD 2.00). Decisional certainty was significantly higher in the experimental groups (interactive DST group: mean 80.7%, SD 14.1%; static DST group: mean 80.5%, SD 15.8%) compared to the control group (mean 65.8%, SD 20.8%). The differences in these measures proved statistically significant in t tests comparing each intervention group with the control group (P<.001 for all 4 t tests). ANOVA detected no significant differences regarding mental effort between the 3 study groups. Differences between the 2 intervention groups were of small effect sizes and nonsignificant for all 3 measures of the quality of decision support and most measures of participants’ perception of the DSTs. Conclusions: When the decision space is limited, as is the case in common COVID-19 self-assessment DSTs, static flowcharts might prove as beneficial in enhancing decision quality as interactive tools. Given that static flowcharts reveal the underlying decision algorithm more transparently and require less effort to develop, they might prove more efficient in providing guidance to the public. Further research should validate our findings on different use cases, elaborate on the trade-off between transparency and convenience in DSTs, and investigate whether subgroups of users benefit more from 1 type of user interface than the other. Trial Registration: Deutsches Register Klinischer Studien DRKS00028136; https://tinyurl.com/4bcfausx (retrospectively registered).

    Determinants of Laypersons’ Trust in Medical Decision Aids: Randomized Controlled Trial

    Background: Symptom checker apps are patient-facing decision support systems aimed at providing advice to laypersons on whether, where, and how to seek health care (disposition advice). Such advice can improve laypersons' self-assessment and ultimately improve medical outcomes. Past research has mainly focused on the accuracy of symptom checker apps' suggestions. To support decision-making, such apps need to provide not only accurate but also trustworthy advice. To date, only few studies have addressed the question of the extent to which laypersons trust symptom checker app advice or the factors that moderate their trust. Studies on general decision support systems have shown that framing automated systems (anthropomorphic or emphasizing expertise), for example, by using icons symbolizing artificial intelligence (AI), affects users' trust. Objective: This study aims to identify the factors influencing laypersons' trust in the advice provided by symptom checker apps. Primarily, we investigated whether designs using anthropomorphic framing or framing the app as an AI increases users' trust compared with no such framing. Methods: Through a web-based survey, we recruited 494 US residents with no professional medical training. The participants had to first appraise the urgency of a fictitious patient description (case vignette). Subsequently, a decision aid (mock symptom checker app) provided disposition advice contradicting the participants' appraisal, and they had to subsequently reappraise the vignette. Participants were randomized into 3 groups: 2 experimental groups using visual framing (anthropomorphic, 160/494, 32.4%, vs AI, 161/494, 32.6%) and a neutral group without such framing (173/494, 35%). Results: Most participants (384/494, 77.7%) followed the decision aid's advice, regardless of its urgency level. 
Neither anthropomorphic framing (odds ratio 1.120, 95% CI 0.664-1.897) nor framing as AI (odds ratio 0.942, 95% CI 0.565-1.570) increased behavioral or subjective trust (P=.99) compared with the no-frame condition. Even participants who were extremely certain in their own decisions (ie, 100% certain) commonly changed them in favor of the symptom checker's advice (19/34, 56%). Propensity to trust and eHealth literacy were associated with increased subjective trust in the symptom checker (propensity to trust b=0.25; eHealth literacy b=0.2), whereas sociodemographic variables showed no such link with either subjective or behavioral trust. Conclusions: Contrary to our expectation, neither the anthropomorphic framing nor the emphasis on AI increased trust in symptom checker advice compared with that of a neutral control condition. However, independent of the interface, most participants trusted the mock app's advice, even when they were very certain of their own assessment. Thus, the question arises as to whether laypersons use such symptom checkers as substitutes rather than as aids in their own decision-making. With trust in symptom checkers already high at baseline, the benefit of symptom checkers depends on interface designs that enable users to adequately calibrate their trust levels during usage.
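
The odds ratios and confidence intervals reported above can be read with the help of a small worked example. The sketch below computes an odds ratio with a Wald 95% CI from a 2x2 table. The abstract does not say how its estimates were derived (a regression model is equally plausible), so this is an assumption for illustration only. The group sizes 160 and 173 come from the abstract; the split into participants who did or did not follow the advice is hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI for a 2x2 table.

    a: followed advice, framed group      b: did not follow, framed group
    c: followed advice, neutral group     d: did not follow, neutral group
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Wald approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical split: 126 of 160 (anthropomorphic) vs 133 of 173 (neutral).
or_, lo, hi = odds_ratio_ci(126, 34, 133, 40)
print(f"OR={or_:.3f}, 95% CI {lo:.3f}-{hi:.3f}")
```

An interval that contains 1, as in this example and in the abstract, is consistent with no difference between the framing conditions.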

    Improvements in Patient Monitoring in the Intensive Care Unit: Survey Study

    Background: Due to demographic change and, more recently, coronavirus disease (COVID-19), the importance of modern intensive care units (ICU) is becoming apparent. One of the key components of an ICU is the continuous monitoring of patients' vital parameters. However, existing advances in informatics, signal processing, or engineering that could alleviate the burden on ICUs have not yet been applied. This could be due to the lack of user involvement in research and development. Objective: This study focused on the satisfaction of ICU staff with current patient monitoring and their suggestions for future improvements. We aimed to identify aspects of monitoring that interrupt patient care, display devices for remote monitoring, use cases for artificial intelligence (AI), and whether ICU staff members are willing to improve their digital literacy or contribute to the improvement of patient monitoring. We further aimed to identify differences in the responses of different professional groups. Methods: This survey study was performed with ICU staff from 4 ICUs of a German university hospital between November 2019 and January 2020. We developed a web-based 36-item survey questionnaire, by analyzing a preceding qualitative interview study with ICU staff, about the clinical requirements of future patient monitoring. Statistical analyses of questionnaire results included median values with their bootstrapped 95% confidence intervals, and chi-square tests to compare the distributions of item responses of the professional groups. Results: In total, 86 of the 270 ICU physicians and nurses completed the survey questionnaire. The majority stated they felt confident using the patient monitoring equipment, but that high rates of false-positive alarms and the many sensor cables interrupted patient care. 
Regarding future improvements, respondents asked for wireless sensors, a reduction in the number of false-positive alarms, and hospital standard operating procedures for alarm management. Responses to the display devices proposed for remote patient monitoring were divided. Most respondents indicated it would be useful for earlier alerting or when they were responsible for multiple wards. AI for ICUs would be useful for early detection of complications and an increased risk of mortality; in addition, the AI could propose guidelines for therapy and diagnostics. Transparency, interoperability, usability, and staff training were essential to promote the use of AI. The majority wanted to learn more about new technologies for the ICU and required more time for learning. Physicians had fewer reservations than nurses about AI-based intelligent alarm management and using mobile phones for remote monitoring. Conclusions: This survey study of ICU staff revealed key improvements for patient monitoring in intensive care medicine. Hospital providers and medical device manufacturers should focus on reducing false alarms, implementing hospital alarm standard operating procedures, introducing wireless sensors, preparing for the use of AI, and enhancing the digital literacy of ICU staff. Our results may contribute to the user-centered transfer of digital technologies into practice to alleviate challenges in intensive care medicine. Trial registration: ClinicalTrials.gov NCT03514173; https://clinicaltrials.gov/ct2/show/NCT03514173
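
The Methods above report median values with bootstrapped 95% confidence intervals. A minimal sketch of a percentile bootstrap follows, assuming 5-point Likert responses to a single questionnaire item. The response values, the number of resamples, and the function name are hypothetical, and the abstract does not state which bootstrap variant was used.

```python
import random
import statistics


def bootstrap_median_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the median."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    medians = sorted(
        statistics.median(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )
    lower = medians[int((alpha / 2) * n_resamples)]
    upper = medians[int((1 - alpha / 2) * n_resamples) - 1]
    return statistics.median(values), lower, upper


# Hypothetical responses to one item (1 = strongly disagree, 5 = strongly agree).
responses = [4, 5, 3, 4, 4, 2, 5, 4, 3, 4, 5, 4, 2, 3, 4]
median, lower, upper = bootstrap_median_ci(responses)
print(median, lower, upper)
```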

    Predicting lethal courses in critically ill COVID-19 patients using a machine learning model trained on patients with non-COVID-19 viral pneumonia

    In a pandemic with a novel disease, disease-specific prognosis models are available only with a delay. To bridge the critical early phase, models built for similar diseases might be applied. To test the accuracy of such a knowledge transfer, we investigated how precisely lethal courses in critically ill COVID-19 patients can be predicted by a model trained on critically ill non-COVID-19 viral pneumonia patients. We trained gradient boosted decision tree models on 718 (245 deceased) non-COVID-19 viral pneumonia patients to predict individual ICU mortality and applied them to 1054 (369 deceased) COVID-19 patients. Our model showed a significantly better predictive performance (AUROC 0.86 [95% CI 0.86-0.87]) than the clinical scores APACHE2 (0.63 [95% CI 0.61-0.65]), SAPS2 (0.72 [95% CI 0.71-0.74]) and SOFA (0.76 [95% CI 0.75-0.77]), the COVID-19-specific mortality prediction models of Zhou (0.76 [95% CI 0.73-0.78]) and Wang (laboratory: 0.62 [95% CI 0.59-0.65]; clinical: 0.56 [95% CI 0.55-0.58]) and the 4C COVID-19 Mortality score (0.71 [95% CI 0.70-0.72]). We conclude that lethal courses in critically ill COVID-19 patients can be predicted by a machine learning model trained on non-COVID-19 patients. Our results suggest that in a pandemic with a novel disease, prognosis models built for similar diseases can be applied, even when the diseases differ in time courses and in rates of critical and lethal courses.
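
The comparison above rests on AUROC values. As a reading aid, the sketch below computes AUROC as the probability that a randomly chosen deceased patient receives a higher predicted risk than a randomly chosen survivor, with ties counted as one half. The labels and scores are hypothetical, and this pairwise formulation is only one of several equivalent ways to compute the measure. It is not taken from the study's code.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for the positive class (eg, deceased), 0 for the negative class.
    scores: predicted risk, higher meaning more likely positive.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical predictions for four patients.
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.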

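    Comparative Assessment of the Ability to Appraise the Urgency of Medical Complaints Between 15 Symptom Checker Apps and Medical Laypersons (original title: Vergleichende Beurteilung der Fähigkeit die Dringlichkeit medizinischer Beschwerdebilder einzuschätzen zwischen 15 Symptom Checker Apps und medizinischen Laien)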

    Background: Symptom checkers are digital health applications (smartphone applications or website-based applications) to support laypersons in clinical decision making. Besides providing suggestions on probable diagnoses, symptom checkers appraise the urgency of patient-reported medical complaints (triage recommendation). Despite past studies rating the accuracy of symptom checkers as deficient, these apps are becoming increasingly popular among the general public. Until now, no study has evaluated whether symptom checker triage accuracy is superior to that of their intended user group, that is, laypersons. Methods: In an online survey, participants had to assess the treatment urgency of 45 fictitious, short patient descriptions (case vignettes). We recruited 91 US participants via the platform Amazon Mechanical Turk. Data on triage accuracy for 15 symptom checkers on the same case vignettes were provided by a previous study. We compared the triage accuracy between symptom checkers and laypersons, for all 45 vignettes and for each of three urgency levels. We further investigated whether laypersons and symptom checkers are inclined towards over-triage or under-triage. In exploratory analyses we searched for effects of age, gender and level of education on participants’ triage accuracy and inclination towards over-triaging. Results: On average, participants’ triage accuracy (60.9%; 95% CI 59.5%-62.3%) was similar to that of symptom checkers (58%). The majority of participants outperformed ten out of 15 symptom checkers in terms of overall triage accuracy. Both participants and symptom checkers were inclined towards over-triage rather than under-triage. We detected no influence of socio-demographic variables on participants’ triage accuracy. Female participants had a higher ratio of over-triage to under-triage errors (2:1) than male participants (1.2:1). 
Discussion: While on average symptom checkers have no higher triage accuracy than laypersons, five symptom checkers outperformed the majority of participants. Whether symptom checker usage is beneficial depends not only on the symptom checker, but also on the user and the specific use case. Future studies should investigate how symptom checkers can balance out laypersons’ deficits and blind spots while not misleading them when their own intuition proves correct. Future research on when and why laypersons trust symptom checker appraisals will prove valuable.

    Benchmarking Triage Capability of Symptom Checkers Against That of Medical Laypersons: Survey Study

    Background: Symptom checkers (SCs) are tools developed to provide clinical decision support to laypersons. Apart from suggesting probable diagnoses, they commonly advise when users should seek care (triage advice). SCs have become increasingly popular despite prior studies rating their performance as mediocre. To date, it is unclear whether SCs can triage better than those who might choose to use them. Objective: This study aims to compare triage accuracy between SCs and their potential users (ie, laypersons). Methods: On Amazon Mechanical Turk, we recruited 91 adults from the United States who had no professional medical background. In a web-based survey, the participants evaluated 45 fictitious clinical case vignettes. Data for 15 SCs that had processed the same vignettes were obtained from a previous study. As main outcome measures, we assessed the accuracy of the triage assessments made by participants and SCs for each of the three triage levels (ie, emergency care, nonemergency care, self-care) and overall, the proportion of participants outperforming each SC in terms of accuracy, and the risk aversion of participants and SCs by comparing the proportion of cases that were overtriaged. Results: The mean overall triage accuracy was similar for participants (60.9%, SD 6.8%; 95% CI 59.5%-62.3%) and SCs (58%, SD 12.8%). Most participants outperformed all but 5 SCs. On average, SCs more reliably detected emergencies (80.6%, SD 17.9%) than laypersons did (67.5%, SD 16.4%; 95% CI 64.1%-70.8%). Although both SCs and participants struggled with cases requiring self-care (the least urgent triage category), SCs more often wrongly classified these cases as emergencies (43/174, 24.7%) compared with laypersons (56/1365, 4.10%). Conclusions: Most SCs had no greater triage capability than an average layperson, although the triage accuracy of the five best SCs was superior to the accuracy of most participants. 
SCs might improve early detection of emergencies but might also needlessly increase resource utilization in health care. Laypersons sometimes require support in deciding when to rely on self-care, but it is in that very situation where SCs perform the worst. Further research is needed to determine how to best combine the strengths of humans and SCs.

    Triage Accuracy of Symptom Checker Apps: 5-Year Follow-up Evaluation

    Background: Symptom checkers are digital tools assisting laypersons in self-assessing the urgency and potential causes of their medical complaints. They are widely used but face concerns from both patients and health care professionals, especially regarding their accuracy. A 2015 landmark study substantiated these concerns using case vignettes to demonstrate that symptom checkers commonly err in their triage assessment. Objective: This study aims to revisit the landmark index study to investigate whether and how symptom checkers' capabilities have evolved since 2015 and how they currently compare with laypersons' stand-alone triage appraisal. Methods: In early 2020, we searched for smartphone and web-based applications providing triage advice. We evaluated these apps on the same 45 case vignettes as the index study. Using descriptive statistics, we compared our findings with those of the index study and with publicly available data on laypersons' triage capability. Results: We retrieved 22 symptom checkers providing triage advice. The median triage accuracy in 2020 (55.8%, IQR 15.1%) was close to that in 2015 (59.1%, IQR 15.5%). The apps in 2020 were less risk averse (odds 1.11:1, the ratio of overtriage errors to undertriage errors) than those in 2015 (odds 2.82:1), missing >40% of emergencies. Few apps outperformed laypersons in either deciding whether emergency care was required or whether self-care was sufficient. No apps outperformed the laypersons on both decisions. Conclusions: Triage performance of symptom checkers has, on average, not improved over the course of 5 years. It decreased in 2 use cases (advice on when emergency care is required and when no health care is needed for the moment). However, triage capability varies widely within the sample of symptom checkers. Whether it is beneficial to seek advice from symptom checkers depends on the app chosen and on the specific question to be answered. 
Future research should develop resources (eg, case vignette repositories) to audit the capabilities of symptom checkers continuously and independently and provide guidance on when and to whom they should be recommended.
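
The abstract above defines risk aversion as the ratio of overtriage errors to undertriage errors. A minimal sketch of that bookkeeping follows, assuming three ordered urgency levels (self-care, nonemergency care, emergency care). The vignette data, the level encoding, and the function name are hypothetical and only illustrate how accuracy and the over-to-under ratio could be tallied.

```python
# Ordered urgency levels; a higher number means more urgent.
URGENCY = {"self-care": 0, "nonemergency care": 1, "emergency care": 2}


def triage_summary(advised, correct):
    """Tally accuracy, overtriage and undertriage for paired triage levels."""
    over = under = right = 0
    for given, gold in zip(advised, correct):
        diff = URGENCY[given] - URGENCY[gold]
        if diff > 0:
            over += 1    # advice more urgent than needed
        elif diff < 0:
            under += 1   # advice less urgent than needed
        else:
            right += 1
    return {
        "accuracy": right / len(correct),
        "overtriage": over,
        "undertriage": under,
        # Ratio of overtriage to undertriage errors (infinite if no undertriage).
        "over_to_under": over / under if under else float("inf"),
    }


# Hypothetical advice for six vignettes versus the gold-standard level.
advised = ["emergency care", "nonemergency care", "self-care",
           "emergency care", "self-care", "nonemergency care"]
correct = ["emergency care", "self-care", "self-care",
           "nonemergency care", "nonemergency care", "nonemergency care"]
summary = triage_summary(advised, correct)
print(summary)
```

A ratio above 1 indicates a risk-averse rater or app (more overtriage than undertriage), and a ratio near 1 indicates that both error types occur about equally often.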

    Staff perspectives on the influence of patient characteristics on alarm management in the intensive care unit: a cross-sectional survey study

    Background: High rates of clinical alarms in the intensive care unit can result in alarm fatigue among staff. Individualization of alarm thresholds is regarded as one measure to reduce non-actionable alarms. The aim of this study was to investigate staff’s perceptions of alarm threshold individualization according to patient characteristics and disease status. Methods: This is a cross-sectional survey study (February-July 2020). Intensive care nurses and physicians were sampled by convenience. Data was collected using an online questionnaire. Results: Staff view the individualization of alarm thresholds in the monitoring of vital signs as important. The extent to which alarm thresholds are adapted from the normal range varies depending on the vital sign monitored, the reason for clinical deterioration, and the professional group asked. Vital signs used for hemodynamic monitoring (heart rate and blood pressure) were most subject to alarm individualizations. Staff are ambivalent regarding the integration of novel technological features into alarm management. Conclusions: All relevant stakeholders, including clinicians, hospital management, and industry, must collaborate to establish a “standard for individualization,” moving away from ad hoc alarm management to an intelligent, data-driven alarm management. Making alarms meaningful and trustworthy again has the potential to mitigate alarm fatigue – a major cause of stress in clinical staff and considerable hazard to patient safety. Trial registration: The study was registered at ClinicalTrials.gov (NCT03514173) on 02/05/2018.