150 research outputs found

    LLMs Understand Glass-Box Models, Discover Surprises, and Suggest Repairs

    We show that large language models (LLMs) are remarkably good at working with interpretable models that decompose complex outcomes into univariate graph-represented components. By adopting a hierarchical approach to reasoning, LLMs can provide comprehensive model-level summaries without ever requiring the entire model to fit in context. This approach enables LLMs to apply their extensive background knowledge to automate common tasks in data science such as detecting anomalies that contradict prior knowledge, describing potential reasons for the anomalies, and suggesting repairs that would remove the anomalies. We use multiple examples in healthcare to demonstrate the utility of these new capabilities of LLMs, with particular emphasis on Generalized Additive Models (GAMs). Finally, we present the package TalkToEBM as an open-source LLM-GAM interface.

    The Surviving Sepsis Campaign: research priorities for the administration, epidemiology, scoring and identification of sepsis

    Keywords: Epidemiology; Organ dysfunction; Sepsis. Objective: To identify priorities for administrative, epidemiologic and diagnostic research in sepsis. Design: As a follow-up to a previous consensus statement about sepsis research, members of the Surviving Sepsis Campaign Research Committee, representing the European Society of Intensive Care Medicine and the Society of Critical Care Medicine, addressed six questions regarding care delivery, epidemiology, organ dysfunction, screening, identification of septic shock, and information that can predict outcomes in sepsis. Methods: Six questions from the Scoring/Identification and Administration sections of the original Research Priorities publication were explored in greater detail to better examine the knowledge gaps and rationales for questions that were previously identified through a consensus process. Results: The document provides a framework for priorities in research to address the following questions: (1) What is the optimal model of delivering sepsis care?; (2) What is the epidemiology of sepsis susceptibility and response to treatment?; (3) What information identifies organ dysfunction?; (4) How can we screen for sepsis in various settings?; (5) How do we identify septic shock?; and (6) What in-hospital clinical information is associated with important outcomes in patients with sepsis? Conclusions: There is substantial knowledge of sepsis epidemiology and ways to identify and treat sepsis patients, but many gaps remain. Areas of uncertainty identified in this manuscript can help prioritize initiatives to improve an understanding of individual patient and demographic heterogeneity with sepsis and septic shock, biomarkers and accurate patient identification, organ dysfunction, and ways to improve sepsis care. The authors volunteered their time to produce this manuscript, and no funding was used to produce it.

    A Smartphone-based Decision Support Tool Improves Test Performance Concerning Application of the Guidelines for Managing Regional Anesthesia in the Patient Receiving Antithrombotic or Thrombolytic Therapy

    BACKGROUND: The American Society of Regional Anesthesia and Pain Medicine (ASRA) consensus statement on regional anesthesia in the patient receiving antithrombotic or thrombolytic therapy is the standard for evaluation and management of these patients. The authors hypothesized that an electronic decision support tool (eDST) would improve test performance compared with native physician behavior concerning the application of this guideline. METHODS: Anesthesiology trainees and faculty at 8 institutions participated in a prospective, randomized trial in which they completed a 20-question test involving clinical scenarios related to the ASRA guidelines. The eDST group completed the test using an iOS app programmed to contain decision logic and content of the ASRA guidelines. The control group completed the test by using any resource in addition to the app. A generalized linear mixed-effects model was used to examine the effect of the intervention. RESULTS: After obtaining institutional review board's approval and informed consent, 259 participants were enrolled and randomized (eDST = 122; control = 137). The mean score was 92.4 ± 6.6% in the eDST group and 68.0 ± 15.8% in the control group (P < 0.001). eDST use increased the odds of selecting correct answers (7.8; 95% CI, 5.7 to 10.7). Most control group participants (63%) used some cognitive aid during the test, and they scored higher than those who tested from memory alone (76 ± 15% vs. 57 ± 18%, P < 0.001). There was no difference in time to completion of the test (P = 0.15) and no effect of training level (P = 0.56). CONCLUSIONS: eDST use improved application of the ASRA guidelines compared with the native clinician behavior in a testing environment

    Testing for allergic disease: Parameters considered and test value

    Background: Test results for allergic disease are especially valuable to allergists and family physicians for clinical evaluation, decisions to treat, and to determine needs for referral. Methods: This study used a repeated measures design (conjoint analysis) to examine trade-offs among clinical parameters that influence the decision of family physicians to use specific IgE blood testing as a diagnostic aid for patients suspected of having allergic rhinitis. Data were extracted from a random sample of 50 family physicians in the Southeastern United States. Physicians evaluated 11 patient profiles containing four clinical parameters: symptom severity (low, medium, high), symptom length (5, 10, 20 years), family history (both parents, mother, neither), and medication use (prescribed antihistamines, nasal spray, over-the-counter medications). Decision to recommend specific IgE testing was elicited as a "yes" or "no" response. Perceived value of specific IgE blood testing was evaluated according to usefulness as a diagnostic tool compared to skin testing, and not testing. Results: The highest odds ratios (OR) associated with decisions to test for allergic rhinitis were obtained for symptom severity (OR, 12.11; 95%CI, 7.1–20.7) and length of symptoms (OR, 1.46; 95%CI, 0.96–2.2) with family history having significant influence in the decision. A moderately positive association between testing issues and testing value was revealed (β = 0.624, t = 5.296, p ≤ 0.001) with 39% of the variance explained by the regression model. Conclusion: The most important parameters considered when testing for allergic rhinitis relate to symptom severity, length of symptoms, and family history. Family physicians recognize that specific IgE blood testing is valuable to their practice.

    Quality of chronic disease care in general practice: the development and validation of a provider interview tool

    BACKGROUND: This article describes the development and psychometric evaluation of an interview instrument to assess provider-reported quality of general practice care for patients with diabetes, cardiovascular disease and asthma – the Australian General Practice Clinical Care Interview (GPCCI). METHODS: We administered the GPCCI to 28 general practitioners (family physicians) in 10 general practices. We conducted an item analysis and assessed the internal consistency of the instrument. We next assessed the quality of care recorded in the medical records of 462 of the general practitioners' patients with Type 2 diabetes, ischaemic heart disease/hypertension and/or moderate to severe asthma. This was then compared with results of the GPCCI for each general practice. RESULTS: Good internal consistency was found for the overall GPCCI (Cronbach's alpha = 0.75). As far as the separate sub-scales were concerned, diabetes had good internal consistency (0.76) but the internal consistency of the heart disease and asthma subscales was not strong (0.49 and 0.16 respectively). There was high inter-rater reliability of the adjusted scores of data extracted from patients' medical notes for each of the three conditions. Correlations of the overall GPCCI and patients' medical notes audit, combined across the three conditions and aggregated to practice level, showed that a strong relationship (r = 0.84, p = 0.003) existed between the two indices of clinical care. CONCLUSION: This study suggests that the GPCCI has good internal consistency and concurrent validity with patients' medical records in Australian general practice and warrants further evaluation of its properties, validity and utility

    Assessing an organizational culture instrument based on the Competing Values Framework: Exploratory and confirmatory factor analyses

    BACKGROUND: The Competing Values Framework (CVF) has been widely used in health services research to assess organizational culture as a predictor of quality improvement implementation, employee and patient satisfaction, and team functioning, among other outcomes. CVF instruments generally are presented as well-validated with reliable aggregated subscales. However, only one study in the health sector has been conducted for the express purpose of validation, and that study population was limited to hospital managers from a single geographic locale. METHODS: We used exploratory and confirmatory factor analyses to examine the underlying structure of data from a CVF instrument. We analyzed cross-sectional data from a work environment survey conducted in the Veterans Health Administration (VHA). The study population comprised all staff in non-supervisory positions. The survey included 14 items adapted from a popular CVF instrument, which measures organizational culture according to four subscales: hierarchical, entrepreneurial, team, and rational. RESULTS: Data from 71,776 non-supervisory employees (approximate response rate 51%) from 168 VHA facilities were used in this analysis. Internal consistency of the subscales was moderate to strong (α = 0.68 to 0.85). However, the entrepreneurial, team, and rational subscales had higher correlations across subscales than within, indicating poor divergent properties. Exploratory factor analysis revealed two factors, comprising the ten items from the entrepreneurial, team, and rational subscales loading on the first factor, and two items from the hierarchical subscale loading on the second factor, along with one item from the rational subscale that cross-loaded on both factors. Results from confirmatory factor analysis suggested that the two-subscale solution provides a more parsimonious fit to the data as compared to the original four-subscale model. 
CONCLUSION: This study suggests that there may be problems applying conventional CVF subscales to non-supervisors, and underscores the importance of assessing psychometric properties of instruments in each new context and population to which they are applied. It also further highlights the challenges management scholars face in assessing organizational culture in a reliable and comparable way. More research is needed to determine if the emergent two-subscale solution is a valid or meaningful alternative and whether these findings generalize beyond VHA

    Validation of the conceptual research utilization scale: an application of the standards for educational and psychological testing in healthcare

    Background: There is a lack of acceptable, reliable, and valid survey instruments to measure conceptual research utilization (CRU). In this study, we investigated the psychometric properties of a newly developed scale (the CRU Scale). Methods: We used the Standards for Educational and Psychological Testing as a validation framework to assess four sources of validity evidence: content, response processes, internal structure, and relations to other variables. A panel of nine international research utilization experts performed a formal content validity assessment. To determine response process validity, we conducted a series of one-on-one scale administration sessions with 10 healthcare aides. Internal structure and relations to other variables validity was examined using CRU Scale response data from a sample of 707 healthcare aides working in 30 urban Canadian nursing homes. Principal components analysis and confirmatory factor analyses were conducted to determine internal structure. Relations to other variables were examined using: (1) bivariate correlations; (2) change in mean values of CRU with increasing levels of other kinds of research utilization; and (3) multivariate linear regression. Results: Content validity index scores for the five items ranged from 0.55 to 1.00. The principal components analysis predicted a 5-item 1-factor model. This was inconsistent with the findings from the confirmatory factor analysis, which showed best fit for a 4-item 1-factor model. Bivariate associations between CRU and other kinds of research utilization were statistically significant (p < 0.01) for the latent CRU scale score and all five CRU items. The CRU scale score was also shown to be a significant predictor of overall research utilization in multivariate linear regression. Conclusions: The CRU scale showed acceptable initial psychometric properties with respect to responses from healthcare aides in nursing homes. Based on our validity, reliability, and acceptability analyses, we recommend using a reduced (four-item) version of the CRU scale to yield sound assessments of CRU by healthcare aides. Refinement to the wording of one item is also needed. Planned future research will include: latent scale scoring, identification of variables that predict and are outcomes to conceptual research use, and longitudinal work to determine CRU Scale sensitivity to change.

    Legitimacy, Visibility, and the Antecedents of Corporate Social Performance: An Investigation of the Instrumental Perspective

    Using institutional theory as the foundation, this study examines the role of organizational visibility from a variety of sources (i.e., slack visibility, industry visibility, and visibility to multiple stakeholders) in influencing corporate social performance (CSP). The conceptual framework offers important insights regarding the instrumental motives of managers in performing CSP initiatives. Based on a sample of 124 S&P 500 firms, the authors found that it is a firm’s visibility to stakeholders, rather than its economic performance, that has the larger impact on managers’ decisions regarding how much CSP their firms exhibit. The results show that more profitable firms may not be motivated to engage actively in CSP unless they are under greater scrutiny by various firm stakeholders. The authors also found that organizational slack (estimated as cost of capital) is positively associated with a Social CSP dimension but negatively associated with a Strategic CSP dimension. This research contributes to the current CSP literature by demonstrating that motivations in addition to normative or ethical ones may be at play in the decisions firms make regarding their CSP.