
    Design and Analysis of Randomized Controlled Trials in Traumatic Brain Injury

    Randomized controlled trials in traumatic brain injury (TBI) are challenging because of the inherent heterogeneity of the patient population, the lack of early mechanistic end points, and the relative insensitivity of outcome measures. Approaches to deal with the heterogeneity of the patient population are presented in this thesis. The use of strict enrollment criteria is not recommended, as it is inefficient. Rather, broad enrollment criteria may be preferred, combined with covariate adjustment in the analysis phase. Dichotomization of the Glasgow Outcome Scale, the primary outcome measure in most trials, is not recommended. Ordinal approaches to analysis of tr…
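    A minimal sketch of this contrast is given below, using simulated data: a dichotomized logistic regression of favorable versus unfavorable Glasgow Outcome Scale (GOS) alongside a proportional-odds (ordinal) model over the full scale, both with covariate adjustment. The variable names, simulated effect sizes, and the use of statsmodels' OrderedModel are illustrative assumptions, not the analyses from the thesis.

```python
# Sketch: dichotomized vs ordinal (proportional-odds) analysis of a simulated
# 5-level Glasgow Outcome Scale, with covariate adjustment for baseline factors.
# All data and column names are illustrative only.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.miscmodels.ordinal_model import OrderedModel

rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "treatment": rng.integers(0, 2, n),      # randomized arm (0/1)
    "age": rng.normal(45, 18, n),            # baseline covariate
    "motor_score": rng.integers(1, 7, n),    # baseline GCS motor score
})

# Latent severity mapped to a 5-level ordered outcome (higher = better)
latent = (0.3 * df["treatment"] - 0.02 * df["age"] + 0.25 * df["motor_score"]
          + rng.logistic(size=n))
df["gos"] = pd.cut(latent, bins=5, labels=[1, 2, 3, 4, 5])  # ordered categorical

# 1) Dichotomized analysis: favorable (GOS 4-5) vs unfavorable (GOS 1-3)
df["favorable"] = (df["gos"] >= 4).astype(int)
X = sm.add_constant(df[["treatment", "age", "motor_score"]])
logit_fit = sm.Logit(df["favorable"], X).fit(disp=False)
print("dichotomized log-odds ratio:", logit_fit.params["treatment"])

# 2) Ordinal (proportional-odds) analysis: uses the full scale, no constant in exog
ord_fit = OrderedModel(df["gos"], df[["treatment", "age", "motor_score"]],
                       distr="logit").fit(method="bfgs", disp=False)
print("ordinal common log-odds ratio:", ord_fit.params["treatment"])
```

    The ordinal model uses the information in all five outcome levels rather than collapsing them, which is the efficiency argument sketched in the abstract.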

    Generative Language Models Exhibit Social Identity Biases

    The surge in popularity of large language models has given rise to concerns about biases that these models could learn from humans. In this study, we investigate whether ingroup solidarity and outgroup hostility, fundamental social biases known from social science, are present in 51 large language models. We find that almost all foundational language models and some instruction fine-tuned models exhibit clear ingroup-positive and outgroup-negative biases when prompted to complete sentences (e.g., "We are..."). A comparison of LLM-generated sentences with human-written sentences on the internet reveals that these models exhibit similar, if not greater, levels of bias than human text. To investigate where these biases stem from, we experimentally varied the amount of ingroup-positive or outgroup-negative sentences the model was exposed to during fine-tuning in the context of the United States Democrat-Republican divide. Doing so resulted in the models exhibiting a marked increase in ingroup solidarity and an even greater increase in outgroup hostility. Furthermore, removing either ingroup-positive or outgroup-negative sentences (or both) from the fine-tuning data leads to a significant reduction in both ingroup solidarity and outgroup hostility, suggesting that biases can be reduced by removing biased training data. Our findings suggest that modern language models exhibit fundamental social identity biases and that such biases can be mitigated by curating training data. Our results have practical implications for creating less biased large language models and further underscore the need for more research into user interactions with LLMs to prevent potential bias reinforcement in humans.
    Comment: supplementary material, data, and code: see https://osf.io/9ht32/?view_only=f0ab4b23325f4c31ad3e12a7353b55f
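    A rough sketch of the sentence-completion probe described above is shown below. It generates completions of ingroup ("We are") and outgroup ("They are") prompts and scores them with an off-the-shelf sentiment classifier; the model choice (gpt2), the sentiment pipeline, and the prompt set are placeholder assumptions, not the authors' actual measurement setup.

```python
# Sketch: probing ingroup-positive / outgroup-negative framing via sentence
# completions. Model and classifier are placeholders for illustration only.
from collections import defaultdict
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
sentiment = pipeline("sentiment-analysis")   # default English sentiment model

prompts = {"ingroup": "We are", "outgroup": "They are"}
scores = defaultdict(list)

for group, prompt in prompts.items():
    completions = generator(prompt, max_new_tokens=20, num_return_sequences=25,
                            do_sample=True, pad_token_id=50256)
    for out in completions:
        label = sentiment(out["generated_text"])[0]
        # Signed sentiment: positive completions count up, negative ones down
        signed = label["score"] if label["label"] == "POSITIVE" else -label["score"]
        scores[group].append(signed)

for group, vals in scores.items():
    print(group, "mean signed sentiment:", sum(vals) / len(vals))
```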

    Pitfalls of single-study external validation illustrated with a model predicting functional outcome after aneurysmal subarachnoid hemorrhage

    Background: Prediction models are often externally validated with data from a single study or cohort. However, the interpretation of performance estimates obtained with single-study external validation is not as straightforward as assumed. We aimed to illustrate this by conducting a large number of external validations of a prediction model for functional outcome in subarachnoid hemorrhage (SAH) patients.
    Methods: We used data from the Subarachnoid Hemorrhage International Trialists (SAHIT) data repository (n = 11,931, 14 studies) to refit the SAHIT model for predicting a dichotomous functional outcome (favorable versus unfavorable), measured with the (extended) Glasgow Outcome Scale or modified Rankin Scale score, at a minimum of three months after discharge. We performed leave-one-cluster-out cross-validation to mimic the process of multiple single-study external validations; each study represented one cluster. In each of these validations, we assessed discrimination with Harrell’s c-statistic and calibration with calibration plots, intercepts, and slopes. We used random effects meta-analysis to obtain the (reference) mean performance estimates and between-study heterogeneity (I²-statistic). The influence of case-mix variation on discriminative performance was assessed with the model-based c-statistic, and we fitted a “membership model” to obtain a gross estimate of transportability.
    Results: Across 14 single-study external validations, model performance was highly variable. The mean c-statistic was 0.74 (95% CI 0.70–0.78, range 0.52–0.84, I² = 0.92), the mean intercept was -0.06 (95% CI -0.37 to 0.24, range -1.40 to 0.75, I² = 0.97), and the mean slope was 0.96 (95% CI 0.78–1.13, range 0.53–1.31, I² = 0.90). The decrease in discriminative performance was attributable to case-mix variation, between-study heterogeneity, or a combination of both. Incidentally, we observed poor generalizability or transportability of the model.
    Conclusions: We demonstrate two potential pitfalls in the interpretation of model performance with single-study external validation: (1) model performance is highly variable and depends on the choice of validation data, and (2) no insight is provided into the generalizability or transportability of the model that is needed to guide local implementation. As such, a single-study external validation can easily be misinterpreted and lead to a false appreciation of the clinical prediction model. Cross-validation is better equipped to address these pitfalls.
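    A minimal sketch of the leave-one-cluster-out procedure is given below, assuming a pooled data frame with a study identifier column; the data frame, column names, and logistic refit are illustrative stand-ins, not the SAHIT model or data. For each held-out study it computes the c-statistic (the AUC for a binary outcome), the calibration slope, and the calibration intercept.

```python
# Sketch: leave-one-cluster-out cross-validation of a binary-outcome prediction
# model. `df`, its columns, and the predictor set are illustrative only.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn.metrics import roc_auc_score

def loco_cv(df, predictors, outcome="unfavorable", cluster="study"):
    """Validate the model once per held-out study; each study is one cluster."""
    results = []
    for held_out in df[cluster].unique():
        train = df[df[cluster] != held_out]
        test = df[df[cluster] == held_out]   # must contain both outcome classes

        # Refit the prediction model on all remaining studies
        fit = sm.Logit(train[outcome],
                       sm.add_constant(train[predictors])).fit(disp=False)

        # Linear predictor (log-odds) and predicted risk in the held-out study
        lp = sm.add_constant(test[predictors]) @ fit.params
        risk = 1.0 / (1.0 + np.exp(-lp))

        # Discrimination: c-statistic (equals the AUC for a binary outcome)
        c_stat = roc_auc_score(test[outcome], risk)

        # Calibration slope: coefficient of lp when regressing the outcome on it
        slope_fit = sm.Logit(test[outcome], sm.add_constant(lp)).fit(disp=False)
        slope = float(np.asarray(slope_fit.params)[1])

        # Calibration intercept: intercept-only model with lp as a fixed offset
        int_fit = sm.Logit(test[outcome], np.ones((len(test), 1)),
                           offset=np.asarray(lp)).fit(disp=False)
        intercept = float(np.asarray(int_fit.params)[0])

        results.append({"study": held_out, "c_statistic": c_stat,
                        "intercept": intercept, "slope": slope})
    return pd.DataFrame(results)
```

    Pooling the per-study estimates with a random effects meta-analysis would then give the reference mean performance and the between-study heterogeneity (I²) reported above.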

    Misinformation interventions decay rapidly without an immediate posttest

    In recent years, many kinds of interventions have been developed that seek to reduce susceptibility to misinformation. In two preregistered longitudinal studies (N1 = 503, N2 = 673), we leverage two previously validated “inoculation” interventions (a video and a game) to address two important questions in misinformation interventions research: (1) whether displaying additional stimuli (such as videos unrelated to misinformation) alongside an intervention interferes with its effectiveness, and (2) whether administering an immediate posttest (in the form of a social media post evaluation task after the intervention) plays a role in the longevity of the intervention. We find no evidence that other stimuli interfere with intervention efficacy, but strong evidence that an immediate posttest strengthens what is learned from the intervention. In study 1, we find that 48 h after watching a video, participants who received an immediate posttest continued to be significantly better at discerning untrustworthy social media posts from neutral ones than the control group (d = 0.416, p = .007), whereas participants who only received a posttest 48 h later showed no difference from the control group (d = 0.010, p = .854). In study 2, we observe highly similar results for a gamified intervention and provide evidence for a causal mechanism: immediate posttests help strengthen people's memory of the lessons learned in the intervention. We argue that the active rehearsal and application of relevant information are therefore requirements for the longevity of learning-based misinformation interventions, which has substantial implications for their scalability.
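    For readers unfamiliar with the reported statistics, the sketch below shows one plausible way a per-participant discernment score and a Cohen's d between groups could be computed; the simulated ratings, group sizes, and scoring rule are assumptions for illustration, not the studies' actual materials or analysis code.

```python
# Sketch: discernment score per participant and Cohen's d between an
# intervention group and a control group. Data are simulated placeholders.
import numpy as np
from scipy import stats

def discernment(neutral_ratings, untrustworthy_ratings):
    """Mean trust in neutral posts minus mean trust in untrustworthy posts
    (higher = better discernment), one score per participant."""
    return np.mean(neutral_ratings, axis=1) - np.mean(untrustworthy_ratings, axis=1)

def cohens_d(a, b):
    """Cohen's d using a pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_sd = np.sqrt(((na - 1) * np.var(a, ddof=1) + (nb - 1) * np.var(b, ddof=1))
                        / (na + nb - 2))
    return (np.mean(a) - np.mean(b)) / pooled_sd

# Simulated example: 250 participants per group, 10 posts of each type,
# trust ratings on an arbitrary continuous scale
rng = np.random.default_rng(1)
treat = discernment(rng.normal(5.0, 1.0, (250, 10)), rng.normal(3.6, 1.0, (250, 10)))
ctrl = discernment(rng.normal(5.0, 1.0, (250, 10)), rng.normal(4.0, 1.0, (250, 10)))

print("d =", round(cohens_d(treat, ctrl), 3))
print("p =", round(stats.ttest_ind(treat, ctrl).pvalue, 4))
```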
