
    A Critical Evaluation of Tracking Public Opinion with Social Media: A Case Study in Presidential Approval

    There has been much interest in using social media to track public opinion. We introduce a higher level of scrutiny to these types of analyses, specifically looking at the relationship between presidential approval and "Trump" tweets and developing a framework to interpret its strength. First, we use placebo analyses, performing the same analysis with tweets assumed to be unrelated to presidential approval, to assess the relationship; we conclude that it is weaker than it might otherwise seem. Second, we suggest following users longitudinally, which enables us to find evidence of a political signal around the 2016 presidential election. For the goal of supplementing traditional surveys with social media data, our results are encouraging but cautionary.
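
    The placebo idea lends itself to a short illustration. The following is a minimal sketch, not the authors' code: it assumes aligned daily series, uses plain correlation as a stand-in for whatever measure of association one prefers, and the names `placebo_test`, `signal_tweets`, and `placebo_tweet_sets` are hypothetical.

    ```python
    import numpy as np

    def placebo_test(approval, signal_tweets, placebo_tweet_sets):
        """Compare an observed approval/tweet correlation to a placebo distribution.

        approval           : 1-D array of daily presidential approval values
        signal_tweets      : 1-D array of daily measures of the tweets of interest
        placebo_tweet_sets : iterable of 1-D arrays for tweet sets assumed
                             unrelated to presidential approval
        """
        observed = np.corrcoef(approval, signal_tweets)[0, 1]
        placebo = np.array(
            [np.corrcoef(approval, p)[0, 1] for p in placebo_tweet_sets]
        )
        # How often a supposedly unrelated tweet series correlates with
        # approval at least as strongly as the series of interest does.
        exceed_rate = np.mean(np.abs(placebo) >= np.abs(observed))
        return observed, exceed_rate
    ```

    A high exceed rate would suggest the headline correlation is no stronger than what unrelated tweet streams produce, which is the paper's cautionary point.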

    Derivation and external validation of a simple risk score to predict in-hospital mortality in patients hospitalized for COVID-19: A multicenter retrospective cohort study

    As severe acute respiratory syndrome coronavirus 2 continues to spread, easy-to-use risk models that predict hospital mortality can assist in clinical decision making and triage. We aimed to develop a risk score model for in-hospital mortality in patients hospitalized with 2019 novel coronavirus (COVID-19) that was robust across hospitals and used clinical factors that are readily available and measured consistently across hospitals. In this retrospective observational study, we developed a risk score model using data collected by trained abstractors for patients in 20 diverse hospitals across the state of Michigan (Mi-COVID19) who were discharged between March 5, 2020 and August 14, 2020. Patients who tested positive for severe acute respiratory syndrome coronavirus 2 during hospitalization or were discharged with an ICD-10 code for COVID-19 (U07.1) were included. We employed an iterative forward selection approach to consider the inclusion of 145 potential risk factors available at hospital presentation. Model performance was externally validated with patients from 19 hospitals in the Mi-COVID19 registry not used in model development. We shared the model in an easy-to-use online application that allows the user to predict in-hospital mortality risk for a patient with any subset of the variables in the final model. Two thousand one hundred ninety-three patients in the Mi-COVID19 registry met our inclusion criteria. The derivation and validation sets ultimately included 1690 and 398 patients, respectively, with mortality rates of 19.6% and 18.6%. After exclusions, the average age of participants was 64 years, and participants were 48% female, 49% Black, and 87% non-Hispanic. Our final model includes the patient's age, first recorded respiratory rate, first recorded pulse oximetry, highest creatinine level on day of presentation, and the hospital's COVID-19 mortality rate. No other factor showed sufficient incremental model improvement to warrant inclusion. The areas under the receiver operating characteristic curve for the derivation and validation sets were 0.796 (95% confidence interval, 0.767-0.826) and 0.829 (95% confidence interval, 0.782-0.876), respectively. We conclude that the risk of in-hospital mortality in COVID-19 patients can be reliably estimated using a few factors that are measured in a standard way and available to physicians very early in a hospital encounter.
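
    To make the selection procedure concrete, here is a minimal sketch of an iterative forward-selection loop, assuming a logistic-regression risk model and in-sample AUC as the criterion. The `min_gain` threshold, the use of scikit-learn, and the in-sample (rather than cross-validated) scoring are all assumptions for illustration, not details taken from the study.

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    def forward_select(X, y, candidates, min_gain=0.005):
        """Greedy forward selection of columns of X by in-sample AUC.

        candidates : list of column indices to consider
        min_gain   : hypothetical threshold standing in for "sufficient
                     incremental model improvement"
        """
        candidates = list(candidates)  # avoid mutating the caller's list
        selected, best_auc = [], 0.5
        while candidates:
            aucs = {}
            for c in candidates:
                cols = selected + [c]
                fit = LogisticRegression(max_iter=1000).fit(X[:, cols], y)
                aucs[c] = roc_auc_score(y, fit.predict_proba(X[:, cols])[:, 1])
            best = max(aucs, key=aucs.get)
            if aucs[best] - best_auc < min_gain:
                break  # no remaining candidate adds enough; stop
            selected.append(best)
            candidates.remove(best)
            best_auc = aucs[best]
        return selected, best_auc
    ```

    The loop stops as soon as no candidate clears the improvement threshold, which mirrors how the final model above ended with only five factors out of 145 considered.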

    Precise unbiased estimation in randomized experiments using auxiliary observational data

    Randomized controlled trials (RCTs) admit unconfounded design-based inference (randomization largely justifies the assumptions underlying statistical effect estimates) but often have limited sample sizes. However, researchers may have access to big observational data on covariates and outcomes from RCT nonparticipants. For example, data from A/B tests conducted within an educational technology platform exist alongside historical observational data drawn from student logs. We outline a design-based approach to using such observational data for variance reduction in RCTs. First, we use the observational data to train a machine learning algorithm to predict potential outcomes from covariates, and then use that algorithm to generate predictions for RCT participants. Then, we use those predictions, perhaps alongside other covariates, to adjust causal effect estimates with a flexible, design-based covariate-adjustment routine. In this way, there is no danger of biases from the observational data leaking into the experimental estimates, which are guaranteed to be exactly unbiased regardless of whether the machine learning models are “correct” in any sense or whether the observational samples closely resemble the RCT samples. We demonstrate the method by analyzing 33 randomized A/B tests and show that it decreases standard errors relative to other estimators, sometimes substantially.
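
    The two-step logic can be sketched in a few lines. Assume a completely randomized experiment; the difference-in-mean-residuals estimator below is one simple instance of a design-based adjustment, not necessarily the flexible routine the abstract refers to, and `GradientBoostingRegressor` is an arbitrary stand-in for the machine learning algorithm.

    ```python
    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor

    def aux_adjusted_ate(X_obs, y_obs, X_rct, y_rct, z):
        """Estimate a treatment effect using auxiliary observational data.

        X_obs, y_obs : covariates and outcomes for RCT nonparticipants
        X_rct, y_rct : covariates and outcomes for RCT participants
        z            : 0/1 treatment-assignment array for RCT participants
        """
        # Step 1: train the predictor on observational data only, so nothing
        # about the experimental outcomes can leak into the predictions.
        aux_model = GradientBoostingRegressor().fit(X_obs, y_obs)

        # Step 2: predictions for RCT participants are then a fixed function
        # of their covariates alone.
        y_hat = aux_model.predict(X_rct)

        # Step 3: difference in mean residuals. Under complete randomization
        # this stays unbiased whether or not aux_model is any good; a poor
        # model costs precision, not validity.
        resid = y_rct - y_hat
        return resid[z == 1].mean() - resid[z == 0].mean()
    ```

    The key design choice is that `y_hat` depends only on covariates and on data from nonparticipants, which is why a badly wrong auxiliary model can inflate the variance but cannot bias the estimate.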