
    Generate Analysis-Ready Data for Real-world Evidence: Tutorial for Harnessing Electronic Health Records With Advanced Informatic Technologies

    Although randomized controlled trials (RCTs) are the gold standard for establishing the efficacy and safety of a medical treatment, real-world evidence (RWE) generated from real-world data has been vital in postapproval monitoring and is being promoted for the regulatory process of experimental therapies. An emerging source of real-world data is electronic health records (EHRs), which contain detailed information on patient care in both structured (eg, diagnosis codes) and unstructured (eg, clinical notes and images) forms. Despite the granularity of the data available in EHRs, the critical variables required to reliably assess the relationship between a treatment and clinical outcome are challenging to extract. To address this fundamental challenge and accelerate the reliable use of EHRs for RWE, we introduce an integrated data curation and modeling pipeline consisting of 4 modules that leverage recent advances in natural language processing, computational phenotyping, and causal modeling techniques with noisy data. Module 1 consists of techniques for data harmonization. We use natural language processing to recognize clinical variables from RCT design documents and map the extracted variables to EHR features with description matching and knowledge networks. Module 2 then develops techniques for cohort construction using advanced phenotyping algorithms to both identify patients with diseases of interest and define the treatment arms. Module 3 introduces methods for variable curation, including a list of existing tools to extract baseline variables from different sources (eg, codified, free text, and medical imaging) and end points of various types (eg, death, binary, temporal, and numerical). Finally, module 4 presents validation and robust modeling methods, and we propose a strategy to create gold-standard labels for EHR variables of interest to validate data curation quality and perform subsequent causal modeling for RWE. 
In addition to the workflow proposed in our pipeline, we develop a reporting guideline for RWE that covers the information needed for transparent reporting and reproducibility of results. Moreover, our pipeline is highly data driven, enhancing study data with a rich variety of publicly available information and knowledge sources. We showcase our pipeline and provide guidance on deploying the relevant tools by revisiting the emulation of the Clinical Outcomes of Surgical Therapy Study Group Trial on laparoscopy-assisted colectomy versus open colectomy in patients with early-stage colon cancer. We also draw on existing literature on EHR emulation of RCTs together with our own studies with the Mass General Brigham EHR.

    Clinical phenotypes and outcomes in children with multisystem inflammatory syndrome across SARS-CoV-2 variant eras: a multinational study from the 4CE consortium

    Summary: Background: Multisystem inflammatory syndrome in children (MIS-C) is a severe complication of SARS-CoV-2 infection. It remains unclear how MIS-C phenotypes vary across SARS-CoV-2 variants. We aimed to investigate clinical characteristics and outcomes of MIS-C across SARS-CoV-2 eras. Methods: We performed a multicentre observational retrospective study including seven paediatric hospitals in four countries (France, Spain, U.K., and U.S.). All consecutive confirmed patients with MIS-C hospitalised between February 1st, 2020, and May 31st, 2022, were included. Electronic Health Records (EHR) data were used to calculate pooled risk differences (RD) and effect sizes (ES) at site level, using Alpha as reference. Meta-analysis was used to pool data across sites. Findings: Of 598 patients with MIS-C (61% male, 39% female; mean age 9.7 years [SD 4.5]), 383 (64%) were admitted in the Alpha era, 111 (19%) in the Delta era, and 104 (17%) in the Omicron era. Compared with patients admitted in the Alpha era, those admitted in the Delta era were younger (ES −1.18 years [95% CI −2.05, −0.32]), had fewer respiratory symptoms (RD −0.15 [95% CI −0.33, −0.04]), less frequent non-cardiogenic shock or systemic inflammatory response syndrome (SIRS) (RD −0.35 [95% CI −0.64, −0.07]), lower lymphocyte count (ES −0.16 × 10^9/uL [95% CI −0.30, −0.01]), lower C-reactive protein (ES −28.5 mg/L [95% CI −46.3, −10.7]), and lower troponin (ES −0.14 ng/mL [95% CI −0.26, −0.03]). Patients admitted in the Omicron versus Alpha eras were younger (ES −1.6 years [95% CI −2.5, −0.8]), had less frequent SIRS (RD −0.18 [95% CI −0.30, −0.05]), lower lymphocyte count (ES −0.39 × 10^9/uL [95% CI −0.52, −0.25]), lower troponin (ES −0.16 ng/mL [95% CI −0.30, −0.01]) and less frequently received anticoagulation therapy (RD −0.19 [95% CI −0.37, −0.04]). Length of hospitalization was shorter in the Delta versus Alpha eras (−1.3 days [95% CI −2.3, −0.4]). 
Interpretation: Our study suggested that MIS-C clinical phenotypes varied across SARS-CoV-2 eras, with patients in Delta and Omicron eras being younger and less sick. EHR data can be effectively leveraged to identify rare complications of pandemic diseases and their variation over time. Funding: None
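
    The Methods above describe computing risk differences at site level and pooling them by meta-analysis. The abstract does not state which meta-analytic model was used, so the following is only a minimal sketch under assumptions: it uses a Wald standard error for each site-level risk difference and a fixed-effect inverse-variance pool. The function names and all counts are hypothetical and are not taken from the study.

```python
import math


def risk_difference(events_ref, n_ref, events_cmp, n_cmp):
    """Risk difference (comparison minus reference) with a Wald standard error."""
    p_ref = events_ref / n_ref
    p_cmp = events_cmp / n_cmp
    rd = p_cmp - p_ref
    se = math.sqrt(p_ref * (1 - p_ref) / n_ref + p_cmp * (1 - p_cmp) / n_cmp)
    return rd, se


def pool_fixed_effect(estimates, z=1.96):
    """Inverse-variance (fixed-effect) pooled estimate and 95% CI.

    `estimates` is a list of (estimate, standard_error) pairs, one per site.
    Assumption: a fixed-effect model; the study may have used another model.
    """
    weights = [1 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1 / total)
    return pooled, (pooled - z * pooled_se, pooled + z * pooled_se)


# Hypothetical per-site counts, NOT study data:
# (events in reference era, n in reference era, events in comparison era, n in comparison era)
sites = [
    (30, 60, 10, 40),
    (20, 50, 8, 30),
    (25, 70, 5, 25),
]

site_estimates = [risk_difference(*site) for site in sites]
pooled_rd, (ci_low, ci_high) = pool_fixed_effect(site_estimates)
print(f"Pooled RD {pooled_rd:.2f} [95% CI {ci_low:.2f}, {ci_high:.2f}]")
```

    Only site-level summaries (an estimate and its standard error) enter the pooling step, which fits a multi-site setting where patient-level records remain at each hospital.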