
    Data quality considerations for evaluating COVID-19 treatments using real world data: learnings from the National COVID Cohort Collaborative (N3C)

    Background: Multi-institution electronic health records (EHR) are a rich source of real world data (RWD) for generating real world evidence (RWE) regarding the utilization, benefits and harms of medical interventions. They provide access to clinical data from large pooled patient populations in addition to laboratory measurements unavailable in insurance claims-based data. However, secondary use of these data for research requires specialized knowledge and careful evaluation of data quality and completeness. We discuss data quality assessments undertaken during the conduct of prep-to-research, focusing on the investigation of treatment safety and effectiveness. Methods: Using the National COVID Cohort Collaborative (N3C) enclave, we defined a patient population using criteria typical in non-interventional inpatient drug effectiveness studies. We present the challenges encountered when constructing this dataset, beginning with an examination of data quality across data partners. We then discuss the methods and best practices used to operationalize several important study elements: exposure to treatment, baseline health comorbidities, and key outcomes of interest. Results: We share our experiences and lessons learned when working with heterogeneous EHR data from over 65 healthcare institutions and 4 common data models. We discuss six key areas of data variability and quality. (1) The specific EHR data elements captured from a site can vary depending on source data model and practice. (2) Data missingness remains a significant issue. (3) Drug exposures can be recorded at different levels and may not contain route of administration or dosage information. (4) Reconstruction of continuous drug exposure intervals may not always be possible. (5) EHR discontinuity is a major concern for capturing history of prior treatment and comorbidities. Lastly, (6) access to EHR data alone limits the potential outcomes which can be used in studies. 
Conclusions: The creation of large scale centralized multi-site EHR databases such as N3C enables a wide range of research aimed at better understanding treatments and health impacts of many conditions including COVID-19. As with all observational research, it is important that research teams engage with appropriate domain experts to understand the data in order to define research questions that are both clinically important and feasible to address using these real world data.

    Characteristics, Outcomes, and Severity Risk Factors Associated with SARS-CoV-2 Infection among Children in the US National COVID Cohort Collaborative

    Importance: Understanding of SARS-CoV-2 infection in US children has been limited by the lack of large, multicenter studies with granular data. Objective: To examine the characteristics, changes over time, outcomes, and severity risk factors of children with SARS-CoV-2 within the National COVID Cohort Collaborative (N3C). Design, Setting, and Participants: A prospective cohort study of encounters with end dates before September 24, 2021, was conducted at 56 N3C facilities throughout the US. Participants included children younger than 19 years at initial SARS-CoV-2 testing. Main Outcomes and Measures: Case incidence and severity over time, demographic and comorbidity severity risk factors, vital sign and laboratory trajectories, clinical outcomes, and acute COVID-19 vs multisystem inflammatory syndrome in children (MIS-C), and Delta vs pre-Delta variant differences for children with SARS-CoV-2. Results: A total of 1,068,410 children were tested for SARS-CoV-2 and 167,262 test results (15.6%) were positive (82,882 [49.6%] girls; median age, 11.9 [IQR, 6.0-16.1] years). Among the 10,245 children (6.1%) who were hospitalized, 1423 (13.9%) met the criteria for severe disease: mechanical ventilation (796 [7.8%]), vasopressor-inotropic support (868 [8.5%]), extracorporeal membrane oxygenation (42 [0.4%]), or death (131 [1.3%]). Male sex (odds ratio [OR], 1.37; 95% CI, 1.21-1.56), Black/African American race (OR, 1.25; 95% CI, 1.06-1.47), obesity (OR, 1.19; 95% CI, 1.01-1.41), and several pediatric complex chronic condition (PCCC) subcategories were associated with higher severity disease. Vital signs and many laboratory test values from the day of admission were predictive of peak disease severity.
Variables associated with increased odds for MIS-C vs acute COVID-19 included male sex (OR, 1.59; 95% CI, 1.33-1.90), Black/African American race (OR, 1.44; 95% CI, 1.17-1.77), younger than 12 years (OR, 1.81; 95% CI, 1.51-2.18), obesity (OR, 1.76; 95% CI, 1.40-2.22), and not having a pediatric complex chronic condition (OR, 0.72; 95% CI, 0.65-0.80). The children with MIS-C had a more inflammatory laboratory profile and severe clinical phenotype, with higher rates of invasive ventilation (117 of 707 [16.5%] vs 514 of 8241 [6.2%]; P <.001) and need for vasoactive-inotropic support (191 of 707 [27.0%] vs 426 of 8241 [5.2%]; P <.001) compared with those who had acute COVID-19. Comparing children during the Delta vs pre-Delta eras, there was no significant change in hospitalization rate (1738 [6.0%] vs 8507 [6.2%]; P =.18) and lower odds for severe disease (179 [10.3%] vs 1242 [14.6%]) (decreased by a factor of 0.67; 95% CI, 0.57-0.79; P <.001). Conclusions and Relevance: In this cohort study of US children with SARS-CoV-2, there were observed differences in demographic characteristics, preexisting comorbidities, and initial vital sign and laboratory values between severity subgroups. Taken together, these results suggest that early identification of children likely to progress to severe disease could be achieved using readily available data elements from the day of admission. Further work is needed to translate this knowledge into improved outcomes.
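
The Delta vs pre-Delta comparison above can be checked with a short calculation. The sketch below computes an unadjusted odds ratio with a Wald 95% confidence interval from the hospitalization and severe-disease counts quoted in the abstract. The function name `odds_ratio_wald` is hypothetical, and the unadjusted Wald approach is an assumption made for illustration; the abstract does not state which method or adjustment the authors used.

```python
import math


def odds_ratio_wald(events_a, total_a, events_b, total_b, z=1.959964):
    """Unadjusted odds ratio of group A vs group B with a Wald interval.

    Illustrative helper (not from the paper). z defaults to the normal
    quantile for a two-sided 95% interval.
    """
    non_a = total_a - events_a  # group A members without the event
    non_b = total_b - events_b  # group B members without the event
    odds_ratio = (events_a / non_a) / (events_b / non_b)
    # Standard error of log(OR) from the four cells of the 2x2 table.
    se = math.sqrt(1 / events_a + 1 / non_a + 1 / events_b + 1 / non_b)
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se), math.exp(log_or + z * se)


# Counts quoted in the abstract: severe cases among hospitalized children,
# Delta era (179 of 1738) vs pre-Delta era (1242 of 8507).
or_delta, ci_low, ci_high = odds_ratio_wald(179, 1738, 1242, 8507)
print(f"OR = {or_delta:.2f} (95% CI, {ci_low:.2f}-{ci_high:.2f})")
```

Under these assumptions the result agrees, to two decimals, with the reported factor of 0.67 (95% CI, 0.57-0.79), which suggests the quoted figure is consistent with a simple 2x2 comparison of the stated counts.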