5 research outputs found

    Comparison of machine learning methods for estimating case fatality ratios: an Ebola outbreak simulation study

    Background: Machine learning (ML) algorithms are increasingly used in infectious disease epidemiology. Epidemiologists should understand how ML algorithms behave in the context of outbreak data, where missing data are almost ubiquitous. Methods: Using simulated data, we use an ML algorithmic framework to evaluate data imputation performance and the resulting case fatality ratio (CFR) estimates, focusing on the scale and type of data missingness, i.e., missing completely at random (MCAR), missing at random (MAR), or missing not at random (MNAR). Results: Across ML methods, dataset sizes and proportions of training data used, the area under the receiver operating characteristic curve decreased by 7% (median, range: 1%–16%) when missingness was increased from 10% to 40%. The overall reduction in CFR bias for MAR across methods, proportion of missingness, outbreak size and proportion of training data was 0.5% (median, range: 0%–11%). Conclusion: ML methods could reduce bias and increase the precision of CFR estimates at low levels of missingness. However, no method is robust to high percentages of missingness. Thus, a data-centric approach is recommended in outbreak settings: patient survival outcome data should be prioritised for collection, and random-sample follow-ups should be implemented to ascertain missing outcomes.
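    The effect of the missingness mechanism on a naive CFR (deaths divided by cases with a reported outcome) can be illustrated with a small simulation. This is a minimal sketch and not the study's framework: the function name, the true CFR of 0.6, and the MNAR rule (survivors' outcomes go unreported three times as often as deaths) are all assumptions chosen for illustration. MAR is omitted because it would need an observed covariate.

```python
import random


def simulate_naive_cfr(n=20000, true_cfr=0.6, missing=0.4,
                       mechanism="MCAR", seed=0):
    """Naive CFR from reported outcomes only (hypothetical helper)."""
    rng = random.Random(seed)
    deaths = 0
    reported = 0
    for _ in range(n):
        died = rng.random() < true_cfr
        if mechanism == "MCAR":
            # Missingness is independent of the outcome.
            p_missing = missing
        elif mechanism == "MNAR":
            # Assumption: missingness depends on the unobserved outcome;
            # survivors are lost to follow-up more often than deaths.
            p_missing = missing * 0.5 if died else min(1.0, missing * 1.5)
        else:
            raise ValueError("mechanism must be 'MCAR' or 'MNAR'")
        if rng.random() >= p_missing:
            reported += 1
            deaths += int(died)
    return deaths / reported


mcar_cfr = simulate_naive_cfr(mechanism="MCAR")
mnar_cfr = simulate_naive_cfr(mechanism="MNAR")
print(f"true CFR 0.60, naive MCAR {mcar_cfr:.3f}, naive MNAR {mnar_cfr:.3f}")
```

    Under these assumptions the MCAR estimate stays close to the true value, while the MNAR estimate is pushed upward because reported outcomes over-represent deaths.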

    Marburg virus disease outbreaks, mathematical models, and disease parameters: a systematic review

    The 2023 Marburg virus disease outbreaks in Equatorial Guinea and Tanzania highlighted the importance of better understanding this lethal pathogen. We did a systematic review (PROSPERO CRD42023393345) of peer-reviewed articles reporting historical outbreaks, modelling studies, and epidemiological parameters focused on Marburg virus disease. We searched PubMed and Web of Science from database inception to March 31, 2023. Two reviewers evaluated all titles and abstracts with consensus-based decision making. To ensure agreement, 13 (31%) of 42 studies were double-extracted and a custom-designed quality assessment questionnaire was used for risk of bias assessment. We present detailed information on 478 reported cases and 385 deaths from Marburg virus disease. Analysis of historical outbreaks and seroprevalence estimates suggests the possibility of undetected Marburg virus disease outbreaks, asymptomatic transmission, or cross-reactivity with other pathogens, or a combination of these. Only one study presented a mathematical model of Marburg virus transmission. We estimate an unadjusted, pooled total random effect case fatality ratio of 61·9% (95% CI 38·8–80·6; I²=93%). We identify epidemiological parameters relating to transmission and natural history, for which there are few estimates. This systematic review and the accompanying database provide a comprehensive overview of Marburg virus disease epidemiology and identify key knowledge gaps, contributing crucial information for mathematical models to support future Marburg virus disease epidemic responses.

    Subnational analysis and modelling of the Ebola epidemic in West Africa, 2013-2016: application of machine learning algorithms for case fatality imputation

    Background: The 2013-2016 West African Ebola epidemic has been the largest to date, with more than 11,000 deaths in the affected countries. The data collected have provided more insight into the case fatality ratio (CFR) and how it varies with age and other characteristics. However, the accuracy and precision of the naïve CFR remain limited because 44% of survival outcomes were unreported. Methods: Using a machine learning (MaLe) model, Boosted Regression Tree (BRT), I imputed survival outcomes (i.e. survival or death) when unreported and corrected for model imperfection, estimating the CFR without imputation, with imputation, and adjusted with imputation. I used semivariogram analysis and kriging to investigate subnational heterogeneities in CFR estimates. I used simulations to evaluate the performance of various MaLe inference methods for the estimation of CFR under different outbreak data scenarios. Results: The adjusted CFR estimates were 82.8% (95% CI 45.6%-85.6%) overall and 89.1% (95% CI 40.8%-91.6%), 65.6% (95% CI 61.3%-69.6%) and 79.2% (95% CI 45.4%-84.1%) for Sierra Leone, Guinea and Liberia, respectively. BRT modelling accounted for most of the spatiotemporal variation and interactions in CFR, but moderate spatial autocorrelation remained. Combining district-level CFR estimates and kriged district-level residuals provided the best linear unbiased map of CFR. Temporal autocorrelation was not observed in the district-level residuals from the BRT estimates. Finally, I observed that the performance of MaLe inference methods for CFR imputation varies under different outbreak data scenarios. Conclusions: Adjusted CFR estimates improved on the naïve CFR estimates obtained without imputation and were more representative. Used in conjunction with other resources, adjusted CFR estimates and the unbiased CFR maps will inform future public health response to Ebola outbreaks. I confirm that, across the board, data imputation with adjustment for the sensitivity and specificity of MaLe inference methods reduces the bias in CFR estimates.
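    One standard way to adjust a proportion obtained from an imperfect classifier for its sensitivity and specificity is the Rogan-Gladen estimator. The abstract does not state which correction the thesis uses, so the sketch below is an assumption for illustration only; the function name and the example values are hypothetical.

```python
def rogan_gladen(apparent, sensitivity, specificity):
    """Adjust an apparent proportion of predicted deaths for classifier error.

    apparent:    proportion of imputed outcomes classified as deaths
    sensitivity: P(classified death | true death)
    specificity: P(classified survival | true survival)
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("classifier must perform better than chance")
    adjusted = (apparent + specificity - 1.0) / denom
    # Clamp to [0, 1], since sampling noise can push the estimate outside.
    return min(1.0, max(0.0, adjusted))


# Illustrative values: if 70% of the imputed cases truly died and the
# classifier has sensitivity 0.9 and specificity 0.8, it labels
# 0.7 * 0.9 + 0.3 * 0.2 = 0.69 of them as deaths.
print(round(rogan_gladen(0.69, 0.9, 0.8), 3))  # 0.7
```

    The correction recovers the underlying proportion only when sensitivity and specificity are themselves well estimated, for example from held-out cases with known outcomes.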

    Case fatality imputation using machine learning

    Non UBC; Unreviewed; Author affiliation: SFU; Postdoctoral

    Ebola Virus Disease mathematical models and epidemiological parameters: a systematic review

    Ebola Virus Disease (EVD) poses a recurring risk to human health. We conducted a systematic review (PROSPERO CRD42023393345) of EVD transmission models and parameters published prior to 7th July 2023 from PubMed and Web of Science. Two people screened each abstract and full text. Papers were extracted using a bespoke Access database; 10% were double extracted. We extracted 1,280 parameters and 295 models from 522 papers. Basic reproduction number estimates were highly variable, as were effective reproduction numbers, likely reflecting spatiotemporal variability in interventions. Random effect estimates were 15.4 days (95% Confidence Interval (CI) 13.2-17.5) for the serial interval, 8.5 (95% CI 7.7-9.2) for the incubation period, 9.3 (95% CI 8.5-10.1) for the symptom-onset-to-death delay and 13.0 (95% CI 10.4-15.7) for symptom-onset-to-recovery. Common effect estimates were similar, albeit with narrower CIs. Case fatality ratio estimates were generally high but highly variable, which could reflect heterogeneity in underlying risk factors. While a significant body of literature exists on EVD models and epidemiological parameter estimates, many of these studies focus on the West African Ebola epidemic and are primarily associated with Zaire Ebola virus. This leaves a critical gap in our knowledge regarding other Ebola virus species and outbreak contexts.