37 research outputs found

    Development of artificial neural network models for paediatric critical illness in South Africa

    OBJECTIVES: Failures in identification, resuscitation and appropriate referral have been identified as significant contributors to avoidable severity of illness and mortality in South African children. In this study, artificial neural network (ANN) models were developed to predict a composite outcome of death before discharge from hospital or admission to the PICU. These models were compared in cross-validation to logistic regression and XGBoost models developed on the same data. DESIGN: Prospective, analytical cohort study. SETTING: A single-centre tertiary hospital in South Africa providing acute paediatric services. PATIENTS: Children under the age of 13 years presenting to the Paediatric Referral Area for acute consultations. OUTCOMES: Predictive models for a composite outcome of death before discharge from hospital or admission to the PICU. INTERVENTIONS: None. MEASUREMENTS AND MAIN RESULTS: 765 patients were included in the data set, with 116 instances (15.2%) of the study outcome. Models were developed on three sets of features: two derived from sequential floating feature selection (one inclusive, one parsimonious) and one from the Akaike information criterion, to yield 9 models. All developed models demonstrated good discrimination on cross-validation, with mean ROC AUCs greater than 0.8 and mean PRC AUCs greater than 0.53. ANN1, developed on the inclusive feature set, demonstrated the best discrimination, with a ROC AUC of 0.84 and a PRC AUC of 0.64. Model calibration was variable, with most models demonstrating weak calibration. Decision curve analysis demonstrated that all models were superior to baseline strategies, with ANN1 demonstrating the highest net benefit. CONCLUSIONS: All models demonstrated satisfactory performance, with the best performing model in cross-validation being an ANN model. Given the good performance of the less complex models, however, these should also be considered because they are easier to implement in practice.
An internal validation study is now being conducted to further assess performance with a view to external validation.
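
The decision curve analysis mentioned above compares models by net benefit. As a minimal sketch, assuming the standard net-benefit definition (true positives minus false positives weighted by the odds of the threshold probability, per patient), the calculation could look like the following. The function names and the toy data are hypothetical and are not taken from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on patients whose predicted risk >= threshold.

    Assumes the standard definition: TP/n - FP/n * (pt / (1 - pt)).
    """
    n = len(y_true)
    # Count true and false positives at the chosen risk threshold.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)  # odds of the threshold probability
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(y_true, threshold):
    """Baseline strategy: act on every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data: 1 = outcome occurred, 0 = it did not.
outcomes = [1, 0, 1, 0]
predicted_risk = [0.9, 0.2, 0.6, 0.7]

model_nb = net_benefit(outcomes, predicted_risk, threshold=0.5)
treat_all_nb = net_benefit_treat_all(outcomes, threshold=0.5)
treat_none_nb = 0.0  # acting on nobody yields zero net benefit by definition
```

A model is useful at a given threshold when its net benefit exceeds both baseline strategies (treat all and treat none) at that threshold.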

    Elicitation of domain knowledge for a machine learning model for paediatric critical illness in South Africa

    OBJECTIVES: Delays in identification, resuscitation and referral have been identified as a preventable cause of severe illness and mortality in South African children. To address this problem, a machine learning model was developed to predict a compound outcome of death prior to discharge from hospital and/or admission to the PICU. A key aspect of developing machine learning models is the integration of human knowledge in their development. The objective of this study is to describe how this domain knowledge was elicited, including the use of a documented literature search and Delphi procedure. DESIGN: A prospective mixed methodology development study was conducted that included qualitative aspects in the elicitation of domain knowledge, together with descriptive and analytical quantitative and machine learning methodologies. SETTING: A single-centre tertiary hospital providing acute paediatric services. PARTICIPANTS: Three paediatric intensivists, six specialist paediatricians and three specialist anaesthesiologists. INTERVENTIONS: None. MEASUREMENTS AND MAIN RESULTS: The literature search identified 154 full-text articles reporting risk factors for mortality in hospitalised children. These factors were most commonly features of specific organ dysfunction. 89 of these publications studied children in lower- and middle-income countries. The Delphi procedure included 12 expert participants and was conducted over 3 rounds. Respondents identified a need to achieve a compromise among model performance, comprehensiveness, veracity and practicality of use. Participants achieved consensus on a range of clinical features associated with severe illness in children. No special investigations were considered for inclusion in the model except point-of-care capillary blood glucose testing. The results were integrated by the researcher and a final list of features was compiled.
CONCLUSION: The elicitation of domain knowledge is important in effective machine learning applications. The documentation of this process enhances rigour in such models and should be reported in publications. A documented literature search, Delphi procedure and the integration of the domain knowledge of the researchers contributed to problem specification and selection of features prior to feature engineering, pre-processing and model development.

    Knowledge and perceptions of brucellosis in the pastoral communities adjacent to Lake Mburo National Park, Uganda

    BACKGROUND: Brucellosis is one of the most common zoonotic infections globally. Lack of knowledge about brucellosis may affect the health-seeking behavior of patients, thus leading to sustained transmission in these communities. Our study assessed knowledge and perceptions of brucellosis among pastoral communities adjacent to Lake Mburo National Park (LMNP), Kiruhura District, Uganda. METHODS: In a community cross-sectional questionnaire survey, 371 randomly selected household heads from three sub-counties neighboring LMNP were interviewed between June and August 2012. Data collected included communities’ knowledge on causes, symptoms, transmission, treatment, prevention and risk factors of brucellosis. Multivariable logistic regression analysis was performed to explore the strength of association between overall knowledge of brucellosis and various individual factors using odds ratios and 95% confidence intervals. RESULTS: Only 70 (19%) knew the symptoms of brucellosis in animals, and three quarters (279, 75.5%) mentioned joint and muscle pain as a common symptom in humans. Almost all participants (370, 99.3%) had heard of brucellosis, the majority (311, 84.7%) believed it affects all sexes, and two thirds (67.7%) of the respondents believed close proximity to wildlife contributes to the presence of the disease. Almost all (352, 95.4%) knew that brucellosis in humans is treatable using modern drugs. The main routes of infection were widely known: consumption of unpasteurized dairy products (360/371, 97%) and eating half-cooked meat (91.4%) in humans, and eating contaminated pasture in animals (97.4%). Overall knowledge of brucellosis was moderate (197, 53.1%).
Agro-pastoralists had higher overall knowledge than pure pastoralists (aOR: 2.08, 95% CI: 1.17-3.71), while those who reported that the disease was a health problem were less likely to be knowledgeable than those who said it was not (aOR: 0.18, 95% CI: 0.06-0.56). CONCLUSIONS: There was moderate overall knowledge of human and animal brucellosis among the participants. The majority of the participants believed that close proximity to wildlife contributes to the presence of the disease in the area. There is a need for collaboration between the public health, veterinary and wildlife sectors to provide health education on brucellosis for better management of the disease in the communities.

    Systematic review of statistically-derived models of immunological response in HIV-infected adults on antiretroviral therapy in Sub-Saharan Africa

    CITATION: Sempa, J. B., Ujeneza, E. L. & Niewoudt, M. 2017. Systematic review of statistically-derived models of immunological response in HIV-infected adults on antiretroviral therapy in Sub-Saharan Africa. PLoS ONE, 12(2):e0171658, doi:10.1371/journal.pone.0171658. The original publication is available at http://journals.plos.org/plosone
    Introduction: In Sub-Saharan African (SSA) resource limited settings, Cluster of Differentiation 4 (CD4) counts continue to be used for clinical decision making in antiretroviral therapy (ART). Here, HIV-infected people often remain with CD4 counts <350 cells/μL even after 5 years of viral load suppression. Ongoing immunological monitoring is necessary. Due to varying statistical modeling methods, comparing immune response to ART across different cohorts is difficult. We systematically review such models and detail the similarities, differences and problems. Methods: ‘Preferred Reporting Items for Systematic Review and Meta-Analyses’ guidelines were used. Only studies of immune-response after ART initiation from SSA in adults were included. Data was extracted from each study and tabulated. Outcomes were categorized into 3 groups: ‘slope’, ‘survival’, and ‘asymptote’ models. Wordclouds were drawn wherein the frequency of variables occurring in the reviewed models is indicated by their size and color. Results: 69 covariates were identified in the final models of 35 studies. Effect sizes of covariates were not directly quantitatively comparable in view of the combination of differing variables and scale transformation methods across models. Wordclouds enabled the identification of qualitative and semi-quantitative covariate sets for each outcome category. Comparison across categories identified sex, baseline age, baseline log viral load, baseline CD4, ART initiation regimen and ART duration as a minimal consensus set.
Conclusion: Most models were different with respect to covariates included, variable transformations and scales, model assumptions, modelling strategies and reporting methods, even for the same outcomes. To enable comparison across cohorts, statistical models would benefit from the application of more uniform modelling techniques. Historic efforts have produced results that are anecdotal to individual cohorts only. This study was able to define ‘prior’ knowledge in the Bayesian sense. Such information has value for prospective modelling efforts. http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0171658 Publisher's version

    Cumulative viral load as a predictor of CD4+ T-cell response to antiretroviral therapy using Bayesian statistical models

    INTRODUCTION: There are challenges in statistically modelling immune responses to longitudinal HIV viral load exposure as a function of covariates. We define Bayesian Markov Chain Monte Carlo mixed effects models to incorporate priors and examine the effect of different distributional assumptions. We prospectively fit these models to as-yet-unpublished data from the Tshwane District Hospital HIV treatment clinic in South Africa, to determine if cumulative log viral load, an indicator of long-term viral exposure, is a valid predictor of immune response. METHODS: Models are defined to express 'slope', i.e. mean annual increase in CD4 counts, and 'asymptote', i.e. the odds of having a CD4 count ≥500 cells/μL during antiretroviral treatment, as a function of covariates and random effects. We compare the effect of using informative versus non-informative prior distributions on model parameters. Models with cubic splines or skew-normal distributions are also compared using the conditional Deviance Information Criterion. RESULTS: The data of 750 patients are analyzed. Overall, models adjusting for cumulative log viral load provide a significantly better fit than those that do not. An increase in cumulative log viral load is associated with a decrease in CD4 count slope (19.6 cells/μL (95% credible interval: 28.26, 10.93)) and a reduction in the odds of achieving a CD4 count ≥500 cells/μL (0.42 (95% CI: 0.236, 0.730)) during 5 years of therapy. Using informative priors improves the cumulative log viral load estimate, and a skew-normal distribution for the random-intercept and measurement error results in a better fit compared to using classical Gaussian distributions. DISCUSSION: We demonstrate in an unpublished South African cohort that cumulative log viral load is a strong and significant predictor of both CD4 count slope and asymptote.
We argue that Bayesian methods should be used more frequently for such data, given their flexibility to incorporate prior information and non-Gaussian distributions. Status: published
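
Cumulative log viral load is described above only as an indicator of long-term viral exposure. One common way to compute such a quantity, assumed here and not confirmed as the authors' method, is the area under the log10 viral load curve over time, approximated with the trapezoidal rule. The function name and the example measurements below are hypothetical.

```python
import math


def cumulative_log_viral_load(times_years, viral_loads):
    """Area under the log10 viral load curve (trapezoidal rule).

    times_years: measurement times in years, in increasing order.
    viral_loads: viral load in copies/mL at each time (must be > 0).
    Returns the cumulative exposure in log10 copy-years/mL.
    """
    if len(times_years) != len(viral_loads):
        raise ValueError("times and viral loads must have the same length")
    logs = [math.log10(v) for v in viral_loads]
    total = 0.0
    for i in range(1, len(times_years)):
        dt = times_years[i] - times_years[i - 1]
        # Average of adjacent log10 values times the interval length.
        total += 0.5 * (logs[i] + logs[i - 1]) * dt
    return total


# Hypothetical patient: viral load falls after starting therapy.
exposure = cumulative_log_viral_load([0.0, 0.5, 1.0], [100000, 1000, 100])
```

In a regression of CD4 count slope or asymptote, a value like `exposure` would enter as one covariate per patient; how undetectable measurements are handled is a separate modelling choice not covered by this sketch.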

    Cumulative viral load as a predictor of CD4+ T-cell response to antiretroviral therapy using Bayesian statistical models (vol 14, e0224723, 2019)

    [This corrects the article DOI: 10.1371/journal.pone.0224723.] Status: published

    Cumulative viral load as a predictor of CD4+ T-cell response to antiretroviral therapy using Bayesian statistical models

    CITATION: Sempa, J. B. et al. 2019. Cumulative viral load as a predictor of CD4+ T-cell response to antiretroviral therapy using Bayesian statistical models. PLoS ONE, 14(11): e0224723, doi:10.1371/journal.pone.0224723. The original publication is available at https://journals.plos.org/plosone
    Introduction: There are challenges in statistically modelling immune responses to longitudinal HIV viral load exposure as a function of covariates. We define Bayesian Markov Chain Monte Carlo mixed effects models to incorporate priors and examine the effect of different distributional assumptions. We prospectively fit these models to as-yet-unpublished data from the Tshwane District Hospital HIV treatment clinic in South Africa, to determine if cumulative log viral load, an indicator of long-term viral exposure, is a valid predictor of immune response. Methods: Models are defined to express ‘slope’, i.e. mean annual increase in CD4 counts, and ‘asymptote’, i.e. the odds of having a CD4 count ≥500 cells/μL during antiretroviral treatment, as a function of covariates and random effects. We compare the effect of using informative versus non-informative prior distributions on model parameters. Models with cubic splines or skew-normal distributions are also compared using the conditional Deviance Information Criterion. Results: The data of 750 patients are analyzed. Overall, models adjusting for cumulative log viral load provide a significantly better fit than those that do not. An increase in cumulative log viral load is associated with a decrease in CD4 count slope (19.6 cells/μL (95% credible interval: 28.26, 10.93)) and a reduction in the odds of achieving a CD4 count ≥500 cells/μL (0.42 (95% CI: 0.236, 0.730)) during 5 years of therapy. Using informative priors improves the cumulative log viral load estimate, and a skew-normal distribution for the random-intercept and measurement error results in a better fit compared to using classical Gaussian distributions.
Discussion: We demonstrate in an unpublished South African cohort that cumulative log viral load is a strong and significant predictor of both CD4 count slope and asymptote. We argue that Bayesian methods should be used more frequently for such data, given their flexibility to incorporate prior information and non-Gaussian distributions. South African Department of Science and Technology; National Research Foundation; Stellenbosch University. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0224723 Publisher's version

    Systematic review of statistically-derived models of immunological response in HIV-infected adults on antiretroviral therapy in Sub-Saharan Africa

    Introduction: In Sub-Saharan African (SSA) resource limited settings, Cluster of Differentiation 4 (CD4) counts continue to be used for clinical decision making in antiretroviral therapy (ART). Here, HIV-infected people often remain with CD4 counts <350 cells/μL even after 5 years of viral load suppression. Ongoing immunological monitoring is necessary. Due to varying statistical modeling methods, comparing immune response to ART across different cohorts is difficult. We systematically review such models and detail the similarities, differences and problems. Methods: ‘Preferred Reporting Items for Systematic Review and Meta-Analyses’ guidelines were used. Only studies of immune-response after ART initiation from SSA in adults were included. Data was extracted from each study and tabulated. Outcomes were categorized into 3 groups: ‘slope’, ‘survival’, and ‘asymptote’ models. Wordclouds were drawn wherein the frequency of variables occurring in the reviewed models is indicated by their size and color. Results: 69 covariates were identified in the final models of 35 studies. Effect sizes of covariates were not directly quantitatively comparable in view of the combination of differing variables and scale transformation methods across models. Wordclouds enabled the identification of qualitative and semi-quantitative covariate sets for each outcome category. Comparison across categories identified sex, baseline age, baseline log viral load, baseline CD4, ART initiation regimen and ART duration as a minimal consensus set. Conclusion: Most models were different with respect to covariates included, variable transformations and scales, model assumptions, modelling strategies and reporting methods, even for the same outcomes. To enable comparison across cohorts, statistical models would benefit from the application of more uniform modelling techniques.
Historic efforts have produced results that are anecdotal to individual cohorts only. This study was able to define ‘prior’ knowledge in the Bayesian sense. Such information has value for prospective modelling efforts.