9 research outputs found

    Bacterial whole genome-based phylogeny: construction of a new benchmarking dataset and assessment of some existing methods

    Background: Whole genome sequencing (WGS) is increasingly used in diagnostics and surveillance of infectious diseases. A major application for WGS is to use the data for identifying outbreak clusters, and there is therefore a need for methods that can accurately and efficiently infer phylogenies from sequencing reads. In the present study we describe a new dataset that we have created for the purpose of benchmarking such WGS-based methods for epidemiological data, and also present an analysis where we use the data to compare the performance of some current methods. Results: Our aim was to create a benchmark data set that mimics sequencing data of the sort that might be collected during an outbreak of an infectious disease. This was achieved by letting an E. coli hypermutator strain grow in the lab for 8 consecutive days, each day splitting the culture in two while also collecting samples for sequencing. The result is a data set consisting of 101 whole genome sequences with known phylogenetic relationship. Among the sequenced samples 51 correspond to internal nodes in the phylogeny because they are ancestral, while the remaining 50 correspond to leaves. We also used the newly created data set to compare three different methods, available online, that infer phylogenies from whole-genome sequencing reads: NDtree, CSI Phylogeny and REALPHY. One complication when comparing the output of these methods with the known phylogeny is that phylogenetic methods typically build trees where all observed sequences are placed as leaves, even though some of them are in fact ancestral. We therefore devised a method for post-processing the inferred trees by collapsing short branches (thus relocating some leaves to internal nodes), and also present two new measures of tree similarity that take into account the identity of both internal and leaf nodes. Conclusions: Based on this analysis we find that, among the investigated methods, CSI Phylogeny had the best performance, correctly identifying 73% of all branches in the tree and 71% of all clades. We have made all data from this experiment (raw sequencing reads, consensus whole-genome sequences, as well as descriptions of the known phylogeny in a variety of formats) publicly available, with the hope that other groups may find this data useful for benchmarking and exploring the performance of epidemiological methods. All data is freely available at: https://cge.cbs.dtu.dk/services/evolution_data.php
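
    The post-processing idea in the abstract above (collapsing short branches so that some leaves become internal nodes) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Node` class, the `collapse_short_leaves` function, the threshold value, and the rule that only an unlabeled parent can absorb a leaf are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    # Hypothetical tree node; not taken from the paper.
    name: Optional[str] = None   # sample label, None for unlabeled internal nodes
    length: float = 0.0          # length of the branch leading to this node
    children: List["Node"] = field(default_factory=list)


def collapse_short_leaves(node: Node, threshold: float) -> Node:
    """Relocate a leaf's label to its parent when the leaf's branch is shorter
    than `threshold` (assumed rule: the parent must still be unlabeled)."""
    kept = []
    for child in node.children:
        collapse_short_leaves(child, threshold)  # process the subtree first
        is_leaf = not child.children
        if is_leaf and child.length < threshold and node.name is None:
            node.name = child.name  # treat the sample as the ancestor itself
        else:
            kept.append(child)
    node.children = kept
    return node


# Toy tree: sample A sits on a zero-length branch, so it is really the ancestor.
root = Node(children=[
    Node("A", 0.0),
    Node(None, 5.0, [Node("B", 3.0), Node("C", 4.0)]),
])
collapse_short_leaves(root, threshold=0.5)
print(root.name, [c.name for c in root.children[0].children])  # prints: A ['B', 'C']
```

    In this toy case sample A moves from a leaf to the root, while B and C stay as leaves because their branches exceed the assumed threshold.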

    An interactive database for the investigation of high-density peptide microarray guided interaction patterns and antivenom cross-reactivity

    Snakebite envenoming is a major neglected tropical disease that affects millions of people every year. The only effective treatment against snakebite envenoming consists of unspecified cocktails of polyclonal antibodies purified from the plasma of immunized production animals. Currently, little data exists on the molecular interactions between venom-toxin epitopes and antivenom-antibody paratopes. To address this issue, high-density peptide microarray (hdpm) technology has recently been adapted to the field of toxinology. However, analysis of such valuable datasets requires expert understanding and, thus, complicates its broad application within the field. In the present study, we developed a user-friendly, high-throughput web application named “Snake Toxin and Antivenom Binding Profiles” (STAB Profiles), to allow straightforward analysis of hdpm datasets. To test our tool and evaluate its performance with a large dataset, we conducted hdpm assays using all African snake toxin protein sequences available in the UniProt database at the time of study design, together with eight commercial antivenoms in clinical use in Africa, thus representing the largest venom-antivenom dataset to date. Furthermore, we introduced a novel method for evaluating raw signals from a peptide microarray experiment and a data normalization protocol enabling intra-microarray and even inter-microarray chip comparisons. Finally, these data, alongside all the data from previous similar studies by Engmark et al., were preprocessed according to our newly developed protocol and made publicly available for download through the STAB Profiles web application (http://tropicalpharmacology.com/tools/stab-profiles/). With these data and our tool, we were able to gain key insights into toxin-antivenom interactions and were able to differentiate the ability of different antivenoms to interact with certain toxins of interest. The data, as well as the web application, we present in this article should be of significant value to the venom-antivenom research community. Knowledge gained from our current and future analyses of this dataset carries the potential to guide the improvement and optimization of current antivenoms for maximum patient benefit, as well as aid the development of next-generation antivenoms.

    Estimating pesticides in public drinking water at the household level in Denmark

    Pesticide pollution has raised public concern in Denmark due to potential negative health impacts and frequent findings of new substances after a recent expansion of the groundwater monitoring programme. Danish drinking water comes entirely from groundwater. Both the raw groundwater and the treated drinking water are regularly monitored, and the chemical analyses are reported to a publicly available national database (Jupiter). Based on these data, in this study we (1) provide a status of pesticide content in drinking water supplied by public waterworks in Denmark and (2) assess the proportion of Danish households exposed to pesticides from drinking water. ‘Pesticides’ here refers also to their metabolites, degradation and reaction products. The cleaned dataset represents 3004 public waterworks distributed throughout the country and includes 39 798 samples of treated drinking water analysed for 449 pesticides (971 723 analyses total) for the period 2002–2019. Of all these chemical analyses, 0.5% (n = 4925) contained a quantified pesticide (>0.03 μg/l). Pesticides were found at least once in the treated drinking water at 29% of all sampled public waterworks for the period 2002–2019 and at 21% of the waterworks for the recent period 2015–2019. We estimate that 56% of all Danish households were potentially exposed at least once to pesticides in drinking water at concentrations of 0.03–4.00 μg/l between 2002 and 2019. However, in 2015–2019, the proportion of the Danish households exposed to pesticides (0.03–4.00 μg/l) was 41%. The proportion of Danish households potentially exposed at least once to pesticides above the maximum allowed concentration (0.1 μg/l) according to the EU Drinking Water Directive (and the Danish drinking water standard) was 19% for 2002–2019 and 11% for 2015–2019. However, the maximum concentrations were lower than the World Health Organization’s compound-specific guidelines. 
Lastly, we explore data complexity and discuss the limitations imposed by data heterogeneity to facilitate future epidemiological studies.

    Geographical Distribution and Pattern of Pesticides in Danish Drinking Water 2002–2018: Reducing Data Complexity

    Pesticides are a large and heterogeneous group of chemicals with a complex geographic distribution in the environment. The purpose of this study was to explore the geographic distribution of pesticides in Danish drinking water and identify potential patterns in the grouping of pesticides. Our data included 899,169 analyses of 167 pesticides and metabolites, of which 55 were identified above the detection limit. Pesticide patterns were defined by (1) pesticide groups based on chemical structure and pesticide–metabolite relations and (2) an exploratory factor analysis identifying underlying patterns of related pesticides within waterworks. The geographic distribution was evaluated by mapping the pesticide categories for groups and factor components, namely those detected, quantified, above quality standards, and not analysed. We identified five and seven factor components for the periods 2002–2011 and 2012–2018, respectively. In total, 16 pesticide groups were identified, of which six were representative in space and time with regard to the number of waterworks and analyses, namely benzothiazinone, benzonitriles, organophosphates, phenoxy herbicides, triazines, and triazinones. Pesticide mapping identified areas where multiple pesticides were detected, indicating areas with a higher pesticide burden. The results contribute to a better understanding of the pesticide pattern in Danish drinking water and may contribute to exposure assessments for future epidemiological studies.

    Identifying Risk of Adverse Outcomes in COVID-19 Patients via Artificial Intelligence-Powered Analysis of 12-Lead Intake Electrocardiogram.

    Background: Adverse events in COVID-19 are difficult to predict. Risk stratification is encumbered by the need to protect healthcare workers. We hypothesize that AI can help identify subtle signs of myocardial involvement in the 12-lead electrocardiogram (ECG), which could help predict complications. Objective: Use intake ECGs from COVID-19 patients to train AI models to predict risk of mortality or major adverse cardiovascular events (MACE). Methods: We studied intake ECGs from 1448 COVID-19 patients (60.5% male, 63.4±16.9 years). Records were labeled by mortality (death vs. discharge) or MACE (no events vs. arrhythmic, heart failure [HF], or thromboembolic [TE] events), then used to train AI models; these were compared to conventional regression models developed using demographic and comorbidity data. Results: 245 (17.7%) patients died (67.3% male, 74.5±14.4 years); 352 (24.4%) experienced at least one MACE (119 arrhythmic; 107 HF; 130 TE). AI models predicted mortality and MACE with area under the curve (AUC) values of 0.60±0.05 and 0.55±0.07, respectively; these were comparable to AUC values for conventional models (0.73±0.07 and 0.65±0.10). There were no prominent temporal trends in mortality rate or MACE incidence in our cohort; holdout testing with data from after a cutoff date (June 9, 2020) did not degrade model performance. Conclusion: Using intake ECGs alone, our AI models had limited ability to predict hospitalized COVID-19 patients' risk of mortality or MACE. Our models' accuracy was comparable to that of conventional models built using more in-depth information, but translation to clinical use would require higher sensitivity and positive predictive value. In the future, we hope that mixed-input AI models utilizing both ECG and clinical data may be developed to enhance predictive accuracy.