6 research outputs found

    Method selection and adaptation for distributed monitoring of infectious diseases for syndromic surveillance

    Background: Automated surveillance systems require statistical methods to recognize increases in visit counts that might indicate an outbreak. In prior work we presented methods to enhance the sensitivity of C2, a commonly used time series method. In this study, we compared the enhanced C2 method with five regression models.
    Methods: We used emergency department chief complaint data from the US CDC BioSense surveillance system, aggregated by city (206 hospitals across 16 cities), during 5/2008–4/2009. Data for six syndromes (asthma, gastrointestinal, nausea and vomiting, rash, respiratory, and influenza-like illness) were stratified by mean daily count (1–19, 20–49, ⩾50) into 14 syndrome-count categories. We compared the sensitivity for detecting single-day, artificially added increases in syndrome counts. Four modifications of the C2 time series method and five regression models (two linear and three Poisson) were tested, all at a constant alert rate of 1%.
    Results: Among the regression models tested, a Poisson model controlling for the logarithm of total visits (i.e., visits both meeting and not meeting a syndrome definition), day of week, and 14-day time period performed best. Across the 14 syndrome-count categories, the time series and regression methods produced approximately the same sensitivity (<5% difference) in six; in six categories the regression method had higher sensitivity (6–14% improvement), and in two categories the time series method had higher sensitivity.
    Discussion: When automated data are aggregated to the city level, a Poisson regression model that controls for total visits produces the best overall sensitivity for detecting artificially added visit counts. This improvement was achieved without increasing the alert rate, which was held constant at 1% for all methods. These findings will improve our ability to detect outbreaks in automated surveillance system data.
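    The abstract contrasts the C2 time-series method, a moving-baseline z-score, with Poisson regression models that adjust for total visit volume. As a rough illustration of the two approaches (a minimal sketch, not the authors' implementation; the 7-day baseline, 2-day guard band, covariate coding, and residual-quantile alerting rule are all assumptions), the Python snippet below computes a C2-style statistic for each day and fits a Poisson model of the daily syndrome count on the logarithm of total visits, day of week, and a 14-day period index, flagging roughly 1% of days.

        # Sketch only: C2-style statistic and a Poisson-regression alert rule.
        # Baseline length, guard band, covariate coding, and alerting rule are assumptions.
        import numpy as np
        import pandas as pd
        import statsmodels.api as sm

        def c2_statistic(counts, baseline=7, guard=2):
            """C2-style z-score of today's count against a lagged moving baseline."""
            counts = np.asarray(counts, dtype=float)
            stats = np.full(counts.shape, np.nan)
            for t in range(baseline + guard, len(counts)):
                ref = counts[t - baseline - guard : t - guard]          # 7-day baseline, 2-day guard band
                mu, sd = ref.mean(), max(ref.std(ddof=1), 0.5)          # floor sd to avoid division by ~0
                stats[t] = (counts[t] - mu) / sd
            return stats

        def poisson_alerts(df, alert_rate=0.01):
            """Poisson GLM of syndrome count on log(total visits), day of week, 14-day period."""
            X = pd.DataFrame({
                "log_total": np.log(df["total_visits"].to_numpy(dtype=float)),
                "dow": df["date"].dt.dayofweek,
                "period14": np.arange(len(df)) // 14,
            })
            X = sm.add_constant(pd.get_dummies(X, columns=["dow", "period14"], drop_first=True).astype(float))
            fit = sm.GLM(df["syndrome_count"].to_numpy(dtype=float), X, family=sm.families.Poisson()).fit()
            cutoff = np.quantile(fit.resid_pearson, 1.0 - alert_rate)   # empirical ~1% alert rate
            return fit.resid_pearson > cutoff

    Here alerts are derived from in-sample Pearson residuals purely for illustration; in an operational system the model would be fit on baseline data and applied prospectively.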

    Multivariate syndromic surveillance for cattle diseases: Epidemic simulation and algorithm performance evaluation.

    Multivariate Syndromic Surveillance (SyS) systems that simultaneously assess and combine information from different data sources are especially useful for strengthening surveillance systems for the early detection of infectious disease epidemics. Despite the strong motivation for implementing multivariate SyS and the numerous methods reported in the literature, the number of operational multivariate SyS systems in veterinary medicine is still very small. One possible reason is that assessing the performance of such surveillance systems remains challenging because field epidemic data are often unavailable. The objective of this study is to demonstrate a practical multivariate event detection method (directionally sensitive multivariate control charts) that can be easily applied in livestock disease SyS, using syndrome time series data from the Swiss cattle population as an example. We present a standardized method for simulating multivariate epidemics of different diseases, using four diseases as examples: Bovine Virus Diarrhea (BVD), Infectious Bovine Rhinotracheitis (IBR), Bluetongue virus (BTV) and Schmallenberg virus (SV). Two directional multivariate control chart algorithms, the Multivariate Exponentially Weighted Moving Average (MEWMA) and the Multivariate Cumulative Sum (MCUSUM), were compared. The two algorithms were evaluated using 12 syndrome time series extracted from two Swiss national databases. Both algorithms were able to detect all simulated epidemics around 4.5 months after the start of the epidemic, with a specificity of 95%; however, the results varied depending on the algorithm and the disease. The MEWMA algorithm always detected epidemics earlier than the MCUSUM, and epidemics of IBR and SV were detected earlier than epidemics of BVD and BTV. Our results show that the two directional multivariate control charts are promising methods for combining information from multiple time series for early detection of subtle changes in time series from a population without producing an unreasonable number of false alarms. The approach we used for simulating multivariate epidemics is relatively easy to implement and could be used in other situations where real epidemic data are unavailable. We believe that our study results can support the implementation and assessment of multivariate SyS systems in animal health.
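    As an illustration of the directional multivariate control-chart idea described above (a minimal sketch under assumed settings: the smoothing weight, the baseline mean and covariance estimated from the series itself, the zero-truncation rule for directionality, and the alarm threshold are assumptions, not the study's configuration), the snippet below runs an increase-only MEWMA over a matrix of daily syndrome counts.

        # Sketch only: a one-sided (increase-only) MEWMA over several syndrome time series.
        import numpy as np

        def directional_mewma(X, lam=0.2, threshold=15.0):
            """X: (n_days, n_series) daily counts. Returns MEWMA statistics and alarm flags."""
            X = np.asarray(X, dtype=float)
            mu = X.mean(axis=0)                        # baseline mean (ideally from in-control history)
            sigma = np.cov(X, rowvar=False)            # baseline covariance across syndromes
            sigma_z_inv = np.linalg.pinv((lam / (2.0 - lam)) * sigma)
            z = np.zeros(X.shape[1])
            stats = np.zeros(X.shape[0])
            for t in range(X.shape[0]):
                z = lam * (X[t] - mu) + (1.0 - lam) * z
                z = np.maximum(z, 0.0)                 # directional: only upward shifts accumulate
                stats[t] = z @ sigma_z_inv @ z
            return stats, stats > threshold

    In practice the threshold would be calibrated by simulation against epidemic-free baseline data so that the in-control false-alarm rate matches a target such as the 95% specificity reported in the study; an MCUSUM variant would accumulate truncated deviations shrunk toward zero by a reference value rather than exponentially smoothing them.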

    Machine Learning for Disease Outbreak Detection Using Probabilistic Models

    RÉSUMÉ: The spread of known diseases and the emergence of new ones have affected many lives and had significant economic consequences; Ebola is only the most recent example. Early detection of epidemic infections is therefore a major challenge. In the field of syndromic surveillance, we have recently witnessed a proliferation of outbreak detection algorithms. Their performance varies from one algorithm to another and with different configuration parameters, and the effectiveness of an epidemiological surveillance system is affected accordingly. Yet there are few reliable evaluations of how these algorithms perform under different conditions and for different types of outbreak. Existing evaluations are based on single cases, and the data are not publicly available, which makes it difficult to compare algorithms with one another or to judge how well the results generalize. Consequently, we cannot determine which algorithm should be applied under which circumstances. This thesis pursues three general objectives: (1) establish the relationship between the performance of outbreak detection algorithms and the type and severity of the outbreaks, (2) improve outbreak predictions by combining algorithms, and (3) provide an outbreak analysis method that incorporates a cost perspective in order to minimize the economic impact of false-positive and false-negative errors. The general approach of our study relies on simulated outbreak data whose transmission vector is a water distribution network. The data are obtained from the SnAP simulation platform of the Department of Epidemiology and Biostatistics Surveillance Lab at McGill University. This approach allows us to create the different outbreak types and intensities needed to analyze the performance of the detection algorithms. The first objective concerns the influence of different outbreak types and intensities on algorithm performance, modelled using a Bayesian network. This model successfully predicts the performance variation observed in the data. Moreover, the Bayesian network quantifies the influence of each variable and also reveals the role of parameters that had been ignored in previous work, namely the detection threshold and the importance of accounting for weekly patterns. The second objective builds on the results of the first and combines the algorithms to optimize performance as a function of the influencing factors. The algorithm outputs are combined using a Hierarchical Mixture of Experts (HME). The HME model is trained to weight the contribution of each algorithm according to the data. The results of this combination are comparable with the best results of the individual algorithms and prove more robust across different variations. The level of contamination does not influence the relative performance of the HME model. Finally, we sought to optimize outbreak detection methods according to the expected costs and benefits of correct and incorrect predictions. The outputs of the detection algorithms are evaluated in terms of the decisions they lead to, taking into account real data on the total cost of health-care resource use. First, a polynomial regression estimates the cost of an outbreak as a function of the detection delay. We then develop a cost-sensitive decision-tree learner that predicts detections from the outputs of the known algorithms. Experimental results show that this model reduces the total cost of outbreaks while maintaining a level of outbreak detection comparable to that of other methods.
    ABSTRACT: The past decade has seen the emergence of new diseases and the expansion of old ones (such as Ebola), causing high human and financial costs. Hence, early detection of disease outbreaks is crucial. In the field of syndromic surveillance, there has recently been a proliferation of outbreak detection algorithms. The choice of outbreak detection algorithm and its configuration can result in important variations in the performance of public health surveillance systems, but performance evaluations have not kept pace with algorithm development. These evaluations are usually based on a single data set that is not publicly available, so they are difficult to generalize or replicate. Furthermore, the performance of different algorithms is influenced by the nature of the disease outbreak. As a result of the lack of thorough performance evaluations, one cannot determine which algorithm should be applied under what circumstances. Briefly, this research has three general objectives: (1) characterize the dependence of the performance of detection algorithms on the type and severity of the outbreak, (2) aggregate the predictions of several outbreak detection algorithms, and (3) analyze outbreak detection methods from a cost-benefit point of view and develop a detection method that minimizes the total cost of missed outbreaks and false alarms. To achieve the first objective, we propose a Bayesian network model, learned from simulated outbreak data overlaid on real healthcare utilization data, which predicts detection performance as a function of outbreak characteristics and surveillance system parameters. This model predicts the performance of outbreak detection methods with high accuracy. It can also quantify the influence of different outbreak characteristics and detection methods on detection performance in a variety of practically relevant surveillance scenarios. In addition to identifying outbreak characteristics expected to have a strong influence on detection performance, the learned model suggests a role for other algorithm features, such as the alerting threshold and accounting for weekly patterns, which were previously not the focus of attention in the literature. To achieve the second objective, we use a Hierarchical Mixture of Experts (HME) to combine the responses of multiple experts (i.e., predictors), which here are outbreak detection methods. The contribution of each predictor to the final output is learned and depends on the input data. The developed HME algorithm is competitive with the best detection algorithm in the experimental evaluation and is more robust under different circumstances. The level of contamination of the surveillance time series does not influence the relative performance of the HME. The optimization of outbreak detection methods also relies on estimating the future benefit of true alarms and the cost of false alarms. In the third part of the thesis, we analyze some commonly used outbreak detection methods in terms of the cost of missed outbreaks and false alarms, using simulated outbreak data overlaid on real healthcare utilization data. We estimate the total cost of missed outbreaks and false alarms in addition to the accuracy of outbreak detection, and fit a polynomial regression to estimate the cost of an outbreak as a function of the delay until it is detected. We then develop a cost-sensitive decision tree learner that predicts outbreaks from the predictions of commonly used detection methods. Experimental results show that the cost-sensitive decision tree decreases the total cost of outbreaks while keeping detection accuracy competitive with commonly used methods.
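    As a rough illustration of the mixture-of-experts idea behind the second objective (a minimal, single-level sketch; the thesis uses a hierarchical gating structure, and the feature set, the assumption that expert scores are alarm probabilities in [0, 1], and the gradient-based training loop shown here are all assumptions), the snippet below learns data-dependent softmax weights that blend the alarm scores of several detection algorithms.

        # Sketch only: a single-level softmax gate blending the alarm scores of k detectors.
        # A hierarchical mixture of experts would nest such gates in a tree.
        import numpy as np

        def softmax(a):
            a = a - a.max(axis=1, keepdims=True)
            e = np.exp(a)
            return e / e.sum(axis=1, keepdims=True)

        def combine(features, expert_scores, V):
            """features: (n, d) context; expert_scores: (n, k) alarm probabilities; V: (d, k) gate weights."""
            gates = softmax(features @ V)              # data-dependent weight of each detector
            return (gates * expert_scores).sum(axis=1)

        def fit_gate(features, expert_scores, labels, lr=0.1, epochs=500):
            """Fit gate weights by gradient descent on the log-loss of the blended score."""
            n, d = features.shape
            V = np.zeros((d, expert_scores.shape[1]))
            for _ in range(epochs):
                gates = softmax(features @ V)
                p = np.clip((gates * expert_scores).sum(axis=1), 1e-6, 1 - 1e-6)
                dp = (p - labels) / (p * (1.0 - p))                      # d(log-loss)/dp
                dlogits = gates * (expert_scores - p[:, None])           # dp/d(gate logits)
                V -= lr * features.T @ (dp[:, None] * dlogits) / n
            return V

    The cost-sensitive decision tree of the third objective could be approximated in a similar spirit by weighting training days by the estimated cost of a missed outbreak versus a false alarm, although the thesis embeds the cost model more directly in the learner.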

    Postmarket sequential database surveillance of medical products

    Thesis (Ph.D.), Massachusetts Institute of Technology, Engineering Systems Division, 2013. Cataloged from the PDF version of the thesis. Includes bibliographical references (p. 193-212). This dissertation focuses on the capabilities of a novel public health data system - the Sentinel System - to supplement existing postmarket surveillance systems of the U.S. Food and Drug Administration (FDA). The Sentinel System is designed to identify and assess safety risks associated with drugs, therapeutic biologics, vaccines, and medical devices that emerge post-licensure. Per the initiating legislation, the FDA must complete a priori evaluations of the Sentinel System's technical capabilities to support regulatory decision-making. This research develops qualitative and quantitative tools to aid the FDA in such evaluations, particularly with regard to the Sentinel System's novel sequential database surveillance capabilities. Sequential database surveillance is a "near real-time" sequential statistical method to evaluate pre-specified exposure-outcome pairs. A "signal" is detected when the data suggest an excess risk that is statistically significant. The qualitative tool - the Sentinel System Pre-Screening Checklist - is designed to determine whether the Sentinel System is well suited, on its face, to evaluate a pre-specified exposure-outcome pair. The quantitative tool - the Sequential Database Surveillance Simulator - allows the user to explore virtually whether sequential database surveillance of a particular exposure-outcome pair is likely to generate evidence to identify and assess safety risks in a timely manner to support regulatory decision-making. Particular attention is paid to accounting for uncertainties, including medical product adoption and utilization, misclassification error, and the unknown true excess risk in the environment. Using vaccine examples and the simulator to illustrate, this dissertation first demonstrates the tradeoffs associated with sample size calculations in sequential statistical analysis, particularly the tradeoff between statistical power and median sample size. Second, it demonstrates differences in performance between various surveillance configurations when using distributed database systems. Third, it demonstrates the effects of misclassification error on sequential database surveillance, and specifically how such errors may be accounted for in the design of surveillance. Fourth, it considers the complexities of modeling new medical product adoption, and specifically the existence of a "dual market" phenomenon for these new medical products. This finding raises non-trivial generalizability concerns regarding evidence generated via sequential database surveillance when performed immediately post-licensure. By Judith C. Maro, Ph.D.
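    As a toy version of the kind of experiment such a simulator supports (a minimal sketch in the spirit of a Poisson-based maximized sequential probability ratio test; the background rate, relative risk, number of looks, and critical value are illustrative assumptions, not the dissertation's settings), the snippet below estimates statistical power and the median number of looks to signal by simulation, which is the tradeoff highlighted in the abstract.

        # Sketch only: a simplified sequential-surveillance simulation for one exposure-outcome pair.
        import numpy as np

        rng = np.random.default_rng(0)

        def one_surveillance_run(expected_per_look=2.0, relative_risk=1.5, max_looks=52, crit=3.0):
            """Accrue events look by look; return (signaled, look at which the signal occurred)."""
            obs = exp = 0.0
            for look in range(1, max_looks + 1):
                exp += expected_per_look                       # expected events under no excess risk
                obs += rng.poisson(expected_per_look * relative_risk)
                llr = obs * np.log(obs / exp) - (obs - exp) if obs > exp else 0.0
                if llr >= crit:                                # signal: the excess risk looks significant
                    return True, look
            return False, max_looks

        runs = [one_surveillance_run() for _ in range(2000)]
        power = np.mean([signaled for signaled, _ in runs])
        median_looks = np.median([look for signaled, look in runs if signaled])
        print(f"empirical power ~ {power:.2f}, median looks to signal ~ {median_looks:.0f}")

    Raising the critical value lowers the false-signal rate but increases the median number of looks (and hence the sample size) needed to detect a true excess risk, which is the power versus median-sample-size tradeoff the dissertation examines.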

    The use of emergency department electronic health data for syndromic surveillance to enhance public health surveillance programmes in England

    Public health surveillance allows for the identification and monitoring of trends in human health. Syndromic surveillance is a relatively recent addition to these activities, offering the potential to monitor trends on a (near) real-time basis, and is often more timely than may be possible through other, traditional, surveillance routes. Emergency department (ED) syndromic surveillance systems have been developed and successfully operated worldwide. The Public Health England Emergency Department Syndromic Surveillance System (EDSSS) was developed in preparation for the London 2012 Olympic and Paralympic Games and remains a public health legacy of the Games. This thesis aimed to describe and provide evidence of how emergency department syndromic surveillance (as performed by the EDSSS) provides additional benefit to public health surveillance and added value to emergency care services in England. Additionally, the potential for further development and future improvements to public health surveillance is described. The EDSSS is shown here to have been successfully used to describe the impact of the rotavirus vaccine, indicating that it has the potential to be used for future rapid, stand-alone investigation of the impact of vaccines in England. In the first cross-national study of its kind, the EDSSS (alongside OSCOUR, its counterpart in France) was successfully used to describe changes in human health indicators during periods of poor air quality. In addition to reporting on both infectious and non-infectious disease, emergency department syndromic surveillance also successfully described the impacts of human behaviour on ED attendances: during the EURO 2016 football tournament, ED attendances were found to differ from the expected during match periods, not only in France, the host country, but also in the UK home nations, where fans followed team progress from home. The EDSSS is also the first example of a syndromic surveillance system having input into the development of a standardised national dataset, which has been mandated across EDs in England. Primarily aimed at improving patient care and the wider workings of EDs, this improved data collection has resulted in improvements in the EDSSS itself, which was subsequently expanded from a small sentinel system to a truly national surveillance system. The standardisation of ED data collection and reporting, alongside improved geographical coverage and near real-time surveillance reporting, enabled rapid feedback on the impact of the COVID-19 pandemic on ED attendances in England. The EDSSS described general trends in ED attendances, encompassing both infectious and non-infectious indicators, prompting the refinement of public health messaging and encouraging the general public to continue to use emergency care when required. The evidence presented in this thesis demonstrates where ED syndromic surveillance has added value for public health surveillance in England, utilising the system's flexibility and timeliness of reporting. Successful collaborative working has provided the potential for future cross-system learning and further system development, as well as the ability to work at local, national and potentially international scales.

    Syndromic surveillance: made in Europe
