9 research outputs found

    Recursive least squares background prediction of univariate syndromic surveillance data

    Background: Surveillance of univariate syndromic data as a potential indicator of developing public health conditions has been used extensively. This paper aims to improve outbreak detection performance by using a background forecasting algorithm based on the adaptive recursive least squares (RLS) method combined with a novel treatment of the day-of-week effect. Methods: Previous work by the first author has suggested that univariate recursive least squares analysis of syndromic data can be used to characterize the background upon which a prediction and detection component of a biosurveillance system may be built. An adaptive implementation is used to deal with data non-stationarity. In this paper we develop and implement the RLS method for background estimation of univariate data. The distinctly dissimilar distributions of data for different days of the week, however, can affect filter implementations adversely, and so a novel procedure based on linear transformations of the sorted values of the daily counts is introduced. Seven-day-ahead daily predicted counts are used as background estimates. A signal injection procedure is used to examine the integrated algorithm's ability to detect synthetic anomalies in real syndromic time series. We compare the method to a baseline CDC forecasting algorithm known as the W2 method. Results: We present detection results in the form of Receiver Operating Characteristic curve values for four different injected signal-to-noise ratios using 16 sets of syndromic data. We find improvements in the false alarm probabilities when compared to the baseline W2 background forecasts. Conclusion: The current paper introduces a prediction approach for city-level biosurveillance data streams such as time series of outpatient clinic visits and sales of over-the-counter remedies. This approach uses RLS filters modified by a correction for the weekly patterns often seen in these data series, and a threshold detection algorithm applied to the residuals of the RLS forecasts. We compare the detection performance of this algorithm to the W2 method recently implemented at CDC. The modified RLS method gives consistently better sensitivity at multiple background alert rates, and we recommend that it be considered for routine application in biosurveillance systems.
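
The adaptive RLS background forecast described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `rls_forecast`, the filter order, the forgetting factor `lam`, and the initialization constant `delta` are all assumptions chosen for illustration, and the day-of-week correction is omitted. As a further simplification, the weights are updated every day, so the forecast for day t uses weights trained through day t-1 even though its inputs end `horizon` days earlier.

```python
def rls_forecast(counts, order=7, horizon=7, lam=0.98, delta=100.0):
    """Forecast counts[t] from the `order` values ending `horizon` days before t.

    Standard exponentially weighted RLS; all parameter defaults are
    illustrative assumptions, not values taken from the paper.
    """
    y = [float(c) for c in counts]
    w = [0.0] * order  # filter weights, start at zero
    # P approximates the inverse input correlation matrix (starts as delta * I)
    P = [[delta if i == j else 0.0 for j in range(order)] for i in range(order)]
    preds = [None] * len(y)  # None where no forecast can be formed yet
    for t in range(order + horizon - 1, len(y)):
        # Regressor: newest value first, ending `horizon` days before day t
        x = y[t - horizon - order + 1 : t - horizon + 1][::-1]
        preds[t] = sum(wi * xi for wi, xi in zip(w, x))
        err = y[t] - preds[t]  # a priori forecast residual
        Px = [sum(P[i][j] * x[j] for j in range(order)) for i in range(order)]
        denom = lam + sum(xi * pxi for xi, pxi in zip(x, Px))
        k = [pxi / denom for pxi in Px]  # gain vector
        w = [wi + ki * err for wi, ki in zip(w, k)]
        # P stays symmetric, so x^T P is taken to be (P x)^T here
        P = [[(P[i][j] - k[i] * Px[j]) / lam for j in range(order)]
             for i in range(order)]
    return preds
```

The residuals `counts[t] - preds[t]` are what a threshold detector would then monitor. A forgetting factor below 1 lets the filter track slow changes in the background, at the cost of noisier weights.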

    Surveillance of gastrointestinal disease in France using drug sales data

    Drug sales data have increasingly been used for disease surveillance during recent years. Our objective was to assess the value of drug sales data as an operational early detection tool for gastroenteritis epidemics at national and regional levels in France. For the period 2008–2013, we compared temporal trends of drug sales for the treatment of gastroenteritis with trends of cases reported by a Sentinel Network of general practitioners. We benchmarked detection models to select the one with the best sensitivity, false alert proportion and timeliness, and developed a prospective framework to assess the operational performance of the system. Drug sales data allowed the detection of seasonal gastrointestinal epidemics occurring in winter with a distinction between prescribed and non-prescribed drugs. Sales of non-prescribed drugs allowed epidemic detection on average 2.25 weeks earlier than Sentinel data. These results confirm the value of drug sales data for real-time monitoring of gastroenteritis epidemic activity.

    Anomaly Detection in Time Series: Theoretical and Practical Improvements for Disease Outbreak Detection

    The automatic collection and increasing availability of health data provides a new opportunity for techniques to monitor this information. By monitoring pre-diagnostic data sources, such as over-the-counter cough medicine sales or emergency room chief complaints of cough, there exists the potential to detect disease outbreaks earlier than traditional laboratory disease confirmation results. This research is particularly important for a modern, highly-connected society, where the onset of disease outbreak can be swift and deadly, whether caused by a naturally occurring global pandemic such as swine flu or a targeted act of bioterrorism. In this dissertation, we first describe the problem and current state of research in disease outbreak detection, then provide four main additions to the field. First, we formalize a framework for analyzing health series data and detecting anomalies: using forecasting methods to predict the next day's value, subtracting the forecast to create residuals, and finally using detection algorithms on the residuals. The formalized framework indicates the link between the forecast accuracy of the forecast method and the performance of the detector, and can be used to quantify and analyze the performance of a variety of heuristic methods. Second, we describe improvements for the forecasting of health data series. The application of weather as a predictor, cross-series covariates, and ensemble forecasting each provide improvements to forecasting health data. Third, we describe improvements for detection. This includes the use of multivariate statistics for anomaly detection and additional day-of-week preprocessing to aid detection. Most significantly, we also provide a new method, based on the CuScore, for optimizing detection when the impact of the disease outbreak is known. This method can provide an optimal detector for rapid detection, or for probability of detection within a certain timeframe. 
Finally, we describe a method for improved comparison of detection methods. We provide tools to evaluate how well a simulated data set captures the characteristics of the authentic series, and we introduce time-lag heatmaps, a new way of visualizing daily detection rates or displaying the comparison between two methods in a more informative way.
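
The three-step framework in this abstract (forecast the next day's value, subtract the forecast to obtain a residual, run a detector on the residuals) can be sketched as follows. This is only an illustration under stated assumptions: the moving-average forecast, the z-score threshold detector, and the names `detect_anomalies`, `window`, `threshold`, and `min_scale` are stand-ins, not the dissertation's methods (which include ensemble forecasting and a CuScore-based detector).

```python
from statistics import mean, stdev


def detect_anomalies(series, window=28, threshold=3.0, min_scale=1.0):
    """Flag days whose standardized forecast residual exceeds `threshold`.

    Forecast: mean of the previous `window` days (an assumed, simple choice).
    Detector: one-sided threshold on residual / scale (also an assumption).
    """
    y = [float(v) for v in series]
    alerts = [False] * len(y)  # no alert possible before a full window exists
    for t in range(window, len(y)):
        history = y[t - window:t]
        forecast = mean(history)                   # step 1: forecast
        residual = y[t] - forecast                 # step 2: residual
        # Floor the scale so a flat history does not blow up the ratio
        scale = max(stdev(history), min_scale)
        alerts[t] = residual / scale > threshold   # step 3: detect
    return alerts


# Hypothetical usage: a steady series with one injected spike on day 50
series = [10 if i % 2 == 0 else 12 for i in range(60)]
series[50] = 40
alerts = detect_anomalies(series)
```

Because the detector only sees residuals, any improvement in forecast accuracy shrinks the residual scale and makes a spike of a given size easier to flag, which is the link between forecasting and detection that the abstract describes.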

    An Investigation of the Public Health Informatics Research and Practice in the Past Fifteen Years from 2000 to 2014: A Scoping Review in MEDLINE

    Objective: To examine the extent and nature of existing Public Health Informatics (PHI) studies in the past 15 years on MEDLINE. Methods: This thesis adopted the scoping review methodology recommended by Arksey and O’Malley in 2005. It proceeded through five main stages: Stage I - identifying the research question; Stage II - identifying relevant studies; Stage III - study selection; Stage IV - charting the data; and Stage V - collating, summarizing, and reporting the results. Each methodological stage was carried out in collaboration with the academic supervisor, and a final result and conclusion were set forth. Results: The study captured a total of 486 articles in MEDLINE focused on PHI. Of these, the majority came from the USA, followed by the UK, Australia and Canada. Only about one fifth of the articles were from the rest of the world. About 60% of the articles represented infectious disease monitoring, outbreak detection, and bio-terrorism surveillance. About 10% concerned chronic disease monitoring, whereas public health policy system and research represented 40% of the total articles. The most frequently used information technologies were electronic registries, websites, and GIS. In contrast, mass media and mobile phones were among the least used technologies. Conclusion: Despite the research and discussion conducted in the past 15 years (starting from 2000), the PHI system requires further improvements in the application of modern PHT such as wireless devices, wearable devices, remote sensors, remote/cloud computing, etc., in various domains of PH, which were scarcely discussed or used in the available literature.

    Syndromic surveillance: reports from a national conference, 2004

    Overview, Policy, and Systems -- Federal Role in Early Detection Preparedness Systems -- BioSense: Implementation of a National Early Event Detection and Situational Awareness System -- Guidelines for Constructing a Statewide Hospital Syndromic Surveillance Network -- -- Data Sources -- Implementation of Laboratory Order Data in BioSense Early Event Detection and Situation Awareness System -- Use of Medicaid Prescription Data for Syndromic Surveillance – New York -- Poison Control Center–Based Syndromic Surveillance for Foodborne Illness -- Monitoring Over-The-Counter Medication Sales for Early Detection of Disease Outbreaks – New York City -- Experimental Surveillance Using Data on Sales of Over-the-Counter Medications – Japan, November 2003–April 2004 -- -- Analytic Methods -- Public Health Monitoring Tools for Multiple Data Streams -- Use of Multiple Data Streams to Conduct Bayesian Biologic Surveillance -- Space-Time Clusters with Flexible Shapes -- INFERNO: A System for Early Outbreak Detection and Signature Forecasting -- High-Fidelity Injection Detectability Experiments: a Tool for Evaluating Syndromic Surveillance Systems -- Linked Analysis for Definition of Nurse Advice Line Syndrome Groups, and Comparison to Encounters -- -- Simulation and Other Evaluation Approaches -- Simulation for Assessing Statistical Methods of Biologic Terrorism Surveillance -- An Evaluation Model for Syndromic Surveillance: Assessing the Performance of a Temporal Algorithm -- Evaluation of Syndromic Surveillance Based on National Health Service Direct Derived Data – England and Wales -- Initial Evaluation of the Early Aberration Reporting System – Florida -- -- Practice and Experience -- Deciphering Data Anomalies in BioSense -- Syndromic Surveillance on the Epidemiologist's Desktop: Making Sense of Much Data -- Connecting Health Departments and Providers: Syndromic Surveillance's Last Mile -- Comparison of Syndromic Surveillance and a Sentinel Provider System in Detecting an Influenza Outbreak – Denver, Colorado, 2003 -- Ambulatory-Care Diagnoses as Potential Indicators of Outbreaks of Gastrointestinal Illness – Minnesota -- Emergency Department Visits for Concern Regarding Anthrax – New Jersey, 2001 -- Hospital Admissions Syndromic Surveillance – Connecticut, October 2001–June 2004 -- Three Years of Emergency Department Gastrointestinal Syndromic Surveillance in New York City: What Have we Found? "August 26, 2005." Papers from the National Syndromic Surveillance Conference sponsored by the Centers for Disease Control and Prevention, the Tufts Health Care Institute, the Alfred P. Sloan Foundation, held Nov. 3-4, 2004 in Boston, MA. "Public health surveillance continues to broaden in scope and intensity. Public health professionals responsible for conducting such surveillance must keep pace with evolving methodologies, models, business rules, policies, roles, and procedures. The third annual Syndromic Surveillance Conference was held in Boston, Massachusetts, during November 3-4, 2004. The conference was attended by 440 persons representing the public health, academic, and private-sector communities from 10 countries and provided a forum for scientific discourse and interaction regarding multiple aspects of public health surveillance." - p. 3. Also available via the World Wide Web.

    Infodemiology to Improve Public Health Situational Awareness: An Investigation of 2010 Pertussis Outbreaks in California, Michigan and Ohio

    As a disease emerges, one of the greatest challenges for public health practitioners is to differentiate between a normal event and a serious outbreak. Typically, information from official sources and surveillance systems had been the only resource. More recently, the field of infodemiology has emerged with a focus on the distribution and determinants of health information on the internet. This research compared official reports of whooping cough with infodemiology sources, specifically news articles, search engine patterns, and Twitter, to assess the timeliness, accuracy, and correlation of these content sources. Within California, Michigan and Ohio, internet search patterns identified the outbreak of pertussis in 2010 four to eleven weeks in advance of official sources, and there was strong correlation between the epidemic curve and search pattern in Michigan and Ohio. Twitter also provided an indicator in advance of official sources in all three states, but only with a single Tweet. Using all three sources to identify indicators was better than any single source used independently. While understanding the data utility is important, it is equally critical to understand the attitudes and perceptions amongst public health leaders regarding infodemiology data to improve situational awareness. A survey of such leaders showed that infodemiology content had the most value in the first stage of situational awareness for identifying early indications of disease outbreaks. News media and internet search were moderately to highly valuable for 70% of respondents, while social media was moderately to highly valuable to 60% of respondents. For both strengthening comprehension of an outbreak and informing future predictions, beliefs were split regarding the level of potential value (if any) that exists. 
This led to a framework on how to include infodemiology content in public health situational awareness strategies going forward, so limited resources are used as effectively as possible. Doctor of Public Health

    Modelling Hospital Acquired Clostridium difficile Infections And Its Transmission In Acute Hospital Settings

    The thesis explored a number of fundamental issues regarding the development of predictive models for hospital acquired Clostridium difficile infection (HA CDI) and its outbreaks. As predictive modeling for hospital acquired infection is still an emerging field and the ability to analyse HA CDI and potential outbreaks are in a developmental stage, the research documented in this thesis is exploratory and preliminary. Predictive modeling for the outbreak of hospital acquired infections can be considered at two levels: population and individual. We provide a comprehensive review regarding modeling methodology in this field at both population level and individual level. The transmission of HA CDI is not well understood. An agent based simulation model was built to evaluate the relative importance of the potential sources of Clostridium difficile (C. difficile) infection in a non-outbreak ward setting in an acute care hospital. The model was calibrated through a two stage procedure which utilized Latin Hypercube Sampling methodology and Genetic Algorithm optimization to capture five different patterns reported in the literature. A number of aspects of the model including housekeeping, hand hygiene compliance, patient turnover, and antibiotic pressure were explored. Based on the modeling results, several prevention policies are recommended. One widely used tool to better understand the dynamics of infectious disease outbreaks is network epidemiology. We explored the potential of using network statistics for the prediction of the transmission of HA CDIs in the hospital. Two types of dynamic networks were studied: ward level contacts and hospital transfers. An innovative method that combines time series data mining and predictive classification models was introduced for the analysis of these dynamic networks and for the prediction of HA CDI transmission. 
The results suggest that the network statistics extracted from the dynamic networks are potential predictors for the transmission of HA CDIs. We explored the potential of using the “multiple modeling methods approach” to predict patients at risk of HA CDI by using data from the hospital's information systems. A range of machine learning predictive models were used to analyse data collected from a hospital. Our results suggest that the multiple modeling methods approach is able to improve prediction performance and to reveal new insights in the data set. We recommend that this approach be considered for future studies on predictive model construction and risk factor analysis.

    An adaptive prediction and detection algorithm for multistream syndromic surveillance

    Background: Surveillance of Over-the-Counter pharmaceutical (OTC) sales as a potential early indicator of developing public health conditions, in particular in cases of interest to biosurveillance, has been suggested in the literature. This paper is a continuation of a previous study in which we formulated the problem of estimating clinical data from OTC sales in terms of optimal LMS linear and Finite Impulse Response (FIR) filters. In this paper we extend our results to predict clinical data multiple steps ahead using OTC sales as well as the clinical data itself. Methods: The OTC data are grouped into a few categories and we predict the clinical data using a multichannel filter that encompasses all the past OTC categories as well as the past clinical data itself. The prediction is performed using FIR filters and the recursive least squares method in order to adapt rapidly to nonstationary behaviour. In addition, we inject simulated events in both clinical and OTC data streams to evaluate the predictions by computing the Receiver Operating Characteristic curves of a threshold detector based on predicted outputs. Results: We present all prediction results showing the effectiveness of the combined filtering operation. In addition, we compute and present the performance of a detector using the prediction output. Conclusion: Multichannel adaptive FIR least squares filtering provides a viable method of predicting public health conditions, as represented by clinical data, from OTC sales, and/or the clinical data. The potential value to a biosurveillance system cannot, however, be determined without studying this approach in the presence of transient events (nonstationary events of relatively short duration and fast rise times). Our simulated events superimposed on actual OTC and clinical data allow us to provide an upper bound on that potential value under some restricted conditions. Based on our ROC curves we argue that a biosurveillance system can provide early warning of an impending clinical event using ancillary data streams (such as OTC) with established correlations with the clinical data, and a prediction method that can react to nonstationary events sufficiently fast. Whether OTC (or other data streams yet to be identified) provide the best source for predicting clinical data is still an open question. We present a framework and an example to show how to measure the effectiveness of predictions, and compute an upper bound on this performance for the Recursive Least Squares method when the following two conditions are met: (1) an event of sufficient strength exists in both data streams, without distortion, and (2) it occurs in the OTC (or other ancillary streams) earlier than in the clinical data.
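
The evaluation step described in this abstract, computing Receiver Operating Characteristic curves for a threshold detector using injected events, can be sketched as below. This is a hypothetical illustration, not the authors' code: `roc_points` and its arguments are assumed names, and the scores stand for whatever detector statistic is thresholded (for example, forecast residuals) on background-only days versus days carrying an injected event.

```python
def roc_points(background_scores, event_scores, thresholds):
    """Return (false alarm rate, detection probability) for each threshold.

    background_scores: detector statistic on days with no injected event.
    event_scores: detector statistic on days with an injected event.
    """
    points = []
    for th in thresholds:
        # Fraction of background days that would (wrongly) raise an alert
        false_alarm = sum(s > th for s in background_scores) / len(background_scores)
        # Fraction of injected-event days that would be detected
        detection = sum(s > th for s in event_scores) / len(event_scores)
        points.append((false_alarm, detection))
    return points


# Hypothetical scores; sweeping the threshold traces out the ROC curve
curve = roc_points([0, 1, 2, 3], [2, 3, 4, 5], thresholds=[2.5])
```

Comparing two forecasting methods then amounts to comparing their detection probabilities at the same false alarm rate.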