6 research outputs found

    Anomaly Detection in Time Series: Theoretical and Practical Improvements for Disease Outbreak Detection

    The automatic collection and increasing availability of health data provide a new opportunity for techniques to monitor this information. By monitoring pre-diagnostic data sources, such as over-the-counter cough medicine sales or emergency room chief complaints of cough, it may be possible to detect disease outbreaks earlier than traditional laboratory confirmation allows. This research is particularly important for a modern, highly connected society, where the onset of a disease outbreak can be swift and deadly, whether caused by a naturally occurring global pandemic such as swine flu or by a targeted act of bioterrorism. In this dissertation, we first describe the problem and the current state of research in disease outbreak detection, then provide four main additions to the field. First, we formalize a framework for analyzing health series data and detecting anomalies: using forecasting methods to predict the next day's value, subtracting the forecast from the observed value to create residuals, and finally applying detection algorithms to the residuals. The formalized framework shows the link between the accuracy of the forecasting method and the performance of the detector, and can be used to quantify and analyze the performance of a variety of heuristic methods. Second, we describe improvements for the forecasting of health data series. The use of weather as a predictor, cross-series covariates, and ensemble forecasting each improve forecasts of health data. Third, we describe improvements for detection. These include the use of multivariate statistics for anomaly detection and additional day-of-week preprocessing to aid detection. Most significantly, we also provide a new method, based on the CuScore, for optimizing detection when the impact of the disease outbreak is known. This method can provide an optimal detector for rapid detection, or for probability of detection within a certain timeframe.
    Finally, we describe a method for improved comparison of detection methods. We provide tools to evaluate how well a simulated data set captures the characteristics of the authentic series, and we introduce time-lag heatmaps, a new way of visualizing daily detection rates or of displaying the comparison between two methods more informatively.
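
The forecast, residual, and detection steps described in this abstract can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the trailing moving-average forecaster, the upper CUSUM detector (a simple stand-in for the CuScore-based detector the abstract mentions), the function names `forecast_residuals` and `cusum_alarms`, the synthetic series, and the parameter values `window`, `baseline_n`, `k`, and `h` are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def forecast_residuals(series, window=7):
    """One-step-ahead residuals from a trailing moving-average forecast.

    The moving average is an assumed placeholder for any forecasting method.
    """
    residuals = []
    for t in range(window, len(series)):
        forecast = mean(series[t - window:t])  # predict day t from the prior days
        residuals.append(series[t] - forecast)  # observed minus forecast
    return residuals


def cusum_alarms(residuals, baseline_n=28, k=0.5, h=4.0):
    """Upper one-sided CUSUM on standardized residuals; returns alarm indices.

    The first `baseline_n` residuals are assumed to be outbreak-free and are
    used to estimate the residual mean and standard deviation.
    """
    mu = mean(residuals[:baseline_n])
    sigma = stdev(residuals[:baseline_n])
    s, alarms = 0.0, []
    for i, r in enumerate(residuals):
        z = (r - mu) / sigma
        s = max(0.0, s + z - k)  # accumulate evidence of an upward shift
        if s > h:
            alarms.append(i)
            s = 0.0  # restart the statistic after signalling
    return alarms


# Hypothetical daily counts: a small repeating pattern around 100,
# with 15 extra cases per day injected on days 45 to 49.
pattern = [-2, 0, 2, -1, 1]
counts = [100 + pattern[i % 5] for i in range(60)]
for day in range(45, 50):
    counts[day] += 15

window = 7
alarm_days = [i + window for i in cusum_alarms(forecast_residuals(counts, window))]
print(alarm_days)
```

Residual index `i` corresponds to day `i + window` of the original series, which is why the alarm indices are shifted before printing. A better forecaster shrinks the in-control residual variance, which is the link between forecast accuracy and detector performance that the abstract refers to.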

    Robust Control Charts for Time Series Data

    This article presents a control chart for time series data, based on the one-step-ahead forecast errors of the Holt-Winters forecasting method. We use robust techniques to prevent outliers from affecting the estimation of the control limits of the chart. Moreover, robustness is important to maintain the reliability of the control chart after the occurrence of alarm observations. The properties of the new control chart are examined in a simulation study and on a real data example.
    Keywords: Control chart; Holt-Winters; Non-stationary time series; Outlier detection; Robustness; Statistical process control
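
As a rough illustration of the idea in this abstract, the sketch below charts one-step-ahead forecast errors and sets control limits from robust statistics. It is not the article's method: it uses Holt's linear (level and trend) smoothing without a seasonal component as a simplification of Holt-Winters, median and MAD limits as one possible robust estimator, and clipping of alarm observations before the smoothing update as one possible way to keep the chart reliable after an alarm. The names `robust_limits` and `holt_control_chart`, the synthetic series, and all parameter values are assumptions.

```python
from statistics import median


def robust_limits(errors, width=3.0):
    """Control limits from the median and the MAD of the forecast errors."""
    med = median(errors)
    mad = median([abs(e - med) for e in errors])
    scale = 1.4826 * mad  # makes the MAD consistent with sigma under normality
    return med - width * scale, med + width * scale


def holt_control_chart(series, train_n=20, alpha=0.3, beta=0.1, width=3.0):
    """Chart one-step-ahead errors of Holt's linear smoothing.

    Errors at t < train_n estimate the limits; later errors are monitored.
    Returns (alarm time indices, (lower limit, upper limit)).
    """
    level, trend = series[0], series[1] - series[0]
    train_errors, alarms, limits = [], [], None
    for t in range(1, len(series)):
        forecast = level + trend
        y = series[t]
        err = y - forecast
        if t < train_n:
            train_errors.append(err)
        else:
            if limits is None:
                limits = robust_limits(train_errors, width)
            lo, hi = limits
            if err < lo or err > hi:
                alarms.append(t)
                # Clip the alarm observation so it cannot drag the forecast.
                y = forecast + min(max(err, lo), hi)
        new_level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    return alarms, limits


# Hypothetical trending series with an outlier inside the training period
# (t = 10) and a spike in the monitored period (t = 30).
wobble = [0.2, -0.2, 0.1, -0.1]
data = [50 + 0.5 * i + wobble[i % 4] for i in range(40)]
data[10] += 20.0
data[30] += 30.0

alarms, (lower, upper) = holt_control_chart(data)
print(alarms, round(lower, 2), round(upper, 2))
```

Because the limits come from the median and MAD rather than the mean and standard deviation, the single large training outlier has little influence on their width.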

    Robust Control Charts for Time Series Data

    This article presents a control chart for time series data, based on the one-step-ahead forecast errors of the Holt-Winters forecasting method. We use robust techniques to prevent outliers from affecting the estimation of the control limits of the chart. Moreover, robustness is important to maintain the reliability of the control chart after the occurrence of alarm observations. The properties of the new control chart are examined in a simulation study and on a real data example.

    Performance Evaluation of the CUSCORE Chart: A Comparative Study with the CUSUM Statistic [Avaliação do desempenho da carta CUSCORE: estudo comparativo com a estatística CUSUM]

    The evolution of quality has been fundamental to the monitoring of production processes, whether for products or services, and quality is a concept increasingly present in organizations. Today, several methodologies support and contribute to achieving quality; statistical process control, represented by the application of control charts, is among the most prominent in industrial settings. Control charts are tools that have attracted the interest of organizations engaged in modern industrial processes. However, some factors, notably data autocorrelation (which occurs when an observation at a given instant depends on observations from preceding instants), make it difficult to interpret the statistical stability of the process. Accordingly, this dissertation presents a comparative study of the performance of the Trigger residuals chart and the residuals chart when a process characterized by a first-order autoregressive model is subjected to step-type disturbances. The comparison is based on simulation models built in MATLAB, and conclusions are drawn from performance measures such as the ARL (Average Run Length) and the corresponding SDRL (Standard Deviation of the Run Length) under changes in the process mean. A further study determines the range of the autoregressive parameter for which the ARL values obtained when the process is under statistical control are not significantly different from one another. The main advantages and disadvantages of the charts under study are indicated from the perspective of a user who applies univariate control charts to monitor continuous dynamic processes.
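
The kind of simulation study described in this abstract can be sketched as follows. The dissertation used MATLAB; this is an independent Python illustration, not its code. It estimates the ARL and SDRL of an upper CUSUM applied to the residuals of a first-order autoregressive process after a step shift in the mean. The choice of a CUSUM on residuals follows the title, the CUSCORE chart is not implemented, and the names `run_length` and `arl_sdrl` as well as the values of `phi`, `delta`, `k`, `h`, `runs`, and `seed` are assumptions.

```python
import random
from statistics import mean, stdev


def run_length(phi, delta, rng, k=0.5, h=4.0, max_n=10_000):
    """Observations until an upper CUSUM on AR(1) residuals signals.

    The process mean shifts by `delta` at the first monitored observation.
    The AR(1) parameter `phi` is assumed known, and the noise has unit variance.
    """
    z = rng.gauss(0.0, 1.0 / (1.0 - phi ** 2) ** 0.5)  # stationary start
    x_prev = z  # last in-control observation (deviation from the mean)
    s = 0.0
    for n in range(1, max_n + 1):
        z = phi * z + rng.gauss(0.0, 1.0)  # AR(1) noise
        x = delta + z                      # observation after the step shift
        e = x - phi * x_prev               # one-step-ahead residual
        s = max(0.0, s + e - k)
        if s > h:
            return n
        x_prev = x
    return max_n  # censored run; rare for the settings used here


def arl_sdrl(phi, delta, runs=200, seed=0):
    """Monte Carlo estimates of the ARL and SDRL."""
    rng = random.Random(seed)
    lengths = [run_length(phi, delta, rng) for _ in range(runs)]
    return mean(lengths), stdev(lengths)


in_control = arl_sdrl(phi=0.5, delta=0.0)
shifted = arl_sdrl(phi=0.5, delta=2.0)
print("in control (ARL, SDRL):", in_control)
print("after step shift (ARL, SDRL):", shifted)
```

In this setup the residual mean equals `delta` at the first shifted observation and `delta * (1 - phi)` afterwards, so a larger autoregressive parameter leaves a weaker sustained signal in the residuals.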