13 research outputs found

    Recent developments of control charts and identification of big data sources and future trends of current research

    Control charts are one of the principal tools to monitor dynamic processes, with the aim of rapidly identifying changes in the behaviour of those processes. Such changes are usually associated with a move from an in-control condition to an out-of-control condition. The paper briefly reviews the historical origins of control charts and includes examples of recent developments, focussing on their use in fields different from the industrial applications in which they were initially derived and often employed. It also focusses on cases which depart from the commonly used Gaussian assumption, and then considers potential effects of the big data revolution on future uses. A bibliometric analysis is also presented to identify distinct groups of research themes, including emerging and underdeveloped areas, which are hence potential topics for future research.
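
    To make the basic mechanism concrete, here is a minimal sketch of the classic Shewhart chart logic the review starts from: estimate the in-control mean and standard deviation from a baseline sample, then signal whenever a new observation falls outside three-sigma control limits. This is the textbook method under a Gaussian assumption, not any specific procedure from the paper, and the simulated data and parameter values are illustrative.

```python
import numpy as np

def shewhart_limits(baseline, k=3.0):
    """Estimate lower/upper control limits from an in-control baseline sample."""
    mu, sigma = baseline.mean(), baseline.std(ddof=1)
    return mu - k * sigma, mu + k * sigma

def out_of_control(stream, lcl, ucl):
    """Return the indices of observations signalling an out-of-control condition."""
    return [t for t, x in enumerate(stream) if x < lcl or x > ucl]

rng = np.random.default_rng(0)
baseline = rng.normal(10.0, 1.0, size=200)  # phase I: in-control data
stream = rng.normal(10.0, 1.0, size=100)    # phase II: monitored data
stream[60:] += 4.0                          # process shifts out of control at t = 60
lcl, ucl = shewhart_limits(baseline)
print(out_of_control(stream, lcl, ucl))     # alarms cluster from t = 60 onwards
```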

    Novel Methods for Efficient Changepoint Detection

    This thesis introduces several novel computationally efficient methods for offline and online changepoint detection. The first part of the thesis considers the challenge of detecting abrupt changes in scenarios where there is some autocorrelated noise or where the mean fluctuates locally between the changes. In such situations, existing implementations can lead to substantial overestimation of the number of changes. In response to this challenge, we introduce DeCAFS, an efficient dynamic programming algorithm to deal with such scenarios. DeCAFS models local fluctuations as a random walk process and autocorrelated noise as an AR(1) process. Through theory and empirical studies, we demonstrate that this approach has greater power at detecting abrupt changes than existing approaches. The second part of the thesis considers a practical, computational challenge that can arise with online changepoint detection in the real-time domain. We introduce a new procedure, called FOCuS, a fast online changepoint detection algorithm based on the simple Page-CUSUM sequential likelihood ratio test. FOCuS enables the online changepoint detection problem to be solved sequentially in time, through an efficient dynamic programming recursion. In particular, we establish that FOCuS outperforms current state-of-the-art algorithms both in terms of efficiency and statistical power, and can be readily extended to more general scenarios. The final part of the thesis extends ideas from the nonparametric changepoint detection literature to the online setting. Specifically, a novel algorithm, NUNC, is introduced to perform online detection of changes in the distribution of real-time data. We explore the properties of two variants of this algorithm using both simulated and real data examples.
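
    For background on the second part, the sketch below implements the classic one-sided Page-CUSUM recursion on which FOCuS builds: the statistic S_t = max(0, S_{t-1} + x_t - k) is updated at each observation, and an alarm is raised once it exceeds a threshold. This is the standard recursion for a pre-specified mean shift, not the FOCuS algorithm itself; per the abstract, FOCuS's dynamic programming recursion avoids committing to a single anticipated shift size. The drift and threshold values here are illustrative.

```python
import numpy as np

def page_cusum(stream, drift=0.5, threshold=8.0):
    """One-sided Page-CUSUM for an upward mean shift in unit-variance data.

    S_t = max(0, S_{t-1} + x_t - drift); alarm when S_t > threshold.
    Returns the first alarm time, or None if no alarm is raised.
    """
    s = 0.0
    for t, x in enumerate(stream):
        s = max(0.0, s + x - drift)
        if s > threshold:
            return t
    return None

rng = np.random.default_rng(1)
pre = rng.normal(0.0, 1.0, size=500)   # in-control segment
post = rng.normal(1.5, 1.0, size=100)  # mean shifts upward at t = 500
print(page_cusum(np.concatenate([pre, post])))  # alarm shortly after t = 500
```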

    A DATA ANALYTICAL FRAMEWORK FOR IMPROVING REAL-TIME, DECISION SUPPORT SYSTEMS IN HEALTHCARE

    In this dissertation we develop a framework that combines data mining, statistics and operations research methods for improving real-time decision support systems in healthcare. Our approach consists of three main components: data gathering and preprocessing, modeling, and deployment. We introduce the notion of offline and semi-offline modeling to differentiate between models that are based on known baseline behavior and those based on a baseline with missing information. We apply and illustrate the framework in two important healthcare contexts: biosurveillance and kidney allocation. In the biosurveillance context, we address the problem of early detection of disease outbreaks. We discuss integer programming-based univariate monitoring approaches as well as statistical and operations research-based multivariate monitoring approaches, and we assess method performance on authentic biosurveillance data. In the kidney allocation context, we present a two-phase model that combines an integer programming-based learning phase and a data-analytics-based real-time phase. We examine and evaluate our method on the current Organ Procurement and Transplantation Network (OPTN) waiting list. In both contexts, we show that our framework produces significant improvements over existing methods.
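
    As a point of reference for the univariate monitoring discussion, below is a minimal sketch of an EWMA chart applied to daily syndrome counts, a standard statistical scheme in biosurveillance. It is a generic textbook method, not the dissertation's integer programming-based approach, and the baseline parameters and simulated outbreak are illustrative.

```python
import numpy as np

def ewma_alarms(counts, mu0, sigma0, lam=0.2, L=3.0):
    """Flag days whose EWMA of daily counts exceeds the upper control limit.

    z_t = lam * x_t + (1 - lam) * z_{t-1}, started at the baseline mean mu0;
    the steady-state limit is mu0 + L * sigma0 * sqrt(lam / (2 - lam)).
    """
    ucl = mu0 + L * sigma0 * np.sqrt(lam / (2.0 - lam))
    z, alarms = mu0, []
    for t, x in enumerate(counts):
        z = lam * x + (1.0 - lam) * z
        if z > ucl:
            alarms.append(t)
    return alarms

rng = np.random.default_rng(2)
counts = rng.poisson(20, size=120).astype(float)  # baseline daily syndrome counts
counts[90:] += 8.0                                # simulated outbreak from day 90
print(ewma_alarms(counts, mu0=20.0, sigma0=np.sqrt(20.0)))  # alarms soon after day 90
```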

    Syndromic surveillance: reports from a national conference, 2003

    Overview of Syndromic Surveillance
    - What is Syndromic Surveillance?
    - Linking Better Surveillance to Better Outcomes
    - Review of the 2003 National Syndromic Surveillance Conference: Lessons Learned and Questions To Be Answered

    System Descriptions
    - New York City Syndromic Surveillance Systems
    - Syndrome and Outbreak Detection Using Chief-Complaint Data: Experience of the Real-Time Outbreak and Disease Surveillance Project
    - Removing a Barrier to Computer-Based Outbreak and Disease Surveillance: The RODS Open Source Project
    - National Retail Data Monitor for Public Health Surveillance
    - National Bioterrorism Syndromic Surveillance Demonstration Program
    - Daily Emergency Department Surveillance System: Bergen County, New Jersey
    - Hospital Admissions Syndromic Surveillance: Connecticut, September 2001-November 2003
    - BioSense: A National Initiative for Early Detection and Quantification of Public Health Emergencies
    - Syndromic Surveillance at Hospital Emergency Departments: Southeastern Virginia

    Research Methods
    - Bivariate Method for Spatio-Temporal Syndromic Surveillance
    - Role of Data Aggregation in Biosurveillance Detection Strategies with Applications from ESSENCE
    - Scan Statistics for Temporal Surveillance for Biologic Terrorism
    - Approaches to Syndromic Surveillance When Data Consist of Small Regional Counts
    - Algorithm for Statistical Detection of Peaks: Syndromic Surveillance System for the Athens 2004 Olympic Games
    - Taming Variability in Free Text: Application to Health Surveillance
    - Comparison of Two Major Emergency Department-Based Free-Text Chief-Complaint Coding Systems
    - How Many Illnesses Does One Emergency Department Visit Represent? Using a Population-Based Telephone Survey To Estimate the Syndromic Multiplier
    - Comparison of Office Visit and Nurse Advice Hotline Data for Syndromic Surveillance: Baltimore-Washington, D.C., Metropolitan Area, 2002
    - Progress in Understanding and Using Over-the-Counter Pharmaceuticals for Syndromic Surveillance

    Evaluation
    - Evaluation Challenges for Syndromic Surveillance: Making Incremental Progress
    - Measuring Outbreak-Detection Performance By Using Controlled Feature Set Simulations
    - Evaluation of Syndromic Surveillance Systems: Design of an Epidemic Simulation Model
    - Benchmark Data and Power Calculations for Evaluating Disease Outbreak Detection Methods
    - Bio-ALIRT Biosurveillance Detection Algorithm Evaluation
    - ESSENCE II and the Framework for Evaluating Syndromic Surveillance Systems
    - Conducting Population Behavioral Health Surveillance by Using Automated Diagnostic and Pharmacy Data Systems
    - Evaluation of an Electronic General-Practitioner-Based Syndromic Surveillance System
    - National Symptom Surveillance Using Calls to a Telephone Health Advice Service: United Kingdom, December 2001-February 2003
    - Field Investigations of Emergency Department Syndromic Surveillance Signals: New York City
    - Should We Be Worried? Investigation of Signals Generated by an Electronic Syndromic Surveillance System: Westchester County, New York

    Public Health Practice
    - Public Health Information Network: Improving Early Detection by Using a Standards-Based Approach to Connecting Public Health and Clinical Medicine
    - Information System Architectures for Syndromic Surveillance
    - Perspective of an Emergency Physician Group as a Data Provider for Syndromic Surveillance
    - SARS Surveillance Project: Internet-Enabled Multiregion Surveillance for Rapidly Emerging Disease
    - Health Information Privacy and Syndromic Surveillance Systems

    Papers from the second annual National Syndromic Surveillance Conference, convened by the New York City Department of Health and Mental Hygiene, the New York Academy of Medicine, and the CDC in New York City, October 23-24, 2003. Published as the September 24, 2004 supplement to vol. 53 of MMWR (Morbidity and Mortality Weekly Report).

    A study of new and advanced control charts for two categories of time related processes

    Ph.D. (Doctor of Philosophy)

    Smart Sensing in Advanced Manufacturing Processes: Statistical Modeling and Implementations for Quality Assurance and Automation

    With recent breakthroughs in sensing technology, data informatics and computer networks, smart manufacturing, with its intertwined advanced computation, communication and control techniques, is transforming conventional discrete manufacturing processes into the new paradigm of cyber-physical manufacturing systems. Cybermanufacturing systems should be predictive and instantly responsive, preventing incidents in order to assure quality. Providing viable in-process monitoring approaches for real-time quality assurance is therefore an essential research topic in cybermanufacturing systems: such approaches allow closed-loop control of the processes, ensure the quality of products and consequently improve overall shop-floor efficiency. However, such in-process monitoring tools remain underdeveloped on the following counts:
    • For precision/ultraprecision machining processes, most sensor-based change detection approaches are insensitive to anomalies because they largely rest on a stationarity assumption, whereas the underlying dynamics of precision machining exhibit intermittent patterns. Existing approaches are therefore ill-equipped to detect the subtle variations that are detrimental to the process.
    • For shaping processes that realize complicated geometries, there is currently no viable tool for noncontact monitoring of surface morphology evolution that measures critical dimensioning criteria in real time.
    • For precision machining processes, advanced smart sensing approaches are needed to characterize the process, specifically the microdynamics reflecting the fundamental cutting mechanisms as well as variations in the microstructure of the material surfaces.
    To address these gaps, this dissertation makes the following contributions:
    • For precision and ultraprecision machining processes, an in-situ anomaly detection approach is provided that allows instant prevention of surface deterioration. The method can be applied to various (ultra)precision processes whose underlying systems are mostly unknown and exhibit intermittency. Extensive experimental studies suggest that the developed model can detect in-situ anomalies in the underlying dynamic intermittency.
    • For shaping processes that require noncontact in-process monitoring, a vision-based monitoring approach is presented that rapidly measures geometric features of sheet-based workpieces during forming. Investigations into laser origami sheet forming suggest that the approach can provide precise geometric measurements as real-time feedback for the control loop of sheet forming processes in cybermanufacturing systems.
    • For smart sensing in precision machining, an advanced in-process sensing/monitoring approach [including implementations of an Acoustic Emission (AE) sensor, the associated data acquisition system and advanced machine/deep learning methods] is introduced to connect AE characteristics to the microdynamics of the precision machining of natural fiber reinforced composites. The presented smart sensing framework shows potential for real-time estimation/prediction of the microdynamics of machining processes from AE features (see the sketch below).
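
    The following is a generic sketch of windowed feature extraction from an AE waveform, the kind of preprocessing step that typically feeds machine/deep learning models in such frameworks. The feature set, window length and sampling rate are illustrative assumptions, not the dissertation's pipeline.

```python
import numpy as np

def ae_features(signal, fs, win_ms=1.0):
    """Split an AE waveform into fixed windows and compute simple per-window features."""
    win = int(fs * win_ms / 1000.0)           # samples per window
    feats = []
    for i in range(len(signal) // win):
        w = signal[i * win:(i + 1) * win]
        rms = np.sqrt(np.mean(w ** 2))        # energy-related feature
        peak = np.max(np.abs(w))              # burst amplitude
        zc = np.count_nonzero(np.diff(np.signbit(w).astype(np.int8)))  # zero crossings
        feats.append((rms, peak, zc))
    return np.array(feats)

rng = np.random.default_rng(3)
fs = 1_000_000                                 # 1 MHz sampling rate (illustrative)
signal = rng.normal(0.0, 0.01, size=fs // 10)  # 100 ms of background noise
signal[50_000:50_500] += 0.2 * np.sin(np.linspace(0.0, 40.0 * np.pi, 500))  # simulated AE burst
features = ae_features(signal, fs)
print(np.flatnonzero(features[:, 1] > 0.05))   # windows containing the burst stand out
```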

    Vol. 15, No. 2 (Full Issue)


    Data quality in health research: the development of methods to improve the assessment of temporal data quality in electronic health records

    Background: Electronic health records (EHRs) are increasingly used in medical research, but the prevalence of temporal artefacts that may bias study findings is not widely understood or reported. Furthermore, methods aimed at efficient and transparent assessment of temporal data quality in EHR datasets are unfortunately lacking.
    Methods: 7959 time series representing different measures of data quality were generated from eight different EHR data extracts covering activity between 1986 and 2019 at a large UK hospital group. These time series were visually inspected and annotated via a citizen-science crowd-sourcing platform, and consensus labels for the locations of all change points (i.e. places where the distribution of data values changed suddenly and unpredictably) were constructed using density-based clustering with noise. The crowd-sourced consensus labels were validated against labels produced by an experienced data scientist, and a diverse range of automated change point detection methods were assessed for accuracy against these consensus labels using a novel approximation to a binary classifier. Lastly, an R package was developed to facilitate assessment of temporal data quality in EHR datasets.
    Results: Over 2000 volunteers participated in the citizen-science project, performing 341,800 visual inspections of the time series. A total of 4477 distinct change points were identified across the eight data extracts, covering almost every year of data and virtually all data fields. Compared to expert labels, crowd-sourced consensus labels identifying the locations of individual change points had high sensitivity 80.4% (95% CI 77.1, 83.3), specificity 99.8% (99.7, 99.8), positive predictive value (PPV) 84.5% (81.4, 87.2) and negative predictive value (NPV) 99.7% (99.6, 99.7). Automated change point detection methods failed to detect the crowd-sourced change points accurately, with maximum sensitivity 36.9% (35.2, 38.8), specificity 100% (100, 100), PPV 51.6% (49.4, 53.8) and NPV 99.9% (99.9, 99.9).
    Conclusions: This large study of real-world EHRs found that temporal artefacts occurred with very high frequency, which could impact findings from analyses using these data. Crowd-sourced labels of change points compared favourably to expert labels, but currently available automated methods performed poorly at identifying such artefacts when compared to human visual inspection. To improve the reproducibility and transparency of studies using EHRs, thorough visual assessment of temporal data quality should be conducted and reported, assisted by tools such as the new daiquiri R package developed as part of this thesis.
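
    To illustrate how change point labels can be scored against a reference set, here is a generic tolerance-window matching sketch of the kind that underlies sensitivity and PPV figures like those above; it is a common evaluation pattern, not the thesis's specific binary-classifier approximation, and the tolerance value is an illustrative assumption.

```python
def match_changepoints(detected, reference, tol=7):
    """Greedily match detected change point locations to reference labels within +/- tol.

    Returns (true_positives, false_positives, false_negatives), from which
    sensitivity = tp / (tp + fn) and PPV = tp / (tp + fp) follow.
    """
    unmatched = sorted(reference)
    tp = 0
    for d in sorted(detected):
        hit = next((r for r in unmatched if abs(d - r) <= tol), None)
        if hit is not None:
            unmatched.remove(hit)
            tp += 1
    return tp, len(detected) - tp, len(unmatched)

tp, fp, fn = match_changepoints(detected=[102, 250, 395], reference=[100, 255, 390, 600])
print(tp, fp, fn)                      # 3 0 1
print(tp / (tp + fn), tp / (tp + fp))  # sensitivity 0.75, PPV 1.0
```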

    Research and Development of a General Purpose Instrument DAQ-Monitoring Platform applied to the CLOUD/CERN experiment

    The current scientific environment has experimentalists and system administrators allocating large amounts of time to data access, parsing and gathering, as well as to instrument management. This challenge is growing, given the increasing number of large collaborations with significant instrument resources, remote instrumentation sites and continuously improved and upgraded scientific instruments. DAQBroker is a new software platform designed to monitor networks of scientific instruments while also providing simple data access methods for any user. Data can be stored in one or several local or remote databases running on any of the most popular relational database systems (MySQL, PostgreSQL, Oracle). The platform also provides the tools needed to create and edit the metadata associated with different instruments, to manipulate data and to generate events based on instrument measurements, regardless of the user's familiarity with the individual instruments. Time series stored in a DAQBroker database also benefit from several statistical methods for time series classification, comparison and event detection, as well as multivariate time series analysis methods that determine the most statistically relevant time series, rank the most influential time series and identify the periods of greatest activity within specific experimental periods. This thesis presents the architecture behind the framework, assesses its performance under controlled conditions and presents a use case from the CLOUD experiment at CERN, Switzerland. The univariate and multivariate time series statistical methods applied in this framework are also studied.
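
    As a rough, self-contained illustration of the event-generation idea (rules evaluated over instrument measurements stored in a relational database), here is a minimal sketch. It uses SQLite so the example runs without a server, whereas DAQBroker itself targets MySQL, PostgreSQL or Oracle; the table schema and the threshold rule are invented for the example.

```python
import sqlite3

# Illustrative schema: one table of timestamped instrument readings.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (instrument TEXT, t REAL, value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [("pressure_gauge", float(t), 1.0 + (0.5 if t > 6 else 0.0)) for t in range(10)],
)

def threshold_events(conn, instrument, limit):
    """Emit one event per reading of `instrument` whose value exceeds `limit`."""
    rows = conn.execute(
        "SELECT t, value FROM readings WHERE instrument = ? AND value > ? ORDER BY t",
        (instrument, limit),
    )
    return [{"instrument": instrument, "t": t, "value": v} for t, v in rows]

print(threshold_events(conn, "pressure_gauge", 1.2))  # events at t = 7, 8, 9
```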