102 research outputs found

    Distribution-free cumulative sum control charts using bootstrap-based control limits

    This paper deals with phase II, univariate, statistical process control when a set of in-control data is available, and when both the in-control and out-of-control distributions of the process are unknown. Existing process control techniques typically require substantial knowledge about the in-control and out-of-control distributions of the process, which is often difficult to obtain in practice. We propose (a) using a sequence of control limits for the cumulative sum (CUSUM) control charts, where the control limits are determined by the conditional distribution of the CUSUM statistic given the last time it was zero, and (b) estimating the control limits by bootstrap. Traditionally, the CUSUM control chart uses a single control limit, which is obtained under the assumption that the in-control and out-of-control distributions of the process are Normal. When the normality assumption is not valid, which is often true in applications, the actual in-control average run length, defined to be the expected time duration before the control chart signals a process change, is quite different from the nominal in-control average run length. This limitation is mostly eliminated in the proposed procedure, which is distribution-free and robust against different choices of the in-control and out-of-control distributions. Comment: Published at http://dx.doi.org/10.1214/08-AOAS197 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)
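
    The idea in the abstract above, a CUSUM chart whose control limit depends on the time since the statistic was last zero, with limits estimated by bootstrap, can be illustrated with a rough sketch. This is not the authors' procedure: the function names, the reference value `k`, the cap `max_t`, the false-alarm level `alpha`, and the simple quantile rule are all assumptions made for illustration.

```python
import random

def cusum_path(xs, k):
    """Upper one-sided CUSUM: C_n = max(0, C_{n-1} + x_n - k).
    Returns (C_n, T_n) pairs, where T_n counts steps since C was last zero."""
    c, t, out = 0.0, 0, []
    for x in xs:
        c = max(0.0, c + x - k)
        t = t + 1 if c > 0 else 0
        out.append((c, t))
    return out

def bootstrap_limits(in_control, k, alpha=0.01, max_t=5,
                     n_boot=2000, path_len=50, seed=0):
    """Estimate one control limit per value of T (capped at max_t) as the
    (1 - alpha) quantile of C given T, using resampled in-control paths.
    Hypothetical simplification, not the published algorithm."""
    rng = random.Random(seed)
    by_t = {j: [] for j in range(1, max_t + 1)}
    for _ in range(n_boot):
        sample = [rng.choice(in_control) for _ in range(path_len)]
        for c, t in cusum_path(sample, k):
            if t >= 1:
                by_t[min(t, max_t)].append(c)
    limits = {}
    for j, vals in by_t.items():
        vals.sort()
        if vals:
            idx = min(len(vals) - 1, int((1 - alpha) * len(vals)))
            limits[j] = vals[idx]
        else:
            limits[j] = float("inf")  # no bootstrap evidence: never signal
    return limits

def first_signal(xs, k, limits):
    """Return the 1-based index of the first signal, or None if none occurs."""
    max_t = max(limits)
    for n, (c, t) in enumerate(cusum_path(xs, k), start=1):
        if t >= 1 and c > limits[min(t, max_t)]:
            return n
    return None
```

    Resampling individual observations assumes the in-control data are independent; for autocorrelated data a block bootstrap would be the natural substitute.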

    Nonparametric (distribution-free) control charts : an updated overview and some results

    Control charts that are based on assumption(s) of a specific form for the underlying process distribution are referred to as parametric control charts. There are many applications where there is insufficient information to justify such assumption(s) and, consequently, control charting techniques with a minimal set of distributional assumption requirements are in high demand. To this end, nonparametric or distribution-free control charts have been proposed in recent years. The charts have stable in-control properties, are robust against outliers and can be surprisingly efficient in comparison with their parametric counterparts. Chakraborti and some of his colleagues provided review papers on nonparametric control charts in 2001, 2007 and 2011, respectively. These papers have been received with considerable interest and attention by the community. However, the literature on nonparametric statistical process/quality control/monitoring has grown exponentially and because of this rapid growth, an update is deemed necessary. In this article, we bring these reviews forward to 2017, discussing some of the latest developments in the area. Moreover, unlike the past reviews, which did not include the multivariate charts, here we review both univariate and multivariate nonparametric control charts. We end with some concluding remarks. https://www.tandfonline.com/loi/lqen20

    Nonparametric monitoring of sunspot number observations: a case study

    Solar activity is an important driver of long-term climate trends and must be accounted for in climate models. Unfortunately, direct measurements of this quantity over long periods do not exist. The only observations related to solar activity whose records reach back to the seventeenth century are sunspots. Surprisingly, determining the number of sunspots consistently over time has remained, until today, a challenging statistical problem. It arises from the need to consolidate data from multiple observing stations around the world in a context of low signal-to-noise ratios, non-stationarity, missing data, non-standard distributions and many kinds of errors. The data from some stations therefore experience severe and varied deviations over time. In this paper, we propose the first systematic and thorough statistical approach for monitoring these complex and important series. It consists of three steps essential for successful treatment of the data: smoothing on multiple timescales, monitoring using block bootstrap calibrated CUSUM charts, and classifying out-of-control situations by support vector techniques. This approach allows us to detect a wide range of anomalies (such as sudden jumps or more progressive drifts), unseen in previous analyses. It helps us to identify the causes of major deviations, which are often observer or equipment related. Their detection and identification will contribute to improving future observations. Their elimination or correction in past data will lead to a more precise reconstruction of the world reference index for solar activity: the International Sunspot Number. Comment: 27 pages (without appendices), 6 figure

    Nonparametric signed‐rank control charts with variable sampling intervals

    Variable sampling interval (VSI) charts have been proposed in the literature for normal theory (parametric) control charts and are known to provide performance enhancements. In the VSI setting, the time between monitored samples is allowed to vary depending on what is observed in the current sample. Nonparametric (distribution‐free) control charts have recently come to play an important role in statistical process control and monitoring. In this paper a nonparametric Shewhart‐type VSI control chart is considered for detecting changes in a specified location parameter. The proposed chart is based on the Wilcoxon signed‐rank statistic and is called the VSI signed‐rank chart. The VSI signed‐rank chart is compared with an existing fixed sampling interval signed‐rank chart, the parametric VSI X‐chart, and the nonparametric VSI sign chart. Results show that the VSI signed‐rank chart often performs favourably and should be used. The South African Research Chairs Initiative at the University of Pretoria and by the Department of Information Systems, Statistics and Management Science, University of Alabama. Marien Graham's research was also supported by the National Research Foundation (Thuthuka programme: TTK14061168807; grant number: 94102), SARCHI Award to the third author from the National Research Foundation. http://wileyonlinelibrary.com/journal/qre
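
    As a rough illustration of the VSI signed-rank idea described above, the sketch below computes the Wilcoxon signed-rank statistic of a sample about a target value and then picks the waiting time until the next sample. The control limit `ucl`, the warning limit `warn`, and the two interval lengths are hypothetical placeholders, not values from the paper.

```python
def signed_rank(sample, theta0):
    """Wilcoxon signed-rank statistic: sum of sign(x - theta0) * rank(|x - theta0|).
    Tied absolute differences receive their average rank."""
    diffs = [x - theta0 for x in sample]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of equal absolute differences.
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for m in range(i, j + 1):
            ranks[order[m]] = avg
        i = j + 1
    total = 0.0
    for d, r in zip(diffs, ranks):
        if d > 0:
            total += r
        elif d < 0:
            total -= r
    return total

def vsi_decision(sr, ucl, warn, short_interval, long_interval):
    """Shewhart-type rule with two sampling intervals (all limits hypothetical):
    signal at or beyond ucl, sample sooner inside the warning zone,
    otherwise wait the long interval."""
    if abs(sr) >= ucl:
        return ("signal", None)
    if abs(sr) >= warn:
        return ("continue", short_interval)
    return ("continue", long_interval)
```

    For the sample `[1.2, -0.4, 0.9, 2.0, -0.1]` with target 0, `signed_rank` gives 9.0; with the hypothetical limits `ucl=15` and `warn=9`, `vsi_decision` would then ask for the next sample after the short interval.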

    On the average run length to false alarm in surveillance problems which possess an invariance structure

    Real-Time Machine Learning for Quickest Detection

    Safety-critical Cyber-Physical Systems (CPS) require real-time machine learning for control and decision making. One promising solution is to use deep learning to discover useful patterns for event detection from heterogeneous data. However, deep learning algorithms encounter challenges in CPS with assurability requirements: 1) decision explainability, 2) real-time and quickest event detection, and 3) time-efficient incremental learning. To address these obstacles, I developed a real-time Machine Learning Framework for Quickest Detection (MLQD). Specifically, I first propose the zero-bias neural network, which removes decision bias and preferabilities from regular neural networks and provides an interpretable decision process. Second, I discover the latent space characteristic of the zero-bias neural network and a method to mathematically convert a Deep Neural Network (DNN) classifier into a performance-assured binary abnormality detector. In this way, I can seamlessly integrate the deep neural networks' data processing capability with Quickest Detection (QD) and provide a real-time sequential event detection paradigm. Third, I find that a critical factor impeding the incremental learning of neural networks is concept interference (confusion) in latent space, and I prove that, to minimize interference, the concept representation vectors (class fingerprints) within the latent space need to be organized orthogonally. Using these findings, I devise a new incremental learning strategy that lets deep neural networks in the CPS evolve efficiently without retraining. All my algorithms are evaluated on real-world applications: ADS-B (Automatic Dependent Surveillance Broadcasting) signal identification and spoofing detection in the aviation communication system. Finally, I discuss the current trends in MLQD and conclude this dissertation by presenting future research directions and applications.
    In summary, the innovations of this dissertation are as follows: i) I propose the zero-bias neural network, which provides transparent latent space characteristics, and apply it to solve the wireless device identification problem. ii) I discover and prove the orthogonal memory organization mechanism in artificial neural networks and apply this mechanism in time-efficient incremental learning. iii) I discover and mathematically prove the converging point theorem, with which we can predict the latent space topological characteristics and estimate the topological maturity of neural networks. iv) I bridge the gap between machine learning and quickest detection with assurable performance.

    Robust Statistical Inference Through the Lens of Optimization

    Statistical signal processing and hypothesis testing are fundamental problems in modern data science and engineering applications. This thesis mainly focuses on developing new theories and algorithms for three research problems in the area of robust statistical inference. The first problem we study is sequential change detection. We consider the subspace change for the covariance matrix of high-dimensional data sequences, which is a fundamental problem since subspace structure is commonly used for modeling high-dimensional data. We also consider a non-parametric setting that can be useful when the data distributions cannot be easily represented by simple parametric families, and the weighted L2 divergence is proposed to detect the change. The second problem we study is data-driven robust hypothesis testing when the true data-generating distributions are all unknown and we only have access to a limited number of training samples. A strong duality result is proved and used to find the robust optimal test by convex optimization. The third problem is parameter recovery for spatio-temporal models by solving variational inequalities, with an application example in modeling crime events. Ph.D.

    A performance analysis of multivariate nonparametric control charts

    Robust and efficient multivariate control charts are not common in the literature. This report explores the versatility of the few distribution-free, nonparametric multivariate Statistical Process Control (MSPC) charts suitable for average run length (ARL) analysis. Current datasets are becoming increasingly complex, large, and less likely to follow distributional properties required for traditional parametric statistics, a fact especially true for a multivariate setting. The purpose of our study is to compare the newest available methods, which have not previously been compared with one another, in cases and data structures not yet explored. Due to the versatility and robustness of the types of data these methods can accommodate, finding real-world applications is trivial. The five methods applied here are able to exploit different types of changes to the structure of a distribution, rather than simply detect a mean shift. These methods share similar features: they avoid lengthy data-gathering steps and are applicable in short-run and start-up situations. By establishing cut-off values simultaneously based on input observations, rather than beforehand, the methods apply data-dependent control limits, which demonstrates their truly distribution-free property. One continuing area for improvement is the creation of more computationally efficient algorithms for these methods.

    On Rank Energy Statistics via Optimal Transport: Continuity, Convergence, and Change Point Detection

    This paper considers the use of recently proposed optimal transport-based multivariate test statistics, namely rank energy and its variant the soft rank energy derived from entropically regularized optimal transport, for the unsupervised nonparametric change point detection (CPD) problem. We show that the soft rank energy enjoys both fast rates of statistical convergence and robust continuity properties which lead to strong performance on real datasets. Our theoretical analyses remove the need for resampling and out-of-sample extensions previously required to obtain such rates. In contrast, the rank energy suffers from the curse of dimensionality in statistical estimation and moreover can signal a change point from arbitrarily small perturbations, which leads to a high rate of false alarms in CPD. Additionally, under mild regularity conditions, we quantify the discrepancy between soft rank energy and rank energy in terms of the regularization parameter. Finally, we show our approach performs favorably in numerical experiments compared to several other optimal transport-based methods as well as maximum mean discrepancy. Comment: 36 pages, 5 figure