
    Contributions to improve the power, efficiency and scope of control-chart methods : a thesis submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Statistics at Massey University, Albany, New Zealand

    Listed in the 2019 Dean's List of Exceptional Theses. Detection of outliers and other anomalies in multivariate datasets is a particularly difficult problem which spans across a range of systems, such as quality control in factories, microarrays or proteomic analyses, identification of features in image analysis, identifying unauthorized access in network traffic patterns, and detection of changes in ecosystems. Multivariate control charts (MCC) are popular and sophisticated statistical process control (SPC) methods for monitoring characteristics of interest and detecting changes in a multivariate process. These methods are divided into memory-less and memory-type charts which are used to monitor large and small-to-moderate shifts in the process, respectively. For example, the multivariate χ2 is a memory-less control chart that uses only the most current process information and disregards any previous observations; it is typically used where any shifts in the process mean are expected to be relatively large. To increase the sensitivity of the multivariate process control tool for the detection of small-to-moderate shifts in the process mean vector, different multivariate memory-type tools that use information from both the current and previous process observations have been proposed. These tools have proven very useful for multivariate independent normal or "nearly" normal distributed processes. Like most univariate control-chart methods, when the process parameters (i.e., the process mean vector or covariance parameters, or both) are unknown, then MCC methods are based on estimated parameters, and their implementation occurs in two phases. In Phase I (retrospective phase), a historical reference sample is studied to establish the characteristics of the in-control state and evaluate the stability of the process.
Once the in-control reference sample has been deemed to be stable, the process parameters are estimated from Phase I, and control chart limits are obtained for use in Phase II. The Phase II aspect initiates ongoing regular monitoring of the process. If successive observed values obtained at the beginning of Phase II fall within specified desired in-control limits, the process is considered to be in control. In contrast, any observed values during Phase II which fall outside the specified control limits indicate that the process may be out of control, and remedial responses are then required. Although conventional MCC are well developed from a statistical point of view, they can be difficult to apply in modern, data-rich contexts. This serious drawback comes from the fact that classical MCC plotting statistics require the inversion of the covariance matrix, which is typically assumed to be known. In practice, the covariance matrix is seldom known and often empirically estimated, using a sample covariance matrix from historical data. While the empirical estimate of the covariance matrix may be an unbiased and consistent estimator for a low-dimensional data matrix with an adequate prior sample size, it performs inconsistently in high-dimensional settings. In particular, the empirical estimate of the covariance matrix can lead to inflated false-alarm rates and decreased sensitivity of the chart to detect changes in the process. Also, the statistical properties of traditional MCC tools are accurate only if the assumption of multivariate normality is satisfied. However, in many cases, the underlying system is not multivariate normal, and as a result, the traditional charts can be adversely affected. The necessity of this assumption generally restricts the application of traditional control charts to monitoring industrial processes.
Most MCC applications also typically focus on monitoring either the process mean vector or the process variability, and they require that the process mean vector be stable, and that the process variability be independent of the process mean. However, in many real-life processes, the process variability is dependent on the mean, and the mean is not necessarily constant. In such cases, it is more appropriate to monitor the coefficient of variation (CV). The univariate CV is the ratio of the standard deviation to the mean of a random variable. As a measure of dispersion relative to the mean, it is useful for comparing the variability of populations having very different process means. More recently, MCC methods have been adapted for monitoring the multivariate coefficient of variation (CV). However, to date, studies of multivariate CV control charts have focused on power (the detection of out-of-control parameters in Phase II), while no study has investigated their in-control performance in Phase I. The Phase I data set can contain unusual observations, which are problematic as they can influence the parameter estimates, resulting in Phase II control charts with reduced power. Relevant Phase I analysis will guide practitioners with the choice of appropriate multivariate CV estimation procedures when the Phase I data contain contaminated samples. In this thesis, we investigated the performance of the most widely adopted memory-type MCC methods: the multivariate cumulative sum (MCUSUM) and the multivariate exponentially weighted moving average (MEWMA) charts, for monitoring shifts in a process mean vector when the process parameters are unknown and estimated from Phase I (chapters 2 and 3). We demonstrate that using a shrinkage estimate of the covariance matrix improves the run-length performance of these methods, particularly when only a small Phase I sample size is available.
In chapter 4, we investigate the Phase I performance of a variety of multivariate CV charts, considering both diffuse symmetric and localized CV disturbance scenarios, and using probability to signal (PTS) as a performance measure. We present a new memory-type control chart for monitoring the mean vector of a multivariate normally distributed process, namely, the multivariate homogeneously weighted moving average (MHWMA) control chart (chapter 5). We present the design procedure and compare the run length performance of the proposed MHWMA chart for the detection of small shifts in the process mean vector with a variety of other existing MCC methods. We also present a dissimilarity-based distribution-free control chart for monitoring changes in the centroid of a multivariate ecological community (chapter 6). The proposed chart may be used, for example, to discover when an impact may have occurred in a monitored ecosystem, and is based on a change-point method that does not require prior knowledge of the ecosystem's behaviour before the monitoring begins. A novel permutation procedure is employed to obtain the control-chart limits of the proposed charting test-statistic to obtain a suitable distance-based model of the target ecological community through time. Finally, we propose enhancements to some classical univariate control chart tools for monitoring small shifts in the process mean, for those scenarios where the process variable is observed along with a correlated auxiliary variable (chapters 7 through 9). We provide the design structure of the charts and examine their performance in terms of their run length properties. We compare the run length performance of the proposed charts with several existing charts for detecting a small shift in the process mean. 
We offer suggestions on the applications of the proposed charts (in chapters 7 and 8), for cases where the exact measurement of the process variable of interest or the auxiliary variable is difficult or expensive to obtain, but where the rank ordering of its units can be obtained at a negligible cost. Thus, this thesis, in general, will aid practitioners in applying a wider variety of enhanced and novel control chart tools for more powerful and efficient monitoring of multivariate processes. In particular, we develop and test alternative methods for estimating the covariance matrices used by some useful control-chart tools (chapters 2 and 3), give recommendations on the choice of an appropriate multivariate CV chart in Phase I (chapter 4), present an efficient method for monitoring small shifts in the process mean vector (chapter 5), expand MCC analyses to cope with non-normally distributed datasets (chapter 6) and contribute to methods that allow efficient use of an auxiliary variable that is observed and correlated with the process variable of interest (chapters 7 through 9).
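
As an illustration of the two-phase procedure described in the abstract above, the following minimal Python sketch estimates the in-control mean and covariance from a bivariate Phase I sample and then computes a Hotelling-type T² statistic for Phase II observations. It is not the thesis's method: the function names, the `shrink` parameter (a fixed-intensity linear shrinkage of the off-diagonal covariance toward zero), and the use of a chi-square limit with estimated parameters are assumptions made purely for illustration. The closed-form limit applies only to two variables.

```python
import math


def phase1_estimate(reference, shrink=0.0):
    """Estimate the in-control mean and covariance from bivariate Phase I data.

    `shrink` in [0, 1] is a hypothetical, user-chosen intensity that pulls the
    off-diagonal covariance toward zero (shrinkage toward a diagonal target).
    Returns (mean_x, mean_y) and (s_xx, s_xy, s_yy).
    """
    n = len(reference)
    mx = sum(x for x, _ in reference) / n
    my = sum(y for _, y in reference) / n
    sxx = sum((x - mx) ** 2 for x, _ in reference) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in reference) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in reference) / (n - 1)
    return (mx, my), (sxx, (1.0 - shrink) * sxy, syy)


def t2_statistic(obs, mean, cov):
    """Quadratic form (x - mu)' S^{-1} (x - mu) using the explicit 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = obs[0] - mean[0], obs[1] - mean[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def chi2_limit_2df(alpha):
    """Upper-alpha quantile of the chi-square distribution with 2 degrees of
    freedom, which has the closed form -2 ln(alpha)."""
    return -2.0 * math.log(alpha)


if __name__ == "__main__":
    # Made-up Phase I reference sample, for illustration only.
    reference = [(0.1, -0.2), (-0.3, 0.4), (0.5, 0.1), (-0.2, -0.4), (0.0, 0.2)]
    mean, cov = phase1_estimate(reference, shrink=0.2)
    limit = chi2_limit_2df(0.005)
    # Phase II: compare each new observation with the control limit.
    for obs in [(0.1, 0.0), (3.0, -3.0)]:
        t2 = t2_statistic(obs, mean, cov)
        print(obs, round(t2, 2), "signal" if t2 > limit else "in control")
```

Because the parameters here are estimated from a very small reference sample, the chi-square limit is only a rough approximation; this is the situation in which the abstract reports that the choice of covariance estimator matters.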

    Multivariate Statistical Process Control Charts: An Overview

    In this paper we discuss the basic procedures for the implementation of multivariate statistical process control via control charting. Furthermore, we review multivariate extensions for all kinds of univariate control charts, such as multivariate Shewhart-type control charts, multivariate CUSUM control charts and multivariate EWMA control charts. In addition, we review unique procedures for the construction of multivariate control charts, based on multivariate statistical techniques such as principal components analysis (PCA) and partial least squares (PLS). Finally, we describe the most significant methods for the interpretation of an out-of-control signal. Keywords: quality control, process control, multivariate statistical process control, Hotelling's T-square, CUSUM, EWMA, PCA, PLS
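
To make the multivariate EWMA chart mentioned in the abstract above more concrete, here is a minimal Python sketch of the commonly cited MEWMA recursion for two variables. It assumes the in-control mean is zero and the covariance matrix is known. The function name, the default smoothing constant `lam`, and the tuple layout of `cov` are illustrative assumptions, and the control limit (usually obtained from tables or simulation) is deliberately left out.

```python
def mewma_statistics(observations, cov, lam=0.1):
    """Return the MEWMA charting statistic for each bivariate observation.

    Assumes an in-control mean of (0, 0) and a known covariance given as
    (s_xx, s_xy, s_yy). The smoothed vector is
        Z_t = lam * X_t + (1 - lam) * Z_{t-1},   Z_0 = 0,
    and its covariance is lam / (2 - lam) * (1 - (1 - lam)^(2t)) * Sigma.
    """
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    z1 = z2 = 0.0
    stats = []
    for t, (x1, x2) in enumerate(observations, start=1):
        # Exponentially weighted update of each coordinate.
        z1 = lam * x1 + (1.0 - lam) * z1
        z2 = lam * x2 + (1.0 - lam) * z2
        # Time-varying scale factor of the covariance of Z_t.
        scale = lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t))
        # Quadratic form Z_t' (scale * Sigma)^{-1} Z_t via the 2x2 inverse.
        quad = (syy * z1 * z1 - 2.0 * sxy * z1 * z2 + sxx * z2 * z2) / det
        stats.append(quad / scale)
    return stats


if __name__ == "__main__":
    # Made-up data: a small sustained shift in the first coordinate.
    data = [(0.5, 0.0)] * 10
    for t, q in enumerate(mewma_statistics(data, (1.0, 0.0, 1.0), lam=0.2), 1):
        print(t, round(q, 3))
```

Setting `lam=1.0` removes the memory and reduces the statistic to the Shewhart-type quadratic form for the current observation alone.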

    A comparison study of distribution-free multivariate SPC methods for multimode data

    The data-rich environments of industrial applications lead to large amounts of correlated quality characteristics that are monitored using Multivariate Statistical Process Control (MSPC) tools. These variables usually represent heterogeneous quantities that originate from one or multiple sensors and are acquired with different sampling parameters. In this framework, any assumptions relative to the underlying statistical distribution may not be appropriate, and conventional MSPC methods may deliver unacceptable performance. In addition, in many practical applications, the process switches from one operating mode to a different one, leading to a stream of multimode data. Various nonparametric approaches have been proposed for the design of multivariate control charts, but the monitoring of multimode processes remains a challenge for most of them. In this study, we investigate the use of distribution-free MSPC methods based on statistical learning tools. Specifically, we compare the kernel distance-based control chart (K-chart), based on a one-class-classification variant of support vector machines, and a fuzzy neural network method based on the adaptive resonance theory. The performance of the two methods was evaluated using both Monte Carlo simulations and real industrial data. The simulated scenarios include different types of out-of-control conditions to highlight the advantages and disadvantages of the two methods. Real data acquired during a roll grinding process provide a framework for the assessment of the practical applicability of these methods in multimode industrial applications.

    On Data Depth and the Application of Nonparametric Multivariate Statistical Process Control Charts

    The purpose of this article is to summarize recent research results for constructing nonparametric multivariate control charts, with a main focus on data-depth-based control charts. Data depth provides data reduction for large-variable problems in a completely nonparametric way. Several depth measures, including Tukey depth, are shown to be particularly effective for statistical process control when the data deviate from the normality assumption. For detecting slow or moderate shifts in the process target mean, the multivariate version of the EWMA is generally robust to non-normal data, so that nonparametric alternatives may be less often required.
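
The idea of reducing multivariate data to a depth value, as described in the abstract above, can be sketched as follows. The code approximates the halfspace (Tukey) depth of a point in two dimensions by scanning a grid of directions, then converts depth into a rank statistic relative to a reference sample, in the spirit of depth-based rank charts. The function names, the direction grid size, and the toy data are assumptions for illustration; this is not the article's own procedure.

```python
import math


def tukey_depth_2d(point, sample, n_directions=360):
    """Approximate halfspace (Tukey) depth of `point` within a 2-D `sample`.

    For each direction on a regular grid, count the sample points in the
    closed halfplane through `point`; the depth is the smallest such
    fraction. The grid makes this an upper bound on the exact depth.
    """
    px, py = point
    n = len(sample)
    best = n
    for k in range(n_directions):
        theta = 2.0 * math.pi * k / n_directions
        ux, uy = math.cos(theta), math.sin(theta)
        count = sum(
            1 for x, y in sample if (x - px) * ux + (y - py) * uy >= 0.0
        )
        best = min(best, count)
    return best / n


def rank_statistic(point, reference, n_directions=360):
    """Fraction of reference points whose depth does not exceed the depth of
    `point`. Values near zero mark `point` as outlying relative to the
    reference sample."""
    d_new = tukey_depth_2d(point, reference, n_directions)
    d_ref = [tukey_depth_2d(r, reference, n_directions) for r in reference]
    return sum(1 for d in d_ref if d <= d_new) / len(reference)


if __name__ == "__main__":
    # Toy reference sample: four corners of a square plus its centre.
    reference = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0), (0.0, 0.0)]
    for new_point in [(0.0, 0.0), (10.0, 10.0)]:
        print(new_point, rank_statistic(new_point, reference))
```

A chart built on such ranks would signal when the rank statistic falls below a chosen false-alarm level; no distributional form is assumed for the data, only that the reference sample represents the in-control state.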

    A self-learning algorithm for biased molecular dynamics

    A new self-learning algorithm for accelerated dynamics, reconnaissance metadynamics, is proposed that is able to work with a very large number of collective coordinates. Acceleration of the dynamics is achieved by constructing a bias potential in terms of a patchwork of one-dimensional, locally valid collective coordinates. These collective coordinates are obtained from trajectory analyses so that they adapt to any new features encountered during the simulation. We show how this methodology can be used to enhance sampling in real chemical systems, citing examples both from the physics of clusters and from the biological sciences. Comment: 6 pages, 5 figures + 9 pages of supplementary information