
    A semi-empirical Bayesian chart to monitor Weibull percentiles

    This paper develops a Bayesian control chart for the percentiles of the Weibull distribution when both its in-control and out-of-control parameters are unknown. The Bayesian approach improves parameter estimates for the small sample sizes that occur when monitoring rare events, as in high-reliability applications or genetic mutations. The chart monitors the parameters of the Weibull distribution directly, instead of transforming the data, as most Weibull-based charts do to comply with a normality assumption. The chart uses the whole accumulated knowledge: the likelihood of the current sample combined with the information given by both the initial prior and all past samples. The chart is adaptive, since its control limits change (e.g., narrow) during Phase I. An example is presented and good Average Run Length properties are demonstrated. In addition, the paper gives insight into the nature of monitoring Weibull processes by highlighting the relationship between distribution and process parameters.
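    To make the mechanics concrete, below is a minimal sketch (not the authors' implementation) of sequential Bayesian updating for Weibull percentile monitoring, using a grid approximation over the shape and scale parameters; the grid ranges, flat prior, subgroup size, and credible level are all illustrative assumptions.

```python
# Minimal sketch: grid-based Bayesian updating of Weibull (shape, scale),
# with a credible interval for a chosen percentile acting as the
# (narrowing) Phase I control limits. All settings are illustrative.
import numpy as np
from scipy import stats

shapes = np.linspace(0.5, 5.0, 80)     # grid over Weibull shape k
scales = np.linspace(0.5, 5.0, 80)     # grid over Weibull scale s
log_post = np.zeros((80, 80))          # flat prior on the grid

def update(log_post, sample):
    """Add the log-likelihood of a new subgroup to the log-posterior."""
    for i, k in enumerate(shapes):
        for j, s in enumerate(scales):
            log_post[i, j] += stats.weibull_min.logpdf(sample, k, scale=s).sum()
    return log_post - log_post.max()   # shift for numerical stability

def percentile_interval(log_post, p=0.1, cred=0.99):
    """Credible interval for the p-th percentile t_p = s * (-ln(1-p))^(1/k)."""
    post = np.exp(log_post)
    post /= post.sum()
    K, S = np.meshgrid(shapes, scales, indexing="ij")
    tp = S * (-np.log(1 - p)) ** (1 / K)
    order = np.argsort(tp.ravel())
    cdf = np.cumsum(post.ravel()[order])
    lo = tp.ravel()[order][np.searchsorted(cdf, (1 - cred) / 2)]
    hi = tp.ravel()[order][np.searchsorted(cdf, (1 + cred) / 2)]
    return lo, hi

rng = np.random.default_rng(1)
for t in range(5):                     # Phase I: limits narrow as data accrue
    log_post = update(log_post, rng.weibull(1.8, size=5) * 2.0)
    print(t, percentile_interval(log_post))
```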

    Parametric, Nonparametric, and Semiparametric Linear Regression in Classical and Bayesian Statistical Quality Control

    Statistical process control (SPC) is used in many fields to understand and monitor processes of interest, such as manufacturing, public health, and network traffic. SPC is divided into two phases: in Phase I, historical data are used to inform parameter estimates for a statistical model, and Phase II implements this statistical model to monitor a live, ongoing process. Within both phases, profile monitoring is a method for understanding the functional relationship between response and explanatory variables by estimating and tracking its parameters. In profile monitoring, control charts are often used as graphical tools to visually observe process behavior. We construct a practitioner's guide providing a step-by-step application of parametric, nonparametric, and semiparametric methods in profile monitoring, creating an in-depth guideline for novice practitioners. We then consider the commonly used cumulative sum (CUSUM), multivariate CUSUM (mCUSUM), exponentially weighted moving average (EWMA), and multivariate EWMA (mEWMA) charts under a Bayesian framework for monitoring respiratory-disease-related hospitalizations and global suicide rates with parametric, nonparametric, and semiparametric linear models.
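    For readers new to these charts, here is a minimal univariate sketch of the CUSUM and EWMA statistics named above (their multivariate and Bayesian extensions build on the same recursions); the reference value k, decision limit h, smoothing weight lam, and width L are textbook defaults, not the dissertation's settings.

```python
# Minimal one-dimensional CUSUM and EWMA charts for a stream with known
# in-control mean mu0 and standard deviation sigma (illustrative only).
import numpy as np

def cusum(x, mu0, sigma, k=0.5, h=5.0):
    """Two-sided CUSUM; returns index of first signal, or None."""
    z = (np.asarray(x) - mu0) / sigma
    cp = cm = 0.0
    for t, zt in enumerate(z):
        cp = max(0.0, cp + zt - k)     # upper cumulative statistic
        cm = max(0.0, cm - zt - k)     # lower cumulative statistic
        if cp > h or cm > h:
            return t
    return None

def ewma(x, mu0, sigma, lam=0.2, L=3.0):
    """EWMA chart with exact time-varying limits; returns first signal, or None."""
    w = mu0
    for t, xt in enumerate(np.asarray(x)):
        w = lam * xt + (1 - lam) * w
        var = sigma**2 * lam / (2 - lam) * (1 - (1 - lam) ** (2 * (t + 1)))
        if abs(w - mu0) > L * np.sqrt(var):
            return t
    return None

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 50), rng.normal(1, 1, 50)])  # shift at t=50
print(cusum(x, 0, 1), ewma(x, 0, 1))
```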

    A Bayesian Scheme to Detect Changes in the Mean of a Short Run Process


    Predictive ratio CUSUM (PRC): A Bayesian approach in online change point detection of short runs

    The online quality monitoring of a process with low-volume data is a very challenging task, and attention is most often placed on detecting when some of the underlying (unknown) process parameters experience a persistent shift. Self-starting methods, in both the frequentist and the Bayesian domain, aim to offer a solution. Adopting the latter perspective, we propose a general closed-form Bayesian scheme in which the testing procedure is built on a memory-based control chart that relies on the cumulative ratios of sequentially updated predictive distributions. The theoretical framework can accommodate any likelihood from the regular exponential family, and the use of conjugate analysis allows closed-form modeling. Power priors offer the axiomatic framework for incorporating different sources of information into the model, when available. A simulation study evaluates the performance against competitors and examines aspects of prior sensitivity. Technical details and algorithms are provided as supplementary material.
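    As a flavor of such a scheme (a simplified sketch, not the paper's exact PRC statistic), the snippet below scores each new Poisson count by the log-ratio of two sequentially updated posterior predictive distributions, one under an assumed rate inflation, and accumulates the ratios CUSUM-style; the Gamma prior, shift size delta, and threshold h are illustrative assumptions.

```python
# Simplified predictive-ratio sketch for Poisson counts with a conjugate
# Gamma(a, b) prior; NOT the published PRC design.
import numpy as np
from scipy.stats import nbinom

def log_pred(x, a, b):
    """Poisson-Gamma posterior predictive: NegBin(a, b/(b+1))."""
    return nbinom.logpmf(x, a, b / (b + 1.0))

def prc_like(xs, a=1.0, b=1.0, delta=2.0, h=4.0):
    """Return the index of the first alarm, or None."""
    s = 0.0
    for t, x in enumerate(xs):
        # scaling the Gamma shape by delta is one simple way to encode a
        # delta-fold inflation of the predictive mean a/b (an assumption)
        s = max(0.0, s + log_pred(x, a * delta, b) - log_pred(x, a, b))
        if s > h:
            return t
        a, b = a + x, b + 1.0          # conjugate update, in-control model
    return None

rng = np.random.default_rng(3)
xs = np.concatenate([rng.poisson(4, 30), rng.poisson(8, 30)])  # rate doubles at t=30
print(prc_like(xs))                    # typically alarms shortly after the shift
```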

    Design and properties of the predictive ratio cusum (PRC) control charts

    In statistical process control/monitoring (SPC/M), memory-based control charts aim to detect small-to-medium persistent parameter shifts. When a Phase I calibration is not feasible, self-starting methods have been proposed, the predictive ratio cusum (PRC) being one of them. To apply such methods in practice, one needs to derive the decision-limit threshold that guarantees a preset false alarm tolerance, a very difficult task when the process parameters are unknown and their estimates are sequentially updated. Utilizing the Bayesian framework in PRC, we provide the theoretical framework for deriving a decision-making threshold, based on a false alarm tolerance, which along with the PRC closed-form monitoring scheme permits its straightforward application in real-life practice. An enhancement of PRC is proposed, and a simulation study evaluates its robustness against competitors under various types of model misspecification. Finally, three real data sets (normal, Poisson, and binomial) illustrate its implementation in practice. Technical details, algorithms, and R code reproducing the illustrations are provided as supplementary material.
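    Since the paper's closed-form threshold derivation is not reproduced here, the sketch below shows the brute-force alternative it improves upon: calibrating the decision limit h by Monte Carlo so that the in-control false alarm rate within n observations stays at or below a tolerance alpha. It reuses the hypothetical prc_like sketch above; all settings are illustrative.

```python
# Monte Carlo calibration of the decision limit h for the prc_like
# sketch defined earlier (illustrative, not the paper's derivation).
import numpy as np

def calibrate_h(n=50, alpha=0.05, reps=400, grid=np.arange(2.0, 10.0, 0.5)):
    rng = np.random.default_rng(7)
    runs = [rng.poisson(4, n) for _ in range(reps)]   # in-control streams
    for h in grid:                                     # smallest h meeting tolerance
        alarms = sum(prc_like(xs, h=h) is not None for xs in runs)
        if alarms / reps <= alpha:
            return h
    return grid[-1]

# reps is kept small for speed; increase it for a tighter estimate
print(calibrate_h())
```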

    Detecting abnormal behavior in lithography machines


    Statistical Methods for Semiconductor Manufacturing

    In this thesis, techniques for non-parametric modeling, machine learning, filtering and prediction, and run-to-run control for semiconductor manufacturing are described. In particular, algorithms have been developed for two major application areas:
    - Virtual Metrology (VM) systems;
    - Predictive Maintenance (PdM) systems.
    Both technologies have proliferated in recent years in semiconductor factories, called fabs, in order to increase productivity and decrease costs. VM systems aim at predicting quantities on the wafer, the main and basic product of the semiconductor industry, that may or may not be physically measurable. These quantities are usually costly to measure in economic or temporal terms; the prediction is instead based on process variables and/or logistic information on the production, which are always available and can be used for modeling without further costs. PdM systems, on the other hand, aim at predicting when a maintenance action has to be performed. This approach to maintenance management, based like VM on statistical methods and on the availability of process/logistic data, is in contrast with other classical approaches:
    - Run-to-Failure (R2F), where no interventions are performed on the machine/process until a breakage or specification violation occurs in production;
    - Preventive Maintenance (PvM), where maintenance is scheduled in advance based on time intervals or production iterations.
    Both aforementioned approaches are suboptimal, because they do not ensure that breakages and wafer waste will not happen and, in the case of PvM, they may lead to unnecessary maintenance without fully exploiting the lifetime of the machine or process. The main goal of this thesis is to prove, through several applications and feasibility studies, that the use of statistical modeling algorithms and control systems can improve the efficiency, yield, and profits of a manufacturing environment like the semiconductor one, where large amounts of data are recorded and can be employed to build mathematical models. We present several original contributions, both in the form of applications and of methods. The introduction of this thesis gives an overview of the semiconductor fabrication process: the most common practices in Advanced Process Control (APC) systems and the major issues for engineers and statisticians working in this area are presented. Furthermore, we illustrate the methods and mathematical models used in the applications. We then discuss in detail the following applications:
    - A VM system for estimating the thickness deposited on the wafer by the Chemical Vapor Deposition (CVD) process, which exploits Fault Detection and Classification (FDC) data. In this tool a new clustering algorithm based on Information Theory (IT) elements is proposed. In addition, the Least Angle Regression (LARS) algorithm is applied for the first time to VM problems.
    - A new VM module for a multi-step (CVD, Etching and Lithography) line, where Multi-Task Learning techniques are employed.
    - A new machine learning algorithm based on kernel methods for the estimation of scalar outputs from time-series inputs.
    - Run-to-run control algorithms that exploit both physical measurements and statistical ones (coming from a VM system); this tool is based on IT elements.
    - A PdM module based on filtering and prediction techniques (Kalman filter, Monte Carlo methods), developed for the prediction of maintenance interventions in the epitaxy process.
    - A PdM system based on Elastic Nets for maintenance prediction in an ion implantation tool.
    Several of the aforementioned works were developed in collaboration with major European semiconductor companies in the framework of the European project UE FP7 IMPROVE (Implementing Manufacturing science solutions to increase equiPment pROductiVity and fab pErformance); such collaborations are specified throughout the thesis, underlining the practical aspects of implementing the proposed technologies in a real industrial environment.
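    As a toy illustration of the VM idea (synthetic data and illustrative hyperparameters, not the thesis code), the sketch below fits LARS and an elastic net, two of the regression methods mentioned above, to predict a wafer quantity from always-available process variables.

```python
# Toy virtual-metrology sketch: predict a costly wafer measurement
# (e.g., deposited thickness) from process/FDC variables.
import numpy as np
from sklearn.linear_model import Lars, ElasticNet
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 30))                 # process/FDC variables
w = np.zeros(30)
w[:5] = [2.0, -1.5, 1.0, 0.5, -0.5]            # few informative sensors
y = X @ w + rng.normal(scale=0.3, size=400)    # "thickness" to predict

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
for model in (Lars(n_nonzero_coefs=5), ElasticNet(alpha=0.05, l1_ratio=0.7)):
    model.fit(X_tr, y_tr)
    print(type(model).__name__, round(model.score(X_te, y_te), 3))  # held-out R^2
```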

    Bayesian techniques for discrete stochastic volatility models

    This thesis was submitted for the degree of Master of Philosophy and awarded by Brunel University. Reliable volatility forecasts are needed in many areas of finance, be it option pricing, risk management or portfolio allocation. Mathematical models that capture temporal dependencies between the returns of financial assets and their volatility, and that could be used for volatility forecasting, generally fall into one of the following categories: historical volatility models, GARCH-type models and Stochastic Volatility (SV) models. This thesis focuses on the predictive ability of the discrete version of SV models. Six variants of discrete SV models are estimated: the classic SV model, the SV model with innovations following a Student's t distribution, the SV model with Gaussian innovations augmented with lag-one trading volume, the SV model with t-innovations augmented with lag-one trading volume, the SV model with Gaussian innovations augmented with lag-two trading volume, and the SV model with t-innovations augmented with lag-two trading volume. These models are compared on the basis of their ability to predict volatility. Our study shows that the SV model specification with a Student's t distribution with 3 degrees of freedom leads to a significant improvement in volatility forecasts, demonstrating good agreement with the empirical fact that financial returns have fat-tailed distributions. It is also shown that the influence of trading volume is very small compared with the impact of different distributional assumptions on the innovations. This work was supported by Saratov State University through the programme "Innovative University".
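    For reference, the classic discrete-time SV specification that the listed variants extend (replacing the Gaussian innovations with Student's t, or adding lagged trading volume to the log-volatility equation) can be written, in standard notation that may differ from the thesis's exact parameterisation, as:

```latex
y_t = e^{h_t/2}\,\varepsilon_t, \qquad
\varepsilon_t \sim \mathcal{N}(0,1)\ \text{or}\ t_\nu, \qquad
h_t = \mu + \phi\,(h_{t-1}-\mu) + \sigma_\eta\,\eta_t, \qquad
\eta_t \sim \mathcal{N}(0,1),
```

    where $y_t$ is the return, $h_t$ the latent log-volatility, $\phi$ its persistence, and $\sigma_\eta$ the volatility-of-volatility.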

    Bayesian temporal and spatio-temporal Markov switching models for the detection of influenza outbreaks

    Influenza is a disease which affects millions of people and causes hundreds of thousands of deaths every year, together with substantial direct and indirect costs. Influenza epidemics have a particular behavior that shapes the statistical methods for their detection. Seasonal epidemics happen virtually every year in the temperate parts of the globe during the cold months and extend throughout whole regions, countries and even continents. Besides the seasonal epidemics, nonseasonal epidemics can be observed at unexpected times, usually caused by strains that jump the barrier between animals and humans, as happened with the well-known swine flu epidemic, which caused great alarm in 2009. Several statistical methods have been proposed for the detection of outbreaks of diseases and, in particular, of influenza outbreaks. A reduced version of the review presented in this thesis was published in REVSTAT-Statistical Journal by Amorós et al. in 2015. An interesting tool for building statistical methods for the detection of influenza outbreaks is the use of Markov switching models, where latent variables are paired with the observations, indicating the epidemic or endemic phase. Two different models are applied to the data according to the value of the latent variable. The latent variables are temporally linked through a Markov chain. The observations are also conditionally dependent on their temporal or spatio-temporal neighbors. Models using this tool can offer, as an outcome, a probability of being in an epidemic instead of just a 'yes' or 'no'. The Bayesian paradigm offers an interesting framework where the outcomes can be interpreted as probability distributions. Also, inference can be done over complex hierarchical models, as Markov switching models usually are. This research offers two extensions of the model proposed by Martinez-Beneito et al. in 2008, published in Statistics in Medicine. The first proposal is a framework of Poisson Markov switching models over the counts. This proposal was published in Statistical Methods in Medical Research by Conesa et al. in 2015. In this proposal, the counts are modeled through a Poisson distribution, and the mean of these counts is related to the rates through the population. The rates are then modeled through a Normal distribution, whose mean and variance depend on whether each week is in the epidemic or non-epidemic phase. The latent variables which determine the epidemic phase are modeled through a hidden Markov chain. The mean and the variance in the epidemic phase are considered to be larger than those in the endemic phase. Different degrees of temporal dependency of the mean of the data can be defined. A first option is to consider the rates conditionally independent. A second option is to consider every observation conditionally dependent on the previous observation through an autoregressive process of order 1. Higher orders of dependency can be defined, but we limited our framework of models to an autoregressive process of order 2 to avoid unnecessary complexity, as no big changes in the outcome were appreciable using higher orders of autocorrelation. The application of this framework of methods to several databases showed that this proposal outperforms other methodologies present in the literature. It also stresses several difficulties in the process of evaluating statistical methods for the detection of influenza outbreaks.
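    Schematically, and in simplified notation that omits the paper's exact parameterisation, the first proposal has the hierarchical structure:

```latex
y_t \mid \lambda_t \sim \mathrm{Poisson}(n_t\,\lambda_t), \qquad
\lambda_t \mid z_t \sim \mathcal{N}\!\big(\mu_{z_t},\, \sigma^2_{z_t}\big), \qquad
P(z_t = j \mid z_{t-1} = i) = p_{ij},
```

    where $y_t$ are the weekly counts, $n_t$ the population, $z_t \in \{0,1\}$ the hidden endemic/epidemic indicator with $\mu_1 > \mu_0$ and $\sigma^2_1 > \sigma^2_0$, and the autoregressive variants make the mean of $\lambda_t$ depend additionally on $\lambda_{t-1}$ (and, for order 2, on $\lambda_{t-2}$).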
    The second proposal of this research is a spatio-temporal Markov switching model over the differenced rates, which are considered to follow a normal distribution with mean and variance parameters dependent on the epidemic state. The latent variables are modeled in the same way as in the temporal proposal, but with one conditionally independent hidden Markov chain for each of the locations. The variance of the endemic phase is also considered to be lower than that of the epidemic phase. Three components are defined for the mean of the differenced rates. First of all, a common term for all the regions at each time is set in both the endemic and the epidemic mean. These terms are defined as two random effects, with mean zero and a higher variance for the epidemic phase. The variances of these random effects are linked to those of the likelihood to avoid identifiability problems. An autoregressive term for each location is also defined for the epidemic term, as it is expected that from the beginning of the epidemic until the peak we observe similar positive jumps, and from the peak to the end of the epidemic we observe similar negative jumps. An intrinsic CAR structure is also defined for the epidemic mean, reflecting that the epidemic can spread to neighboring regions, which will show similar epidemic increases in the rates. This proposal has been applied to the United States Google Flu Trends data from 2007 to 2013 for the 48 spatially connected states plus Washington D.C. The comparison of the model with several simplifications and variations has stressed the necessity of several of the assumptions made during the modeling process.
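    In the same spirit, a schematic rendering (notation illustrative, not the published one) of the epidemic-phase mean of the differenced rate $\Delta r_{it}$ in region $i$ at week $t$, combining the three components described above, is:

```latex
\Delta r_{it} \mid z_{it} \sim \mathcal{N}\!\big(\mu_{it}^{(z_{it})},\, \sigma^2_{z_{it}}\big), \qquad
\mu_{it}^{(1)} = b_t + \rho\,\Delta r_{i,t-1} + \phi_{it},
```

    with $b_t$ the common temporal random effect, $\rho\,\Delta r_{i,t-1}$ the location-wise autoregressive term, and $\phi_{it}$ the intrinsic CAR spatial effect.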