
    Robust Linear Discriminant Analysis with Highest Breakdown Point Estimator

    Linear Discriminant Analysis (LDA) is a supervised classification technique concerned with the relationship between a categorical variable and a set of interrelated variables. The main objective of LDA is to create a rule to distinguish between populations and to allocate future observations to previously defined populations. LDA yields the optimal discriminant rule between two or more groups under the assumptions of normality and homoscedasticity. Nevertheless, the classical estimates, the sample mean and sample covariance matrix, are highly affected when these ideal conditions are violated. To abate these problems, a new robust LDA rule using high breakdown point estimators is proposed in this article. A winsorized approach is used to estimate the location measure, while the product of Spearman's rho and the rescaled median absolute deviation is used to estimate the scatter measure, replacing the sample mean and sample covariance matrix, respectively. A simulation and real data study were conducted to evaluate the performance of the proposed model, measured in terms of misclassification error rates. The computational results showed that the proposed LDA is always better than the classical LDA and comparable with existing robust LDAs.
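    As a rough illustration of the rule described above, the minimal Python sketch below plugs a coordinatewise winsorized mean and a Spearman's-rho-times-rescaled-MAD scatter estimate into the usual two-group linear discriminant rule. The trim proportion, the pooled-scatter weighting, and all function names are assumptions for illustration, not the authors' exact specification:

```python
import numpy as np
from scipy import stats

def winsorized_mean(X, trim=0.1):
    # Coordinatewise winsorized mean as the robust location estimate
    # (the 10% trim proportion is an assumed tuning constant).
    return np.array([stats.mstats.winsorize(X[:, j], limits=(trim, trim)).mean()
                     for j in range(X.shape[1])])

def rho_mad_scatter(X):
    # Scatter estimate S[i, j] = rho_ij * s_i * s_j, with rho the
    # Spearman correlation matrix and s the rescaled MAD (1.4826 * MAD).
    s = stats.median_abs_deviation(X, axis=0, scale="normal")
    rho = np.corrcoef(stats.rankdata(X, axis=0), rowvar=False)  # Spearman's rho
    return rho * np.outer(s, s)

def robust_ldr(X0, X1):
    # Plug the robust estimates into the usual two-group linear rule.
    m0, m1 = winsorized_mean(X0), winsorized_mean(X1)
    n0, n1 = len(X0), len(X1)
    S = ((n0 - 1) * rho_mad_scatter(X0) +
         (n1 - 1) * rho_mad_scatter(X1)) / (n0 + n1 - 2)
    w = np.linalg.solve(S, m0 - m1)
    c = 0.5 * w @ (m0 + m1)
    return lambda x: 0 if x @ w >= c else 1

rng = np.random.default_rng(1)
rule = robust_ldr(rng.normal(0, 1, (40, 3)), rng.normal(2, 1, (40, 3)))
print(rule(np.zeros(3)))  # allocates the origin to group 0
```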

    The Mahalanobis-Taguchi system based on statistical modeling

    Waseda University degree record number: Shin 7809. Waseda University.

    Bivariate modified Hotelling's T² charts using bootstrap data

    The conventional Hotelling's T² charts are evidently inefficient when the data are contaminated with outliers, and this study therefore proposes a novel, robust alternative Hotelling's T² chart. For the robust estimators, this approach uses the Hodges-Lehmann vector and its associated covariance matrix in place of the arithmetic mean vector and the sample covariance matrix, respectively. The proposed chart was examined performance-wise: simulated bivariate bootstrap datasets were used under two conditions, namely independent variables and dependent variables. The modified chart was then assessed in terms of its robustness by computing the probabilities of outlier detection and of false alarms. From the outcomes of these computations, the proposed charts demonstrated superiority over the conventional ones in all the cases tested.
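    A minimal sketch of the main ingredients follows, assuming a coordinatewise Hodges-Lehmann estimator (the median of Walsh averages); since the abstract's robust scale symbol was lost in extraction, the ordinary sample covariance stands in for it here, and the bootstrap supplies the control limit:

```python
import numpy as np
from itertools import combinations

def hodges_lehmann(x):
    # Median of the Walsh averages (x_i + x_j) / 2 over all pairs,
    # including each observation paired with itself.
    walsh = [(xi + xj) / 2 for xi, xj in combinations(x, 2)]
    return np.median(np.concatenate([np.asarray(x, float), walsh]))

def t2_values(X, center, cov_inv):
    # Hotelling-type statistic for each row of X.
    d = X - center
    return np.einsum("ij,jk,ik->i", d, cov_inv, d)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))                      # in-control reference data
center = np.array([hodges_lehmann(X[:, j]) for j in range(X.shape[1])])
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
# Bootstrap control limit: pool T2 values over resamples, take a quantile.
boot = np.concatenate([
    t2_values(X[rng.integers(0, len(X), len(X))], center, cov_inv)
    for _ in range(1000)])
ucl = np.quantile(boot, 0.99)
```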

    H-statistic with winsorized modified one-step M-estimator as central tendency measure

    The two-sample independent t-test and ANOVA are classical procedures widely used to test the equality of two groups and of more than two groups, respectively. However, these parametric procedures are easily affected by non-normality, and the effect becomes more pronounced when heterogeneity of variances and unbalanced group sizes exist. It is well known that violating the assumptions of these tests inflates the Type I error rate and decreases the power of the test. Nonparametric procedures like Mann-Whitney and Kruskal-Wallis may serve as alternatives to the parametric procedures; however, loss of information occurs due to the ranking of the data. To mitigate these problems, robust procedures can be used as another alternative. One such procedure is the H-statistic. When used with the modified one-step M-estimator (MOM), the test statistic (MOM-H) produces good control of the Type I error rate even under small sample sizes, but is inconsistent under certain conditions investigated. Furthermore, the power of the test is low, which might be due to the trimming process. In this study, MOM was winsorized (WMOM) to retain the original sample size. The H-statistic, when combined with WMOM as the central tendency measure (WMOM-H), shows better control of the Type I error rate than MOM-H, especially under balanced designs, regardless of the shape of the distribution. It also performs well under highly skewed and heavy-tailed distributions for unbalanced designs. On top of that, WMOM-H also generates better power values than MOM-H and ANOVA under most of the conditions investigated. WMOM-H also has better control of Type I error rates, with no liberal values (>0.075), compared to the parametric (t-test and ANOVA) and nonparametric (Mann-Whitney and Kruskal-Wallis) procedures. In general, this study demonstrates that the winsorization process (WMOM) is able to improve the performance of the H-statistic in terms of controlling the Type I error rate and increasing the power of the test.
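    The winsorization step can be sketched as follows. This is an illustrative reading of WMOM (observations flagged by the MADn criterion are replaced by the most extreme retained values rather than removed), not the authors' exact code; the full WMOM-H test additionally requires the H-statistic's bootstrap machinery, which is omitted here:

```python
import numpy as np

def wmom(x, k=2.24):
    # Winsorized modified one-step M-estimator: observations flagged as
    # outliers by the MADn criterion are not removed (as in MOM) but
    # replaced by the most extreme retained values, preserving the
    # original sample size. k = 2.24 is the usual MOM criterion constant.
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    madn = np.median(np.abs(x - med)) / 0.6745
    kept = x[np.abs(x - med) <= k * madn]
    return np.mean(np.clip(x, kept.min(), kept.max()))

print(wmom([2.1, 2.3, 2.2, 2.4, 9.9]))  # the outlier 9.9 is pulled in, not dropped
```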

    Economic perspective on algorithm selection for predictive maintenance

    The increasing availability of data and computing capacity drives optimization potential. In the industrial context, predictive maintenance is particularly promising, and various algorithms are available for its implementation. For the evaluation and selection of predictive maintenance algorithms, hitherto, statistical measures such as absolute and relative prediction errors have been considered. However, algorithm selection from a purely statistical perspective may not necessarily lead to the optimal economic outcome, as the two types of prediction errors (i.e., alpha errors ignoring system failures versus beta errors falsely indicating system failures) are negatively correlated, thus cannot be jointly optimized, and are associated with different costs. Therefore, we compare the prediction performance of three types of algorithms from an economic perspective, namely Artificial Neural Networks, Support Vector Machines, and Hotelling T² Control Charts. We show that translating the statistical measures into a single cost-based objective function allows optimizing the individual algorithm parametrization as well as making an unambiguous comparison among algorithms. In a real-life scenario of an industrial full-service provider, we derive cost advantages of more than 17% compared to an algorithm selection based on purely statistical measures. This work contributes to the theoretical and practical knowledge on predictive maintenance algorithms and supports predictive maintenance investment decisions.
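    The core idea of translating the two error types into a single cost-based objective can be sketched as follows; the cost parameters and the threshold search are hypothetical illustrations, not the paper's calibration:

```python
import numpy as np

def maintenance_cost(y_true, y_pred, c_alpha, c_beta):
    # Single economic objective: alpha errors (missed failures) and beta
    # errors (false alarms) are priced separately and summed; c_alpha and
    # c_beta are hypothetical per-event costs.
    missed = np.sum((y_true == 1) & (y_pred == 0))
    false_alarms = np.sum((y_true == 0) & (y_pred == 1))
    return missed * c_alpha + false_alarms * c_beta

def best_threshold(scores, y_true, c_alpha, c_beta):
    # Parametrize an algorithm (here, a decision threshold on its anomaly
    # score) by minimizing cost rather than a purely statistical measure.
    return min(np.unique(scores),
               key=lambda t: maintenance_cost(y_true,
                                              (scores >= t).astype(int),
                                              c_alpha, c_beta))
```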

    Model-based performance monitoring of batch processes

    The use of batch processes is widespread across the manufacturing industries, dominating sectors such as pharmaceuticals, speciality chemicals and biochemicals. The main goal in batch production is to manufacture consistent, high quality batches with minimum rework or spoilage, and also to achieve optimum energy and feedstock usage. A common approach to monitoring a batch process to achieve this goal is to use a recipe-driven approach coupled with off-line laboratory analysis of the product. However, the large amount of data generated during batch manufacture means that it is possible to monitor batch processes using a statistical model. Traditional multivariate statistical techniques such as principal component analysis and partial least squares were originally developed for use on continuous processes, which means they are less able to cope with the non-linear and dynamic behaviours inherent in a batch process without being adapted. Several approaches to dealing with batch behaviour in a multivariate framework have been proposed, including multi-way principal component analysis. A more advanced approach designed to handle the typical characteristics of batch data is model-based principal component analysis. It comprises a mechanistic model combined with a multivariate statistical technique. More specifically, the technique uses a mechanistic model of the process to generate a set of residuals from the measured process variables. The theory is that the non-linear behaviour and the serial correlation in the process will be captured by the model, leaving a set of unstructured residuals to which principal component analysis (PCA) can be applied. This approach is benchmarked against more standard approaches, including multi-way principal component analysis and batch observation level analysis. One limitation identified in the model-based approach is that if the mechanistic model of the process is of reduced complexity, the monitoring and fault detection abilities of the technique will be compromised. To address this issue, the model-based PCA technique has been extended to incorporate an additional error model which captures the differences between the mechanistic model and the process. This approach has been termed super model-based PCA (SMBPCA). A number of different error models are considered, including partial least squares (linear, non-linear and dynamic), an autoregressive with exogenous variables (ARX) model, and dynamic canonical correlation analysis. Through the use of an exothermic batch reactor simulation, the SMBPCA approach has been investigated with respect to fault detection and capturing the non-linear and dynamic behaviour of the batch process. The robustness of the technique for application in an industrial situation is also discussed.
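    A minimal sketch of the model-based PCA step described above: subtract the mechanistic model's predictions from the measurements and apply PCA to the residuals, monitoring Hotelling's T² and the squared prediction error. The scaling choice and component count are assumptions for illustration:

```python
import numpy as np

def mbpca(measured, predicted, n_pc=2):
    # Model-based PCA: subtract the mechanistic model's predictions and
    # apply PCA to the residuals, which should be largely free of the
    # non-linear and serially correlated structure of the raw data.
    R = measured - predicted
    R = (R - R.mean(axis=0)) / R.std(axis=0)     # autoscale the residuals
    _, s, Vt = np.linalg.svd(R, full_matrices=False)
    P = Vt[:n_pc].T                              # loadings
    T = R @ P                                    # scores
    lam = s[:n_pc] ** 2 / (len(R) - 1)           # component variances
    t2 = np.sum(T ** 2 / lam, axis=1)            # Hotelling's T2
    spe = np.sum((R - T @ P.T) ** 2, axis=1)     # squared prediction error
    return t2, spe
```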

    Sampling policies in statistical quality control

    A thesis submitted in partial fulfillment of the requirements for the degree of Doctor in Information Management, specialization in Statistics and Econometrics. This dissertation presents and critically studies two new adaptive sampling methods and a new performance measure for sampling methods in the context of statistical quality control. Taking as a basis a Shewhart-type control chart for the mean, we study their statistical properties and carry out comparative studies of their statistical performance against some of the most frequently cited methods in the literature. First, we develop a new adaptive sampling method in which the intervals between samples are obtained from the density function of the standard Laplace distribution. This method proves particularly efficient in detecting moderate and large shifts in the mean, is not very sensitive to the limitation on the smallest sampling interval, and is robust across different non-normality scenarios for the quality characteristic. In certain situations, this method is always more efficient than the method with adaptive sampling intervals, fixed sample sizes and fixed control limit coefficients. Building on the sampling method just described, and on a method in which the sampling intervals are defined before process monitoring begins based on the cumulative hazard rate of the system, we then present a new sampling method that combines predefined intervals with adaptive intervals. In this method, the sampling instants are defined as the weighted average of the instants of the two methods, assigning greater weight to the adaptive method for moderate shifts (where the predefined method is less effective) and greater weight to the predefined method in the remaining cases (where the adaptive method is less effective). In this way, the sampling instants, initially scheduled according to the expected occurrence of a shift based on the lifetime distribution of the system, are adapted according to the value of the sample statistic computed at the previous instant. This method is always more efficient than the classical periodic method, which is not true of any other adaptive scheme, and more efficient than the VSI sampling method for some sampling pairs, positioning it as a strong alternative to the sampling procedures found in the literature. Finally, we present a new performance measure for sampling methods. Assuming that two methods under comparison have the same mean time of malfunction, their performance is compared through the average number of samples collected while in control. Taking into account the lifetime of the system, with different hazard rates, this measure proves robust and allows, in an economic context, better control of costs per unit of time.
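    One plausible reading of the Laplace-based interval rule, sketched under stated assumptions (the proportionality constant and interval bounds are illustrative; the dissertation's exact formulation is not reproduced here):

```python
import numpy as np

def next_sampling_interval(z, d_max=2.0, d_min=0.1):
    # Interval until the next sample, proportional to the standard
    # Laplace density exp(-sqrt(2)|z|) evaluated at the current
    # standardized sample mean z: samples are taken sooner as the
    # statistic drifts towards the control limits. d_max and d_min are
    # illustrative bounds on the sampling interval.
    return max(d_max * np.exp(-np.sqrt(2) * abs(z)), d_min)

print(next_sampling_interval(0.2))   # near target: long interval
print(next_sampling_interval(2.5))   # near a limit: short interval
```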

    Multivariate statistical control of Poisson-type discrete variables

    In some cases, when the number of defects in a production process has to be controlled, the Poisson distribution is used to model the frequency of these defects and to develop a control chart. This work analyses the control of p > 1 Poisson quality characteristics. When such control is needed, there are two main approaches: (1) one chart for each Poisson variable, the multiple scheme; or (2) a single chart for all the variables, the multivariate scheme. This work develops a new multivariate control chart based on a linear combination of the Poisson variables, where the linear combination is optimized in order to maintain a desired in-control ARL and to minimize the out-of-control ARL. The optimization is carried out using Windows© software, which also performs a performance comparison between this chart and other schemes for monitoring a set of Poisson variables. The other schemes include the sum of the variables (MP chart) and an optimized set of univariate Poisson charts (the multiple scheme). García Bustos, S. L. (2014). Control Multivariante Estadístico de Variables Discretas tipo Poisson [unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/40592
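    The chart's design problem can be illustrated with a small Monte Carlo ARL estimator for a given weight vector and control limit; the weights, rates and limit below are illustrative, and the optimization described in the thesis would wrap a search around such an estimator:

```python
import numpy as np

def in_control_arl(w, lam, ucl, n_sim=100_000, seed=0):
    # Monte Carlo ARL for a chart on the linear combination w'X of
    # independent Poisson counts: with independent sampling periods the
    # ARL is 1 / p, where p is the per-period signal probability.
    rng = np.random.default_rng(seed)
    X = rng.poisson(lam, size=(n_sim, len(lam)))
    p = np.mean(X @ w > ucl)
    return np.inf if p == 0 else 1.0 / p

# Illustrative weights, rates and limit; the thesis optimizes the weights
# and limit to fix the in-control ARL while minimizing out-of-control ARLs.
print(in_control_arl(np.array([1.0, 0.8]), np.array([4.0, 2.0]), ucl=14.0))
```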

    Robust linear discriminant rules with coordinatewise and distance based approaches

    Linear discriminant analysis (LDA) is a supervised classification technique for dealing with the relationship between a categorical variable and a set of continuous variables. The main objective of LDA is to create a function that distinguishes between groups and allocates future observations to previously defined groups. Under the assumptions of normality and homoscedasticity, LDA yields the optimal linear discriminant rule (LDR) between two or more groups. However, the optimality of LDA relies heavily on the sample mean and sample covariance matrix, which are known to be sensitive to outliers. To abate these problems, robust location and scale estimators via coordinatewise and distance-based approaches were applied in constructing a new robust LDA. These robust estimators replace the classical sample mean and sample covariance matrix to form robust linear discriminant rules (RLDR). A total of six RLDR, namely four coordinatewise (RLDRM, RLDRMw, RLDRW, RLDRWw) and two distance-based (RLDRV, RLDRT) approaches, are proposed and implemented in this study. A simulation and real data study were conducted to investigate the performance of the proposed RLDR, measured in terms of misclassification error rates and computational time. Several data conditions, such as non-normality, heteroscedasticity, and balanced and unbalanced data sets, were manipulated in the simulation study to evaluate the performance of the proposed RLDR. In the real data study, a set of diabetes data was used; this data set violates the assumptions of normality as well as homoscedasticity. The results showed that the novel RLDRV is the best of the proposed RLDR for solving the classification problem, since it provides as much as 91.03% classification accuracy in the real data study. The proposed RLDR are good alternatives to the classical LDR as well as existing RLDR, since they perform well in classification problems even under contaminated data.
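    Since performance is measured by misclassification error rates, a small evaluation helper of the following kind could be used with any discriminant rule, such as the robust rule sketched after the first abstract in this listing; this is an illustrative sketch, not the authors' code:

```python
import numpy as np

def misclassification_rate(rule, X, y):
    # Fraction of observations that a discriminant rule assigns to the
    # wrong group; `rule` is any callable mapping an observation to a
    # group label.
    preds = np.array([rule(x) for x in X])
    return float(np.mean(preds != y))
```

    Evaluating on a holdout set rather than the training data gives the honest error rate that the simulation and real data studies report.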