2,389 research outputs found

    On monitoring of multiple non-linear profiles

    Most state-of-the-art profile monitoring methods involve studies of one profile. However, a process may contain several sensors or probes that generate multiple profiles over time. Quality characteristics presented in multiple profiles may be related to multiple aspects of product or process quality. Existing charting methods that monitor each of the multiple profiles simultaneously may result in high false alarm rates. Worse, they cannot correctly detect potential changes in the relationships among profiles. In this study, we propose two approaches to detect process shifts in multiple non-linear profiles. A simulation study was conducted to evaluate the performance of the proposed approaches in terms of average run length under different process shift scenarios. Pros and cons of the proposed methods are discussed. A guideline for choosing between the proposed methods is introduced. In addition, a hybrid method combining the salient points of both approaches is explored. Finally, a real-world data-set from a vulcanisation process is used to demonstrate the implementation of the proposed methods.
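    The abstract above evaluates charts by average run length (ARL), the expected number of samples before a chart signals. As a minimal sketch of that metric only, and not of the two proposed profile methods, the block below computes the exact ARL of a plain two-sided Shewhart chart on independent standard-normal observations. The function names, the 3-sigma limit, and the shift sizes are illustrative assumptions.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def shewhart_arl(shift: float, limit: float = 3.0) -> float:
    """ARL of a two-sided Shewhart chart (hypothetical helper).

    Observations are assumed independent N(shift, 1); the chart signals
    when an observation falls outside [-limit, +limit].
    """
    # Probability that a single observation stays inside the limits.
    p_inside = normal_cdf(limit - shift) - normal_cdf(-limit - shift)
    # The run length is geometric, so ARL = 1 / P(signal).
    return 1.0 / (1.0 - p_inside)


if __name__ == "__main__":
    # Assumed shift scenarios, in units of the process standard deviation.
    for shift in (0.0, 0.5, 1.0, 2.0):
        print(f"shift = {shift:.1f} sigma -> ARL = {shewhart_arl(shift):.1f}")
```

    A large in-control ARL (shift 0) corresponds to a low false alarm rate, while a small out-of-control ARL corresponds to fast detection; simulation studies typically compare methods at a matched in-control ARL.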

    A DATA ANALYTICAL FRAMEWORK FOR IMPROVING REAL-TIME, DECISION SUPPORT SYSTEMS IN HEALTHCARE

    In this dissertation we develop a framework that combines data mining, statistics and operations research methods for improving real-time decision support systems in healthcare. Our approach consists of three main concepts: data gathering and preprocessing, modeling, and deployment. We introduce the notion of offline and semi-offline modeling to differentiate between models that are based on known baseline behavior and those based on a baseline with missing information. We apply and illustrate the framework in two important healthcare contexts: biosurveillance and kidney allocation. In the biosurveillance context, we address the problem of early detection of disease outbreaks. We discuss integer programming-based univariate monitoring and statistical and operations research-based multivariate monitoring approaches. We assess method performance on authentic biosurveillance data. In the kidney allocation context, we present a two-phase model that combines an integer programming-based learning phase and a data-analytical real-time phase. We examine and evaluate our method on the current Organ Procurement and Transplantation Network (OPTN) waiting list. In both contexts, we show that our framework produces significant improvements over existing methods.

    Integrated Projection and Regression Models for Monitoring Multivariate Autocorrelated Cascade Processes

    This dissertation presents a comprehensive dual-monitoring methodology for multivariate autocorrelated cascade processes using principal component analysis and regression. Principal component analysis is used to alleviate the multicollinearity among input process variables and to reduce their dimension. An integrated principal component selection rule is proposed to reduce the number of input variables. An autoregressive time series model is imposed on the time-correlated output variable, which depends on many mutually correlated process input variables. A generalized least squares principal component regression is used to describe the relationship between product and process variables under the autoregressive regression error model. A combined residual-based EWMA control chart, applied to the product characteristics, and MEWMA control charts, applied to the multivariate autocorrelated cascade process characteristics, are proposed. The dual EWMA and MEWMA control chart has an advantage over the conventional residual-type control chart applied to the residuals of the principal component regression because it monitors both the product and the process characteristics simultaneously. The EWMA control chart is used to increase the detection performance, especially in the case of small mean shifts. The MEWMA is applied to the set of variables selected from the first principal component with the aim of increasing the sensitivity in detecting process failures. The dual control chart for product and process characteristics enhances both the detection and the prediction performance of the monitoring system for multivariate autocorrelated cascade processes. The proposed methodology is demonstrated through an example of the sugar-beet pulp drying process. A general guideline for controlling multivariate autocorrelated processes is also developed.
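    The abstract above relies on the EWMA chart's sensitivity to small mean shifts. The sketch below shows the textbook univariate EWMA chart with time-varying control limits, applied to a sequence such as regression residuals. It is not the dissertation's combined residual-based chart, and the smoothing constant `lam`, the limit width `width`, and the function names are illustrative assumptions.

```python
import math
from typing import List, Optional, Sequence, Tuple


def ewma_chart(
    xs: Sequence[float],
    target: float = 0.0,
    sigma: float = 1.0,
    lam: float = 0.2,    # assumed smoothing constant
    width: float = 3.0,  # assumed limit width in sigma units
) -> List[Tuple[float, bool]]:
    """Return (EWMA statistic, signal flag) for each observation."""
    z = target  # the EWMA starts at the in-control target
    results = []
    for t, x in enumerate(xs, start=1):
        # z_t = lam * x_t + (1 - lam) * z_{t-1}
        z = lam * x + (1.0 - lam) * z
        # Exact variance factor of z_t for independent observations.
        var_factor = lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t))
        half_width = width * sigma * math.sqrt(var_factor)
        results.append((z, abs(z - target) > half_width))
    return results


def first_signal(xs: Sequence[float], **kwargs) -> Optional[int]:
    """1-based index of the first out-of-control signal, or None."""
    for t, (_, signalled) in enumerate(ewma_chart(xs, **kwargs), start=1):
        if signalled:
            return t
    return None


if __name__ == "__main__":
    # A sustained 2-sigma shift: no single point exceeds 3 sigma,
    # yet the accumulated EWMA statistic crosses its limit.
    print(first_signal([2.0] * 10))
```

    Because the statistic accumulates evidence across samples, a sustained shift smaller than the 3-sigma limit can still trigger a signal, which a chart based on individual points alone would miss.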

    Novel chemometric proposals for advanced multivariate data analysis, processing and interpretation

    The present Ph.D. thesis, primarily conceived to support and reinforce the relation between academic and industrial worlds, was developed in collaboration with Shell Global Solutions (Amsterdam, The Netherlands) in the endeavour of applying and possibly extending well-established latent variable-based approaches (i.e. Principal Component Analysis - PCA - Partial Least Squares regression - PLS - or Partial Least Squares Discriminant Analysis - PLSDA) for complex problem solving not only in the fields of manufacturing troubleshooting and optimisation, but also in the wider environment of multivariate data analysis. To this end, novel efficient algorithmic solutions are proposed throughout all chapters to address very disparate tasks, from calibration transfer in spectroscopy to real-time modelling of streaming flows of data. The manuscript is divided into the following six parts, focused on various topics of interest: Part I - Preface, where an overview of this research work, its main aims and justification is given together with a brief introduction on PCA, PLS and PLSDA; Part II - On kernel-based extensions of PCA, PLS and PLSDA, where the potential of kernel techniques, possibly coupled to specific variants of the recently rediscovered pseudo-sample projection, formulated by the English statistician John C. 
Gower, is explored and their performance compared to that of more classical methodologies in four different application scenarios: segmentation of Red-Green-Blue (RGB) images, discrimination of on-/off-specification batch runs, monitoring of batch processes and analysis of mixture designs of experiments; Part III - On the selection of the number of factors in PCA by permutation testing, where an extensive guideline on how to accomplish the selection of PCA components by permutation testing is provided through the comprehensive illustration of an original algorithmic procedure implemented for such a purpose; Part IV - On modelling common and distinctive sources of variability in multi-set data analysis, where several practical aspects of two-block common and distinctive component analysis (carried out by methods like Simultaneous Component Analysis - SCA - DIStinctive and COmmon Simultaneous Component Analysis - DISCO-SCA - Adapted Generalised Singular Value Decomposition - Adapted GSVD - ECO-POWER, Canonical Correlation Analysis - CCA - and 2-block Orthogonal Projections to Latent Structures - O2PLS) are discussed, a new computational strategy for determining the number of common factors underlying two data matrices sharing the same row- or column-dimension is described, and two innovative approaches for calibration transfer between near-infrared spectrometers are presented; Part V - On the on-the-fly processing and modelling of continuous high-dimensional data streams, where a novel software system for rational handling of multi-channel measurements recorded in real time, the On-The-Fly Processing (OTFP) tool, is designed; Part VI - Epilogue, where final conclusions are drawn, future perspectives are delineated, and annexes are included.
    Vitale, R. (2017). Novel chemometric proposals for advanced multivariate data analysis, processing and interpretation [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/90442

    Estimating change point in multivariate processes via simultaneous mean vector and covariance matrix

    In many industrial processes, several quality characteristics are inevitably related. In this situation, the mean vector and covariance matrix must be simultaneously monitored and controlled to determine whether a multivariate process is in control. As the number of variables increases, the performance of control charts is significantly reduced, and the delay between the actual time of change in the process and the warning time of the control chart increases, which is one of the main challenges when using multivariate control charts. The gap between the warning time and the actual change time (called the change point, CP), especially when the mean vector and the covariance matrix are monitored and controlled simultaneously, causes problems such as delays or stoppages of production lines or services, as well as inconsistent production of products or services. To address this, a new way of estimating the CP will help statistical process control (SPC) professionals identify the cause(s) of out-of-control (OC) conditions, thus providing better feedback for process improvement. This study presents a new method based on an artificial neural network (ANN), which first examines the OC conditions for a multivariate process using the multivariate exponentially weighted moving average (MEWMA) and multivariate exponentially weighted mean square (MEWMS) control charts. Then, the ANN-fitting method is used to diagnose the cause(s) of OC conditions using a machine learning (ML) classifier and to estimate the length of the delay time. Finally, the CP is estimated by integrating all these methods. The performance of the new approach was validated by comparing it with the results from another study and by evaluating its accuracy and precision. In conclusion, the MEWMS chart was the best for detecting the OC condition, while the support vector machine (SVM) Gaussian model was the best for diagnosing the cause(s) of the OC condition. The proposed model estimated the change point to within one sample with a probability of 99% over 10,000 simulated test cases, which makes it an accurate and reliable model for practical use.

    funcharts: Control charts for multivariate functional data in R

    Modern statistical process monitoring (SPM) applications focus on profile monitoring, i.e., the monitoring of process quality characteristics that can be modeled as profiles, also known as functional data. Despite the large interest in the profile monitoring literature, there is still a lack of software to facilitate its practical application. This article introduces the funcharts R package that implements recent developments on the SPM of multivariate functional quality characteristics, possibly adjusted for the influence of additional variables, referred to as covariates. The package also implements the real-time version of all control charting procedures to monitor profiles partially observed up to an intermediate domain point. The package is illustrated both through its built-in data generator and a real-case study on the SPM of Ro-Pax ship CO2 emissions during navigation, which is based on the ShipNavigation data provided in the Supplementary Material.

    Univariate and multivariate linear profiles using max type extended exponentially weighted moving average schemes

    Many studies have shown that industrial as well as non-industrial business organizations have a growing need for robust and more efficient multivariate monitoring schemes in order to monitor several quality characteristics simultaneously. To monitor two or more parameters simultaneously, in most cases several monitoring schemes are used concurrently instead of a single scheme. Thus, in this paper, the exponentially weighted moving average (EWMA), double EWMA (DEWMA) and the recent triple EWMA (TEWMA) procedures are used to develop new single univariate and multivariate Max-type monitoring schemes for linear profiles, under the assumptions of fixed and random linear models, to monitor the regression parameters and the error variance simultaneously. It is observed that the newly proposed schemes are better alternatives to the classical univariate and multivariate EWMA, DEWMA and TEWMA schemes for linear profiles in terms of the average run-length (ARL) and expected ARL profiles. Numerical examples are presented using simulated and real-life data. https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639