
    Parametric, Nonparametric, and Semiparametric Linear Regression in Classical and Bayesian Statistical Quality Control

    Statistical process control (SPC) is used in many fields to understand and monitor desired processes, such as manufacturing, public health, and network traffic. SPC is categorized into two phases: in Phase I, historical data are used to inform parameter estimates for a statistical model, and in Phase II, this statistical model is implemented to monitor a live ongoing process. Within both phases, profile monitoring is a method to understand the functional relationship between response and explanatory variables by estimating and tracking its parameters. In profile monitoring, control charts are often used as graphical tools to visually observe process behaviors. We construct a practitioner’s guide to provide a step-by-step application for parametric, nonparametric, and semiparametric methods in profile monitoring, creating an in-depth guideline for novice practitioners. We then consider the commonly used cumulative sum (CUSUM), multivariate CUSUM (mCUSUM), exponentially weighted moving average (EWMA), and multivariate EWMA (mEWMA) charts under a Bayesian framework for monitoring respiratory disease related hospitalizations and global suicide rates with parametric, nonparametric, and semiparametric linear models.
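
As a rough illustration of the EWMA chart named in the abstract above, the following is a minimal sketch of the classical (non-Bayesian) univariate version, not the thesis's Bayesian procedure. The function name `ewma_chart`, the smoothing weight `lam`, and the limit multiplier `L` are assumptions chosen for the example; `mu0` and `sigma` stand in for the Phase I estimates of the in-control mean and standard deviation.

```python
from math import sqrt


def ewma_chart(xs, mu0, sigma, lam=0.2, L=3.0):
    """Classical EWMA chart with time-varying control limits.

    xs    : Phase II observations, in time order
    mu0   : in-control mean (assumed to come from Phase I)
    sigma : in-control standard deviation (assumed to come from Phase I)
    lam   : smoothing weight in (0, 1]; hypothetical default
    L     : control-limit width in standard errors; hypothetical default

    Returns one (z, lcl, ucl, signal) tuple per observation.
    """
    z = mu0  # the EWMA statistic starts at the in-control mean
    rows = []
    for t, x in enumerate(xs, start=1):
        # Update: weighted average of the new point and the previous statistic.
        z = lam * x + (1 - lam) * z
        # Exact standard error of z at time t, scaled by L.
        half_width = L * sigma * sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * t)))
        lcl, ucl = mu0 - half_width, mu0 + half_width
        rows.append((z, lcl, ucl, not (lcl <= z <= ucl)))
    return rows
```

Calling `ewma_chart(observations, mu0, sigma)` on a Phase II series and scanning for the first tuple whose last element is `True` gives the first out-of-control signal. A smaller `lam` makes the chart more sensitive to small sustained shifts at the cost of slower reaction to large ones.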

    Statistical Methodologies of Functional Data Analysis for Industrial Applications

    This thesis stands as one of the first attempts to connect the statistical object oriented data analysis (OODA) methodologies with the industry field. Indeed, the aim of this thesis is to develop statistical methods to tackle industrial problems through the paradigm of OODA. The new framework of Industry 4.0 requires factories that are equipped with sensors and advanced acquisition systems that acquire data with a high degree of complexity. OODA can be particularly suitable to deal with this increasing complexity as it considers each statistical unit as an atom or a data object assumed to be a point in a well-defined mathematical space. This idea allows one to deal with complex data structures by changing the resolution of the analysis. Indeed, from standard methods where the atom is represented by a vector of numbers, the focus now is on methodologies where the objects of the analysis are whole complex objects. In particular, this thesis focuses on functional data analysis (FDA), a branch of OODA that considers functions defined on compact domains as the atom of the analysis. The cross-fertilization of FDA methods to industrial applications is developed in three parts in this dissertation. The first part presents methodologies developed to solve specific applied problems. In particular, a first substantial portion of this part is focused on profile monitoring methods applied to ship CO₂ emissions. A second portion deals with the problem of predicting the mechanical properties of an additively manufactured artifact given the particle size distribution of the powder used for its production. A third portion addresses cluster analysis for the quality assessment of metal sheet spot welds in the automotive industry based on observations of dynamic resistance curves.
Stimulated by these challenges, the second part of this dissertation turns towards a more methodological line that addresses the notion of interpretability for functional data. In particular, two new interpretable estimators of the coefficient function of the function-on-function linear regression model are proposed, which are named S-LASSO and AdaSS, respectively. Moreover, a new method, referred to as SaS-Funclust, is presented for sparse clustering of functional data that aims to classify a sample of curves into homogeneous groups while jointly detecting the most informative portions of the domain. In the last part, two ongoing research projects on FDA methods for industrial applications are presented. In particular, the first one concerns the definition of a new robust nonparametric functional ANOVA method (Ro-FANOVA) to test differences among group functional means while being robust against the presence of outliers, with an application to additive manufacturing. The second one sketches a new methodological framework for real-time profile monitoring.

    An OLS-Based Method for Causal Inference in Observational Studies

    Indiana University-Purdue University Indianapolis (IUPUI). Observational data are frequently used for causal inference of treatment effects on prespecified outcomes. Several widely used causal inference methods have adopted the method of inverse propensity score weighting (IPW) to alleviate the influence of confounding. However, the IPW-type methods, including the doubly robust methods, are prone to large variation in the estimation of causal effects due to possible extreme weights. In this research, we developed an ordinary least-squares (OLS)-based causal inference method, which does not involve the inverse weighting of the individual propensity scores. We first considered the scenario of homogeneous treatment effect. We proposed a two-stage estimation procedure, which leads to a model-free estimator of average treatment effect (ATE). At the first stage, two summary scores, the propensity and mean scores, are estimated nonparametrically using regression splines. The targeted ATE is obtained as a plug-in estimator that has a closed form expression. Our simulation studies showed that this model-free estimator of ATE is consistent, asymptotically normal and has superior operational characteristics in comparison to the widely used IPW-type methods. We then extended our method to the scenario of heterogeneous treatment effects, by adding an additional stage of modeling the covariate-specific treatment effect function nonparametrically while maintaining the model-free feature and the simplicity of OLS-based estimation. The estimated covariate-specific function serves as an intermediate step in the estimation of ATE and thus can be utilized to study the treatment effect heterogeneity. We discussed ways of using advanced machine learning techniques in the proposed method to accommodate high dimensional covariates.
We applied the proposed method to a case study evaluating the effect of early combination of biologic and non-biologic disease-modifying antirheumatic drugs (DMARDs) compared to a step-up treatment plan in children with newly onset juvenile idiopathic arthritis (JIA). The proposed method gives strong evidence of a significant effect of early combination at the 0.05 level. On average, early aggressive use of biologic DMARDs leads to around 1.2 to 1.7 more reduction in clinical juvenile disease activity score at 6 months than the step-up plan for treating JIA.
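
The abstract's concern about extreme weights can be made concrete with a small sketch of the standard IPW estimator of the ATE. This is the textbook baseline the abstract compares against, not the proposed OLS-based method; the function names `ipw_weights` and `ipw_ate` are hypothetical, and the propensity scores `e` are assumed to be already estimated and strictly between 0 and 1.

```python
def ipw_weights(t, e):
    """Inverse propensity weights: 1/e for treated units, 1/(1-e) for controls.

    t : treatment indicators (1 = treated, 0 = control)
    e : estimated propensity scores, each strictly between 0 and 1
    """
    return [1.0 / ei if ti == 1 else 1.0 / (1.0 - ei) for ti, ei in zip(t, e)]


def ipw_ate(y, t, e):
    """Horvitz-Thompson style IPW estimate of the average treatment effect.

    Averages weighted treated outcomes minus weighted control outcomes
    over all n units.
    """
    n = len(y)
    treated = sum(ti * yi / ei for yi, ti, ei in zip(y, t, e))
    control = sum((1 - ti) * yi / (1.0 - ei) for yi, ti, ei in zip(y, t, e))
    return (treated - control) / n
```

A treated unit with a propensity score near 0 (or a control near 1) receives a very large weight, so a single observation can dominate the estimate. Inspecting `max(ipw_weights(t, e))` before trusting `ipw_ate` is one simple diagnostic for the instability the abstract describes.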

    Methods for non-proportional hazards in clinical trials: A systematic review

    For the analysis of time-to-event data, frequently used methods such as the log-rank test or the Cox proportional hazards model are based on the proportional hazards assumption, which is often debatable. Although a wide range of parametric and non-parametric methods for non-proportional hazards (NPH) has been proposed, there is no consensus on the best approaches. To close this gap, we conducted a systematic literature search to identify statistical methods and software appropriate under NPH. Our literature search identified 907 abstracts, out of which we included 211 articles, mostly methodological ones. Review articles and applications were less frequently identified. The articles discuss effect measures, effect estimation and regression approaches, hypothesis tests, and sample size calculation approaches, which are often tailored to specific NPH situations. Using a unified notation, we provide an overview of methods available. Furthermore, we derive some guidance from the identified articles. We summarized the contents from the literature review in a concise way in the main text and provide more detailed explanations in the supplement (page 29)

    Studies on semiparametric spatial regression models

    In this thesis, I study estimation and inference for semiparametric spatial regression models and generalized geoadditive models (GgAMs). I use the bivariate penalized spline over triangulation (BPST) method in these models to incorporate the spatial information when it is available. There are three topics in the thesis. In the first topic, we develop a sparse-partially linear spatial regression model (S-PLSM) using a doubly penalized estimator to select and estimate the most significant linear covariates. We apply BPST to approximate a bivariate function over a spatial domain. A standard error formula is constructed to estimate the standard deviation of the estimators, which is tested by simulation studies. We show the consistency of our sparse estimator with asymptotic normality. An application to United States mortality illustrates improvements in estimation and prediction from the use of our estimator relative to other methods. In the second topic, a generalized version of PLSM (GPLSM) is developed to allow a nonlinear link function relating the covariates to the mean of the response variables. This extension allows our method to deal with non-continuous response variables, such as count and binary variables. The iteratively reweighted least squares (IRLS) algorithm helps to achieve the computational efficiency of our estimator. The consistency of the proposed estimator is proved with a convergence rate. A standard error formula is developed to construct confidence intervals for the linear estimator. A real data analysis of crash frequency demonstrates the accuracy in estimation and prediction for GPLSM. In the last topic, I build an R package, GgAM, which integrates the model structure identification process, estimation methods, and statistical inference tools of GgAMs. We develop a semiparametric version of GgAM by adding a linear part to nonparametric GgAMs.
This model shares the benefits of univariate splines, bivariate splines and local polynomials. A penalized quasi-likelihood estimator is first derived through the IRLS algorithm and then a spline-backfitted local polynomial estimator is obtained. We propose a standard error formula for the parametric estimator in the model as well. Simultaneous confidence bands are developed to measure the accuracy of the univariate spline estimators. A model structure identification process is included before model fitting to better identify the functional form (linearity/nonlinearity) of the continuous covariates. Simulation studies are conducted to show the estimation accuracy and predictive power of our GgAM. The datasets of Georgia education attainment, Sydney housing prices, and Florida crash frequency are included to show the convenient and flexible uses of functions in the GgAM package. In this thesis, I aim to develop computational algorithms to get accurate estimators and propose efficient inference tools to better interpret the results for GgAMs. These tools can be widely used in social, economic, and geographic applications with spatial data to draw perceptive conclusions.

    Proceedings of the 35th International Workshop on Statistical Modelling : July 20- 24, 2020 Bilbao, Basque Country, Spain

    466 p. The International Workshop on Statistical Modelling (IWSM) is a reference workshop in promoting statistical modelling and applications of Statistics for researchers, academics and industrialists in a broad sense. Unfortunately, the global COVID-19 pandemic has not allowed holding the 35th edition of the IWSM in Bilbao in July 2020. Despite the situation and following the spirit of the Workshop and the Statistical Modelling Society, we are delighted to bring you the proceedings book of extended abstracts.

    MODELING OF QUALITY PROFILE DATA WITH APPLICATION IN MANUFACTURING AND BIOMEDICAL ENGINEERING

    The quality of the output of a complex system is often recorded as multidimensional profile data with panel structure. In such a structure, the quality of each individual in the output is measured repeatedly based on time or other variables. In this dissertation, the quality profile data are modeled to address two types of problems: (a) to explore the underlying relationship between the parameter of interest in the complex system and the resulting quality under the condition that the principal mechanism is not fully known and (b) to quantify the uncertainties among the output. For the first type of problem, we consider a constrained semiparametric varying coefficient model. The system parameter of interest is treated as a covariate whose effect upon the resulting quality is modeled nonparametrically as a function of time. Any existing physicochemical knowledge related to other factors in the system that affect the resulting output quality is modeled parametrically as an additive term in the model. When expert knowledge about the effect of the parameter is available, constraints can be incorporated in the model such that the estimated effect aligns with the given knowledge. For the second type of problem, a mixed-effects model is developed to quantify the uncertainties among the output using random effects. These random effects can be utilized for anomaly detection or for variation quantification where deviation among individuals is of interest, depending on the context of the data. Three case studies from manufacturing and biomedical engineering domains are presented in the dissertation where the above two types of problems are discussed.
