A flexible regression model for count data
Poisson regression is a popular tool for modeling count data and is applied
in a vast array of applications from the social to the physical sciences and
beyond. Real data, however, are often over- or under-dispersed and, thus, not
conducive to Poisson regression. We propose a regression model based on the
Conway--Maxwell-Poisson (COM-Poisson) distribution to address this problem. The
COM-Poisson regression generalizes the well-known Poisson and logistic
regression models, and is suitable for fitting count data with a wide range of
dispersion levels. With a GLM approach that takes advantage of exponential
family properties, we discuss model estimation, inference, diagnostics, and
interpretation, and present a test for determining the need for a COM-Poisson
regression over a standard Poisson regression. We compare the COM-Poisson to
several alternatives and illustrate its advantages and usefulness using three
data sets with varying dispersion. Comment: Published in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) at http://dx.doi.org/10.1214/09-AOAS306 by the Institute of
Mathematical Statistics (http://www.imstat.org
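To make the dispersion claim concrete, here is a minimal sketch of the COM-Poisson probability mass function, P(Y = y) proportional to lambda^y / (y!)^nu, with the normalizing constant approximated by a truncated sum. The function names and the truncation point `max_terms` are assumptions made for illustration; they are not taken from the paper.

```python
import math

def com_poisson_table(lam, nu, max_terms=200):
    """Approximate COM-Poisson probabilities P(Y=j) for j = 0..max_terms-1.

    The infinite normalizing sum is truncated at max_terms (an assumption
    that is adequate only when the mass beyond that point is negligible).
    """
    # Work on the log scale to avoid overflow in lam**j and factorials.
    log_w = [j * math.log(lam) - nu * math.lgamma(j + 1) for j in range(max_terms)]
    m = max(log_w)
    log_z = m + math.log(sum(math.exp(v - m) for v in log_w))
    return [math.exp(v - log_z) for v in log_w]

def mean_and_variance(probs):
    """Mean and variance of a distribution given as a list of probabilities."""
    mean = sum(j * p for j, p in enumerate(probs))
    var = sum((j - mean) ** 2 * p for j, p in enumerate(probs))
    return mean, var

if __name__ == "__main__":
    for nu in (0.5, 1.0, 2.0):
        mean, var = mean_and_variance(com_poisson_table(3.0, nu))
        print(f"nu={nu}: mean={mean:.3f}, variance={var:.3f}")
```

With nu = 1 the table reduces to the ordinary Poisson distribution (mean equal to variance); nu < 1 gives over-dispersion and nu > 1 gives under-dispersion.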
A Poisson Ridge Regression Estimator
The standard statistical method for analyzing count data is the Poisson regression model, which is usually estimated using maximum likelihood (ML). The ML method is very sensitive to multicollinearity. Therefore, we present a new Poisson ridge regression estimator (PRR) as a remedy to the problem of instability of the traditional ML method. To investigate the performance of the PRR and the traditional ML approaches for estimating the parameters of the Poisson regression model, we calculate the mean squared error (MSE) using Monte Carlo simulations. The results of the simulation study show that the PRR method outperforms the traditional ML estimator in all of the different situations evaluated in this paper. Keywords: Poisson regression; maximum likelihood; ridge regression; MSE; Monte Carlo simulations; multicollinearity
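The idea of stabilizing a Poisson fit under multicollinearity can be sketched as an L2-penalized maximum likelihood fit, solved by iteratively reweighted least squares. This is an illustrative stand-in, not necessarily the PRR estimator defined in the paper: the function name, the penalty parameter `k`, and the choice to penalize every coefficient (including any intercept column) are assumptions.

```python
import numpy as np

def poisson_ridge_irls(X, y, k=0.0, n_iter=100, tol=1e-10):
    """Log-link Poisson regression with an L2 penalty (k/2)*||beta||^2.

    k = 0 gives the ordinary ML fit. All columns of X are penalized,
    including an intercept column if one is present (a simplification).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    p = X.shape[1]
    eta = np.log(y + 0.5)          # common GLM-style starting value
    beta = np.zeros(p)
    for _ in range(n_iter):
        mu = np.exp(eta)
        z = eta + (y - mu) / mu    # working response
        XtW = X.T * mu             # X' W with W = diag(mu)
        beta_new = np.linalg.solve(XtW @ X + k * np.eye(p), XtW @ z)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        eta = X @ beta
        if converged:
            break
    return beta

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x1 = rng.normal(size=200)
    x2 = x1 + 0.01 * rng.normal(size=200)   # nearly collinear with x1
    X = np.column_stack([np.ones(200), x1, x2])
    y = rng.poisson(np.exp(0.5 + 0.3 * x1 + 0.3 * x2))
    print("ML   :", poisson_ridge_irls(X, y, k=0.0))
    print("ridge:", poisson_ridge_irls(X, y, k=1.0))
```

Because the penalty shrinks the coefficient vector, the ridge fit has a norm no larger than the ML fit, which is what tames the unstable, offsetting coefficients that near-collinear predictors tend to produce.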
Multiple Approaches to Absenteeism Analysis
Absenteeism research has often been criticized for using inappropriate analysis. Characteristics of absence data, notably that it is usually truncated and skewed, violate assumptions of OLS regression; however, OLS and correlation analysis remain the dominant models of absenteeism research. This piece compares eight models that may be appropriate for analyzing absence data. Specifically, this piece discusses and uses OLS regression, OLS regression with a transformed dependent variable, the Tobit model, Poisson regression, Overdispersed Poisson regression, the Negative Binomial model, Ordinal Logistic regression, and the Ordinal Probit model. A simulation methodology is employed to determine the extent to which each model is likely to produce false positives. Simulations vary with respect to the shape of the dependent variable's distribution, sample size, and the shape of the independent variables' distributions. Actual data, based on a sample of 195 manufacturing employees, is used to illustrate how these models might be used to analyze a real data set. Results from the simulation suggest that, despite methodological expectations, OLS regression does not produce significantly more false positives than expected at various alpha levels. However, the Tobit and Poisson models are often shown to yield too many false positives. A number of other models yield fewer than the expected number of false positives, thus suggesting that they may serve well as conservative hypothesis tests.
Mixture of bivariate Poisson regression models with an application to insurance
In a recent paper, Bermúdez [2009] used bivariate Poisson regression models for ratemaking in car insurance, and included zero-inflated models to account for the excess of zeros and the overdispersion in the data set. In the present paper, we revisit this model in order to consider alternatives. We propose a 2-finite mixture of bivariate Poisson regression models to demonstrate that the overdispersion in the data requires more structure if it is to be taken into account, and that a simple zero-inflated bivariate Poisson model does not suffice. At the same time, we show that a finite mixture of bivariate Poisson regression models embraces zero-inflated bivariate Poisson regression models as a special case. Additionally, we describe a model in which the mixing proportions are dependent on covariates when modelling the way in which each individual belongs to a separate cluster. Finally, an EM algorithm is provided in order to ensure the models’ ease-of-fit. These models are applied to the same automobile insurance claims data set as used in Bermúdez [2009] and it is shown that the modelling of the data set can be improved considerably. Keywords: Zero-inflation, Overdispersion, EM algorithm, Automobile insurance, A priori ratemaking.
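The special-case claim can be written out in a short sketch. The notation below (mixing weight \pi, component rate vectors \boldsymbol{\lambda}^{(1)} and \boldsymbol{\lambda}^{(2)}, and \mathrm{BP} for the bivariate Poisson mass function) is assumed for illustration and may differ from the paper's own.

```latex
% Two-component finite mixture of bivariate Poisson distributions
f(y_1, y_2) = \pi \, \mathrm{BP}\!\left(y_1, y_2 \mid \boldsymbol{\lambda}^{(1)}\right)
            + (1 - \pi) \, \mathrm{BP}\!\left(y_1, y_2 \mid \boldsymbol{\lambda}^{(2)}\right)

% Letting every rate in the first component tend to zero collapses that
% component to a point mass at (0, 0), which gives the zero-inflated model:
\boldsymbol{\lambda}^{(1)} \to \mathbf{0}
\;\Longrightarrow\;
f(y_1, y_2) = \pi \, \mathbf{1}\{y_1 = 0,\ y_2 = 0\}
            + (1 - \pi) \, \mathrm{BP}\!\left(y_1, y_2 \mid \boldsymbol{\lambda}^{(2)}\right)
```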
Mean-parametrized Conway-Maxwell-Poisson regression models for dispersed counts
Conway-Maxwell-Poisson (CMP) distributions are flexible generalizations of
the Poisson distribution for modelling overdispersed or underdispersed counts.
The main hindrance to their wider use in practice seems to be the inability to
directly model the mean of counts, making them not compatible with nor
comparable to competing count regression models, such as the log-linear
Poisson, negative-binomial or generalized Poisson regression models. This note
illustrates how CMP distributions can be parametrized via the mean, so that
simpler and more easily-interpretable mean-models can be used, such as a
log-linear model. Other link functions are also available, of course. In
addition to establishing attractive theoretical and asymptotic properties of
the proposed model, its good finite-sample performance is exhibited through
various examples and a simulation study based on real datasets. Moreover, the
MATLAB routine to fit the model to data is demonstrated to be up to an order of
magnitude faster than the current software to fit standard CMP models, and over
two orders of magnitude faster than the recently proposed hyper-Poisson model. Comment: To appear in Statistical Modelling: An International Journa
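One way to picture a mean parametrization is to treat the rate lambda as a function of the target mean mu and the dispersion nu, found numerically so that the CMP mean equals mu. The sketch below does this by bisection on log(lambda) with a truncated series. The function names, the bracket, and the truncation point are assumptions for illustration; this is not the MATLAB routine mentioned in the abstract.

```python
import math

def cmp_mean(log_lam, nu, max_terms=500):
    """Mean of a CMP(lambda, nu) distribution, using a truncated series."""
    log_w = [j * log_lam - nu * math.lgamma(j + 1) for j in range(max_terms)]
    m = max(log_w)
    w = [math.exp(v - m) for v in log_w]
    return sum(j * wj for j, wj in enumerate(w)) / sum(w)

def lambda_for_mean(mu, nu, lo=-30.0, hi=30.0, n_iter=100):
    """Find lambda such that the CMP mean equals mu (bisection on log lambda).

    Relies on the mean being increasing in lambda; assumes mu is well
    below the truncation point used in cmp_mean.
    """
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if cmp_mean(mid, nu) < mu:
            lo = mid
        else:
            hi = mid
    return math.exp(0.5 * (lo + hi))

if __name__ == "__main__":
    for nu in (0.5, 1.0, 2.0):
        lam = lambda_for_mean(4.0, nu)
        print(f"nu={nu}: lambda={lam:.4f}, mean={cmp_mean(math.log(lam), nu):.4f}")
```

In a regression setting, mu would be supplied by a mean model such as a log-linear predictor, with lambda recovered from (mu, nu) in this way for each observation.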