On the correspondence from Bayesian log-linear modelling to logistic regression modelling with g-priors
Consider a set of categorical variables where at least one of them is binary.
The log-linear model that describes the counts in the resulting contingency
table implies a specific logistic regression model, with the binary variable as
the outcome. Within the Bayesian framework, the g-prior and mixtures of
g-priors are commonly assigned to the parameters of a generalized linear
model. We prove that assigning a g-prior (or a mixture of g-priors) to the
parameters of a certain log-linear model designates a g-prior (or a mixture
of g-priors) on the parameters of the corresponding logistic regression. By
deriving an asymptotic result, and with numerical illustrations, we demonstrate
that when a g-prior is adopted, this correspondence extends to the posterior
distribution of the model parameters. Thus, it is valid to translate inferences
from fitting a log-linear model to inferences within the logistic regression
framework, with regard to the presence of main effects and interaction terms.
Comment: 27 pages
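The classical (non-Bayesian) version of this correspondence can be checked by hand on a 2x2 table. The sketch below is only an illustration of the maximum-likelihood relationship, not of the g-prior result proved in the paper. The counts and the function names `loglinear_saturated` and `logistic_saturated` are hypothetical, and both models use corner-point (reference-cell) constraints.

```python
import math

# Hypothetical 2x2 contingency table: counts[x][y],
# where x is a binary covariate and y is the binary outcome.
counts = [[30, 10],
          [15, 45]]

def loglinear_saturated(n):
    """ML estimates of the saturated log-linear model
    log mu[x][y] = lam + lam_X*x + lam_Y*y + lam_XY*x*y
    under corner-point constraints (closed form for a 2x2 table)."""
    lam = math.log(n[0][0])
    lam_x = math.log(n[1][0] / n[0][0])
    lam_y = math.log(n[0][1] / n[0][0])
    lam_xy = math.log(n[0][0] * n[1][1] / (n[0][1] * n[1][0]))
    return lam, lam_x, lam_y, lam_xy

def logistic_saturated(n):
    """ML estimates of logit P(y=1 | x) = beta0 + beta1*x for the same table.
    With a single binary covariate the fit reproduces the observed odds."""
    beta0 = math.log(n[0][1] / n[0][0])          # log odds of y=1 when x=0
    beta1 = math.log(n[1][1] / n[1][0]) - beta0  # log odds ratio
    return beta0, beta1

lam, lam_x, lam_y, lam_xy = loglinear_saturated(counts)
beta0, beta1 = logistic_saturated(counts)

# Terms involving the outcome y carry over to the logistic regression:
# the y main effect becomes the intercept, the x:y interaction becomes the slope.
print(f"lam_Y  = {lam_y:.4f}, beta0 = {beta0:.4f}")
print(f"lam_XY = {lam_xy:.4f}, beta1 = {beta1:.4f}")
```

Terms of the log-linear model that do not involve the outcome (here `lam` and `lam_x`) drop out of the logistic regression, which is why only main effects and interactions involving the binary variable translate.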
Analyzing Temperature Effects on Mortality Within the R Environment: The Constrained Segmented Distributed Lag Parameterization
Here we present and discuss the R package modTempEff, which includes a set of functions for modelling temperature effects on mortality with time series data. The functions fit a particular log-linear model that captures the two main features of mortality-temperature relationships: nonlinearity and distributed lag effects. Penalized splines and segmented regression constitute the core of the modelling framework. We briefly review the model and illustrate the functions using a simulated dataset.
Mean-parametrized Conway-Maxwell-Poisson regression models for dispersed counts
Conway-Maxwell-Poisson (CMP) distributions are flexible generalizations of
the Poisson distribution for modelling overdispersed or underdispersed counts.
The main hindrance to their wider use in practice seems to be the inability to
directly model the mean of counts, making them not compatible with nor
comparable to competing count regression models, such as the log-linear
Poisson, negative-binomial or generalized Poisson regression models. This note
illustrates how CMP distributions can be parametrized via the mean, so that
simpler and more easily-interpretable mean-models can be used, such as a
log-linear model. Other link functions are also available, of course. In
addition to establishing attractive theoretical and asymptotic properties of
the proposed model, its good finite-sample performance is exhibited through
various examples and a simulation study based on real datasets. Moreover, the
MATLAB routine to fit the model to data is demonstrated to be up to an order of
magnitude faster than the current software to fit standard CMP models, and over
two orders of magnitude faster than the recently proposed hyper-Poisson model.
Comment: To appear in Statistical Modelling: An International Journal
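The idea of parametrizing a CMP distribution by its mean can be sketched numerically. The code below is a minimal illustration, not the authors' MATLAB routine. It assumes the standard CMP form P(Y = y) proportional to lambda^y / (y!)^nu, truncates the infinite sum at a hypothetical cutoff `max_y`, and finds the rate lambda that yields a requested mean `mu` by bisection. All function names are made up for this example.

```python
import math

def cmp_mean(log_lam, nu, max_y=500):
    """Mean of a CMP(lambda, nu) distribution, with the support truncated at max_y.
    Weights are computed on the log scale to avoid overflow."""
    logs = [y * log_lam - nu * math.lgamma(y + 1) for y in range(max_y + 1)]
    top = max(logs)
    weights = [math.exp(v - top) for v in logs]
    total = sum(weights)
    return sum(y * w for y, w in enumerate(weights)) / total

def solve_log_lambda(mu, nu, lo=-30.0, hi=30.0, tol=1e-12):
    """Find log(lambda) such that the CMP mean equals mu.
    The mean is increasing in log(lambda), so plain bisection is enough."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if cmp_mean(mid, nu) < mu:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# With nu = 1 the CMP reduces to the Poisson, so lambda should equal the mean.
lam_poisson = math.exp(solve_log_lambda(3.0, 1.0))

# With nu = 2 (underdispersion) a different lambda gives the same mean of 3.
log_lam_under = solve_log_lambda(3.0, 2.0)
print(lam_poisson, math.exp(log_lam_under), cmp_mean(log_lam_under, 2.0))
```

In a regression setting, `mu` would come from a mean model such as a log-linear predictor, and this inner solve would be repeated for each observation. The truncation point only needs to sit well beyond the bulk of the distribution.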
llc: a collection of R functions for fitting a class of Lee-Carter mortality models using iterative fitting algorithms
We implement a specialised iterative regression methodology in R for the analysis of age-period mortality data based on a class of generalised Lee-Carter (LC) type modelling structures. The LC-based modelling framework is viewed in the current literature as among the most efficient and transparent methods of modelling and projecting mortality improvements. We therefore adopt the modelling approach discussed in Renshaw and Haberman (2006), which extends the basic LC model and uses a tailored iterative process to generate parameter estimates based on the Poisson likelihood. Building on this methodology, we develop and implement a stratified LC model for measuring the additive effect, on the log scale, of an explanatory factor (other than age and time). This modelling methodology is implemented in a publicly available collection of programming functions that facilitate both the preparation of mortality data and the fitting and analysis of the given log-linear modelling structures. The package also incorporates methods to produce forecasts of future mortality rates and to compute the corresponding future life expectancy.
Compositional data for global monitoring: the case of drinking water and sanitation
Introduction
At a global level, access to safe drinking water and sanitation has been monitored by the Joint Monitoring Programme (JMP) of WHO and UNICEF. The methods employed are based on analysis of data from household surveys and linear regression modelling of these results over time. However, there is evidence of non-linearity in the JMP data. In addition, the compositional nature of these data is not taken into consideration. This article seeks to address these two shortcomings in order to produce more accurate estimates.
Methods
We employed an isometric log-ratio transformation designed for compositional data. We applied linear and non-linear time regressions to both the original and the transformed data. Specifically, different modelling alternatives for non-linear trajectories were analysed, all of which are based on a generalized additive model (GAM).
Results and discussion
Non-linear methods, such as GAM, may be used for modelling non-linear trajectories in the JMP data. This projection method is particularly suited for data-rich countries. Moreover, the ilr transformation of compositional data is conceptually sound and fairly simple to implement. It helps improve the performance of both linear and non-linear regression models, particularly when extreme data points occur, i.e. when coverage rates are near either 0% or 100%.
Peer reviewed. Postprint (author's final draft)
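For readers unfamiliar with the isometric log-ratio (ilr) transformation mentioned in the abstract above, here is a minimal sketch. It assumes one common choice of orthonormal basis (the sequential "pivot" basis). The abstract does not say which basis the authors used, and the three-part coverage composition is hypothetical.

```python
import math

def ilr(parts):
    """ilr coordinates of a composition with D strictly positive parts.
    Coordinate i compares the geometric mean of the first i parts
    with part i+1, scaled so that the basis is orthonormal."""
    if any(p <= 0 for p in parts):
        raise ValueError("all parts must be strictly positive")
    logs = [math.log(p) for p in parts]
    coords = []
    running = 0.0  # running sum of the logs of the first i parts
    for i in range(1, len(logs)):
        running += logs[i - 1]
        coords.append(math.sqrt(i / (i + 1)) * (running / i - logs[i]))
    return coords

def clr(parts):
    """Centred log-ratio coordinates, used here only to check the isometry."""
    logs = [math.log(p) for p in parts]
    mean = sum(logs) / len(logs)
    return [v - mean for v in logs]

# Hypothetical coverage shares: piped water, other improved source, unimproved source.
shares = [0.55, 0.30, 0.15]
z = ilr(shares)

# The transformation is isometric: squared norms of ilr and clr coordinates agree.
norm_ilr = sum(v * v for v in z)
norm_clr = sum(v * v for v in clr(shares))
print(z, norm_ilr, norm_clr)
```

The D-part composition maps to D-1 unconstrained real coordinates, so ordinary regression (linear or GAM) can be applied to them over time and the fitted values mapped back to shares that stay within 0% and 100%. The coordinates depend only on ratios between parts, so rescaling the input (for example percentages instead of proportions) leaves them unchanged.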
Comparison between quantile regression technique and generalised additive model for regional flood frequency analysis : a case study for Victoria, Australia
For design flood estimation in ungauged catchments, Regional Flood Frequency Analysis (RFFA) is commonly used. Most RFFA methods are based on linear modelling approaches, which do not account for the inherent nonlinearity of rainfall-runoff processes. Using data from 114 catchments in Victoria, Australia, this study employs the Generalised Additive Model (GAM) in RFFA and compares the results with a linear method known as the Quantile Regression Technique (QRT). The GAM is found to perform better for smaller return periods (i.e., 2, 5 and 10 years), with a median relative error ranging from 16% to 41%. For higher return periods (i.e., 20, 50 and 100 years), the log-log linear regression model (QRT) outperforms the GAM, with a median relative error ranging from 31% to 59%.
Automated General-to-Specific (GETS) regression modeling and indicator saturation methods for the detection of outliers and structural breaks
This paper provides an overview of the R package gets, which contains facilities for automated General-to-Specific (GETS) modelling of the mean and variance of a regression, and Indicator Saturation (IS) methods for the detection and modelling of outliers and structural breaks. The mean can be specified as an autoregressive model with covariates (an ‘AR-X’ model), and the variance can be specified as an autoregressive log-variance model with covariates (a ‘log-ARCH-X’ model). The covariates in the two specifications need not be the same, and the classical linear regression model is obtained as a special case when there are no dynamics and no covariates in the variance equation. The four main functions of the package are arx, getsm, getsv and isat. The first function estimates an AR-X model with log-ARCH-X errors. The second function undertakes GETS modelling of the mean specification of an arx object. The third function undertakes GETS modelling of the log-variance specification of an arx object. The fourth function undertakes GETS modelling of an indicator-saturated mean specification, allowing for the detection of outliers and structural breaks. The usage of two convenience functions for exporting results to EViews and STATA is illustrated, and LaTeX code for the estimation output can readily be generated.
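User-defined design matrices in log-linear analyses: implementation options in the SPSS procedures GENLOG and LOGLINEAR [Benutzerdefinierte Design-Matrizen in log-linearen Analysen: Realisierungsmöglichkeiten in den SPSS-Prozeduren GENLOG und LOGLINEAR]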
'The application of log-linear models in social research is often hampered by the notion that these models are rather complicated and therefore hard to interpret. Log-linear analyses become easier to understand once their kinship with multiple regression on nominally scaled predictors is recognized. This view also conveys the importance of the so-called design matrix, since the full flexibility of log-linear models is only achieved by formulating user-defined design matrices. Using example data from the ALLBUS 1996, it is shown how log-linear analyses with user-defined design matrices can be carried out with the SPSS procedures GENLOG or LOGLINEAR.' (author's abstract, translated from German)
'Applications of log-linear modelling are sometimes prevented by the impression that this technique is not user-friendly. Nevertheless, log-linear modelling is nothing more than multiple regression of the logarithms of cell counts on categorical predictors. Within this view the importance of the design matrix is easy to understand. The specification of user-defined design matrices within log-linear models allows for very flexible analyses of categorical data. It is shown how such analyses can be done using the SPSS procedures GENLOG or LOGLINEAR. An empirical example is given based on data from the ALLBUS 1996.' (author's abstract)