A generalized linear mixed model for longitudinal binary data with a marginal logit link function
Longitudinal studies of a binary outcome are common in the health, social,
and behavioral sciences. In general, a feature of random effects logistic
regression models for longitudinal binary data is that the marginal functional
form, when integrated over the distribution of the random effects, is no longer
of logistic form. Recently, Wang and Louis [Biometrika 90 (2003) 765--775]
proposed a random intercept model in the clustered binary data setting where
the marginal model has a logistic form. An acknowledged limitation of their
model is that it allows only a single random effect that varies from cluster to
cluster. In this paper we propose a modification of their model to handle
longitudinal data, allowing separate, but correlated, random intercepts at each
measurement occasion. The proposed model allows for a flexible correlation
structure among the random intercepts, where the correlations can be
interpreted in terms of Kendall's τ. For example, the marginal
correlations among the repeated binary outcomes can decline with increasing
time separation, while the model retains the property of having matching
conditional and marginal logit link functions. Finally, the proposed method is
used to analyze data from a longitudinal study designed to monitor cardiac
abnormalities in children born to HIV-infected women.
Comment: Published at http://dx.doi.org/10.1214/10-AOAS390 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
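
The abstract's opening point, that a random-intercept logistic model is generally no longer logistic once the random effect is integrated out, can be checked numerically. The sketch below is not the authors' model: it assumes a single normally distributed random intercept with a hypothetical standard deviation `sigma` and approximates the marginal probability with a simple trapezoid rule. If the marginal model were logistic in the same linear predictor, the ratio of marginal to conditional logit would be constant across `eta`.

```python
import math

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    return math.log(p / (1.0 - p))

def marginal_prob(eta, sigma, n=4001, width=8.0):
    """Trapezoid-rule approximation of E[expit(eta + b)] with b ~ N(0, sigma^2).

    The normal random intercept is an illustrative assumption.
    """
    lo, hi = -width * sigma, width * sigma
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        b = lo + i * h
        dens = math.exp(-0.5 * (b / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        weight = 0.5 if i in (0, n - 1) else 1.0  # trapezoid end-point weights
        total += weight * expit(eta + b) * dens
    return total * h

if __name__ == "__main__":
    sigma = 2.0  # hypothetical random-intercept standard deviation
    for eta in (0.5, 1.0, 2.0, 4.0):
        m = logit(marginal_prob(eta, sigma))
        # The marginal logit is pulled toward zero relative to eta.
        print(f"eta={eta:4.1f}  marginal logit={m:.4f}  ratio={m / eta:.4f}")
```

The printed ratios show how far the marginal logit is attenuated at each value of the conditional linear predictor.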
On the correspondence from Bayesian log-linear modelling to logistic regression modelling with g-priors
Consider a set of categorical variables where at least one of them is binary.
The log-linear model that describes the counts in the resulting contingency
table implies a specific logistic regression model, with the binary variable as
the outcome. Within the Bayesian framework, the g-prior and mixtures of
g-priors are commonly assigned to the parameters of a generalized linear
model. We prove that assigning a g-prior (or a mixture of g-priors) to the
parameters of a certain log-linear model designates a g-prior (or a mixture
of g-priors) on the parameters of the corresponding logistic regression. By
deriving an asymptotic result, and with numerical illustrations, we demonstrate
that when a g-prior is adopted, this correspondence extends to the posterior
distribution of the model parameters. Thus, it is valid to translate inferences
from fitting a log-linear model to inferences within the logistic regression
framework, with regard to the presence of main effects and interaction terms.
Comment: 27 page
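
The statement that a log-linear model for the cell counts implies a logistic regression for the binary variable can be illustrated with a small numerical check. The sketch below covers only this likelihood-level correspondence, not the g-prior result. It assumes a two-way table with binary Y and a three-level X, corner-point constraints, and made-up parameter values. Under those assumptions the log-odds of Y = 1 at each level of X equal the Y main effect plus the XY interaction.

```python
import math

# Hypothetical log-linear parameters; level-0 terms are fixed at 0
# (corner-point constraints). All numbers are illustrative only.
lam0 = 2.0
lam_Y = {0: 0.0, 1: -0.4}
lam_X = {0: 0.0, 1: 0.7, 2: -0.3}
lam_XY = {(y, x): 0.0 for y in (0, 1) for x in (0, 1, 2)}
lam_XY[(1, 1)] = 0.9
lam_XY[(1, 2)] = -0.5

def expected_count(y, x):
    """Expected cell count under the log-linear model."""
    return math.exp(lam0 + lam_Y[y] + lam_X[x] + lam_XY[(y, x)])

def implied_logit(x):
    """Log-odds of Y = 1 given X = x, computed from the expected counts."""
    return math.log(expected_count(1, x) / expected_count(0, x))

def logistic_linear_predictor(x):
    """Logistic regression predictor read off the log-linear parameters."""
    return lam_Y[1] + lam_XY[(1, x)]

if __name__ == "__main__":
    for x in (0, 1, 2):
        print(x, implied_logit(x), logistic_linear_predictor(x))
```

The intercept and the X main effects cancel in the ratio of counts, which is why only terms involving Y survive in the logistic regression.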
A Fused Elastic Net Logistic Regression Model for Multi-Task Binary Classification
Multi-task learning has been shown to significantly enhance the performance of
multiple related learning tasks in a variety of situations. We present the
fused logistic regression, a sparse multi-task learning approach for binary
classification. Specifically, we introduce sparsity inducing penalties over
parameter differences of related logistic regression models to encode
similarity across related tasks. The resulting joint learning task is cast into
a form that lends itself to efficient optimization with a recursive variant
of the alternating direction method of multipliers. We show results on
synthetic data, describe the regime of settings where our multi-task
approach achieves significant improvements over the single-task learning
approach, and discuss the implications of applying the fused logistic regression
in different real-world settings.
Comment: 17 page
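
The idea of penalizing parameter differences between related tasks can be sketched as an objective function. The abstract does not give the exact penalty, so the version below is a simplification under stated assumptions: each task has its own weight vector, related tasks are listed as hypothetical index pairs in `edges`, and the penalty is a plain L1 norm on the differences weighted by `lam`. It only evaluates the objective and does not implement the ADMM solver mentioned in the abstract.

```python
import math

def logistic_loss(w, X, y):
    """Mean logistic loss for one task; labels y are in {-1, +1}."""
    total = 0.0
    for xi, yi in zip(X, y):
        margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
        # Numerically stable evaluation of log(1 + exp(-margin)).
        if margin > 0:
            total += math.log1p(math.exp(-margin))
        else:
            total += -margin + math.log1p(math.exp(margin))
    return total / len(y)

def fused_penalty(weights, edges):
    """Sum of L1 distances between weight vectors of related tasks."""
    return sum(
        sum(abs(a - b) for a, b in zip(weights[s], weights[t]))
        for s, t in edges
    )

def fused_objective(weights, tasks, edges, lam):
    """Per-task logistic losses plus lam times the fused L1 penalty."""
    data_term = sum(logistic_loss(w, X, y) for w, (X, y) in zip(weights, tasks))
    return data_term + lam * fused_penalty(weights, edges)

if __name__ == "__main__":
    # Two toy tasks with two features each (made-up data).
    tasks = [
        ([[1.0, 0.5], [-1.0, 0.2]], [1, -1]),
        ([[0.8, 0.4], [-1.2, 0.1]], [1, -1]),
    ]
    weights = [[1.0, 0.0], [0.5, 0.0]]
    print(fused_objective(weights, tasks, edges=[(0, 1)], lam=0.1))
```

Because the penalty is an L1 norm on differences, it is zero when related tasks share identical weights, which is what lets the model tie tasks together exactly rather than only shrink them toward each other.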
Collinearity diagnostics of binary logistic regression model
Multicollinearity is a statistical phenomenon in which predictor variables in a logistic regression model are highly correlated. It is not uncommon when the model contains a large number of covariates. Multicollinearity has been the thousand-pound monster of statistical modeling, and taming it has proven to be one of the great challenges of statistical modeling research. Multicollinearity can cause unstable estimates and inaccurate variances, which affect confidence intervals and hypothesis tests. Collinearity inflates the variances of the parameter estimates and consequently leads to incorrect inferences about relationships between explanatory and response variables. Examining the correlation matrix may help to detect multicollinearity, but it is not sufficient. Much better diagnostics are produced by linear regression with the options tolerance, VIF, condition indices and variance proportions. For moderate to large sample sizes, dropping one of the correlated variables was found to be entirely satisfactory for reducing multicollinearity. In light of the different collinearity diagnostics, we may safely conclude that, without increasing the sample size, the second choice of omitting one of the correlated variables can reduce multicollinearity to a great extent.
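
The tolerance and VIF diagnostics named in the abstract can be computed from the predictors alone. The sketch below is not the software output the abstract refers to. It uses the standard identity that the variance inflation factors are the diagonal entries of the inverse of the predictors' correlation matrix, with tolerance equal to 1/VIF. The helper names and the simulated data are hypothetical.

```python
import math
import random

def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long columns."""
    n = len(columns[0])
    scaled = []
    for col in columns:
        mean = sum(col) / n
        centered = [v - mean for v in col]
        norm = math.sqrt(sum(v * v for v in centered))
        scaled.append([v / norm for v in centered])
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in scaled] for ci in scaled]

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (small matrices only)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vif(columns):
    """Variance inflation factor of each predictor column."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]

if __name__ == "__main__":
    rng = random.Random(0)
    x1 = [rng.gauss(0, 1) for _ in range(200)]
    x2 = [rng.gauss(0, 1) for _ in range(200)]
    # x3 is nearly a linear combination of x1 and x2 (made-up example).
    x3 = [a + b + 0.05 * rng.gauss(0, 1) for a, b in zip(x1, x2)]
    for name, value in zip(("x1", "x2", "x3"), vif([x1, x2, x3])):
        print(f"{name}: VIF={value:.1f}  tolerance={1.0 / value:.4f}")
```

Dropping `x3` and recomputing `vif([x1, x2])` mirrors the remedy described in the abstract: the remaining predictors should then show VIF values close to 1.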
Climate change and adaptation of small-scale cattle and sheep farmers
The main objective of this study was to investigate the factors that affected the decision of small-scale cattle and sheep farmers on whether or not to adapt to climate change. The Binary Logistic Regression model was used to investigate farmers’ decisions. The results implied that a large number of socio-economic variables affected the decision of farmers on adaptation to climate change. The study concluded that the most significant factors affecting adaptation to climate change were non-farm income, type of weather perceived, livestock ownership, distance to weather stations, distance to input markets, adaptation choices and annual average temperature. Keywords: Climate change, small-scale cattle and sheep farming, Binary logistic model, Farm Management
Generalized Extreme Value Regression for Binary Rare Events Data: an Application to Credit Defaults
The most widely used regression model for a binary dependent variable is the logistic regression model. When the dependent variable represents a rare event, the logistic regression model shows relevant drawbacks. To overcome these drawbacks we propose the Generalized Extreme Value (GEV) regression model. In particular, in a Generalized Linear Model (GLM) with a binary dependent variable we suggest the quantile function of the GEV distribution as the link function, so that attention is focused on the tail of the response curve for values close to one. Estimation is by maximum likelihood. This model accommodates skewness and generalizes GLMs with a log-log link function. A pivotal topic in credit risk analysis is the estimation of default probability. Since defaults are rare events, we apply the GEV regression to empirical data on Italian Small and Medium Enterprises (SMEs) to model their default probabilities.
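
Using the GEV quantile function as the link means the response curve is the GEV distribution function evaluated at the linear predictor. The sketch below assumes the common parameterization with location 0, scale 1 and a shape parameter called `tau` here, which may differ from the paper's notation. As `tau` approaches 0 the curve reduces to the log-log response exp(-exp(-eta)), and unlike the logistic curve it is asymmetric around eta = 0.

```python
import math

def gev_response(eta, tau):
    """GEV distribution function at eta (location 0, scale 1, shape tau).

    The parameterization is an illustrative assumption.
    """
    if abs(tau) < 1e-12:
        return math.exp(-math.exp(-eta))  # log-log (Gumbel) limit
    t = 1.0 + tau * eta
    if t <= 0.0:
        # Outside the support: below the lower bound when tau > 0,
        # above the upper bound when tau < 0.
        return 0.0 if tau > 0 else 1.0
    log_term = -math.log(t) / tau
    if log_term > 700.0:  # exp would overflow; the probability is effectively 0
        return 0.0
    return math.exp(-math.exp(log_term))

def logistic_response(eta):
    return 1.0 / (1.0 + math.exp(-eta))

if __name__ == "__main__":
    for eta in (-1.0, 0.0, 1.0, 3.0):
        print(eta, logistic_response(eta), gev_response(eta, -0.25), gev_response(eta, 0.25))
```

Comparing the columns shows how the shape parameter changes the rate at which the curve approaches one, which is the region the abstract highlights.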