232,094 research outputs found

    A generalized linear mixed model for longitudinal binary data with a marginal logit link function

    Longitudinal studies of a binary outcome are common in the health, social, and behavioral sciences. In general, a feature of random effects logistic regression models for longitudinal binary data is that the marginal functional form, when integrated over the distribution of the random effects, is no longer of logistic form. Recently, Wang and Louis [Biometrika 90 (2003) 765--775] proposed a random intercept model in the clustered binary data setting where the marginal model has a logistic form. An acknowledged limitation of their model is that it allows only a single random effect that varies from cluster to cluster. In this paper we propose a modification of their model to handle longitudinal data, allowing separate, but correlated, random intercepts at each measurement occasion. The proposed model allows for a flexible correlation structure among the random intercepts, where the correlations can be interpreted in terms of Kendall's $\tau$. For example, the marginal correlations among the repeated binary outcomes can decline with increasing time separation, while the model retains the property of having matching conditional and marginal logit link functions. Finally, the proposed method is used to analyze data from a longitudinal study designed to monitor cardiac abnormalities in children born to HIV-infected women. Comment: Published in at http://dx.doi.org/10.1214/10-AOAS390 the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org
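
    The remark that the marginal form is no longer logistic can be checked numerically. The sketch below is not the model from the paper: it assumes a single normally distributed random intercept (a common default, not stated in the abstract), and the names `marginal_prob`, `beta0`, `beta1`, and `sigma` are illustrative. It integrates the conditional logistic probability over the random intercept with Gauss-Hermite quadrature and looks at the resulting marginal logits.

```python
import numpy as np

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))

def logit(p):
    return np.log(p / (1.0 - p))

def marginal_prob(eta, sigma, n_nodes=40):
    """P(Y=1) after integrating expit(eta + b) over b ~ N(0, sigma^2).

    Gauss-Hermite nodes/weights target the weight exp(-t^2), so we
    substitute b = sqrt(2) * sigma * t and divide by sqrt(pi).
    """
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    b = np.sqrt(2.0) * sigma * nodes
    return float(np.sum(weights * expit(eta + b)) / np.sqrt(np.pi))

# Assumed (hypothetical) conditional model: logit P(Y=1 | b, x) = beta0 + beta1*x + b
beta0, beta1, sigma = 0.0, 1.0, 2.0
xs = np.array([0.0, 1.0, 2.0, 3.0])

marginal_logits = np.array(
    [logit(marginal_prob(beta0 + beta1 * x, sigma)) for x in xs]
)
# Change in the marginal logit per unit increase in x
steps = np.diff(marginal_logits)
print(marginal_logits, steps)
```

    Under these assumptions every per-unit step in the marginal logit is smaller than the conditional slope `beta1`, and the steps are not all equal, so the marginal model is not a logistic regression with the conditional coefficients.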

    On the correspondence from Bayesian log-linear modelling to logistic regression modelling with $g$-priors

    Consider a set of categorical variables where at least one of them is binary. The log-linear model that describes the counts in the resulting contingency table implies a specific logistic regression model, with the binary variable as the outcome. Within the Bayesian framework, the $g$-prior and mixtures of $g$-priors are commonly assigned to the parameters of a generalized linear model. We prove that assigning a $g$-prior (or a mixture of $g$-priors) to the parameters of a certain log-linear model designates a $g$-prior (or a mixture of $g$-priors) on the parameters of the corresponding logistic regression. By deriving an asymptotic result, and with numerical illustrations, we demonstrate that when a $g$-prior is adopted, this correspondence extends to the posterior distribution of the model parameters. Thus, it is valid to translate inferences from fitting a log-linear model to inferences within the logistic regression framework, with regard to the presence of main effects and interaction terms. Comment: 27 pages
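
    The step from a log-linear model to its implied logistic regression can be illustrated in the simplest case. The notation below is an assumption made for illustration, not taken from the paper: a two-way table with one binary variable $Y$ and one categorical variable $X$, expected cell counts $\mu_{xy}$, and a saturated log-linear model.

```latex
\log \mu_{xy} = \lambda + \lambda^{X}_{x} + \lambda^{Y}_{y} + \lambda^{XY}_{xy},
\qquad y \in \{0, 1\}

\operatorname{logit} P(Y = 1 \mid X = x)
  = \log \mu_{x1} - \log \mu_{x0}
  = \bigl(\lambda^{Y}_{1} - \lambda^{Y}_{0}\bigr)
  + \bigl(\lambda^{XY}_{x1} - \lambda^{XY}_{x0}\bigr)
```

    The terms that do not involve $Y$ cancel, so the $Y$ main effect plays the role of the logistic intercept and the $XY$ interaction plays the role of the effect of $X$.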

    A Fused Elastic Net Logistic Regression Model for Multi-Task Binary Classification

    Multi-task learning has been shown to significantly enhance the performance of multiple related learning tasks in a variety of situations. We present the fused logistic regression, a sparse multi-task learning approach for binary classification. Specifically, we introduce sparsity-inducing penalties over parameter differences of related logistic regression models to encode similarity across related tasks. The resulting joint learning task is cast into a form that lends itself to efficient optimization with a recursive variant of the alternating direction method of multipliers. We show results on synthetic data, describe the regime of settings where our multi-task approach achieves significant improvements over the single-task learning approach, and discuss the implications of applying the fused logistic regression in different real-world settings. Comment: 17 pages

    Collinearity diagnostics of binary logistic regression model

    Multicollinearity is a statistical phenomenon in which predictor variables in a logistic regression model are highly correlated. It is not uncommon when there are a large number of covariates in the model. Multicollinearity has been the thousand-pound monster in statistical modeling. Taming this monster has proven to be one of the great challenges of statistical modeling research. Multicollinearity can cause unstable estimates and inaccurate variances, which affect confidence intervals and hypothesis tests. The existence of collinearity inflates the variances of the parameter estimates and consequently leads to incorrect inferences about relationships between explanatory and response variables. Examining the correlation matrix may be helpful to detect multicollinearity, but it is not sufficient. Much better diagnostics are produced by linear regression with the options tolerance, VIF, condition indices, and variance proportions. For moderate to large sample sizes, the approach of dropping one of the correlated variables was found to be entirely satisfactory for reducing multicollinearity. In light of the different collinearity diagnostics, we may safely conclude that, without increasing the sample size, the second choice of omitting one of the correlated variables can reduce multicollinearity to a great extent.
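
    Two of the diagnostics named above, tolerance and VIF, are simple to compute directly. The sketch below is only an illustration on synthetic data, not the analysis from the paper; the function name `vif` and the variables `x1`, `x2`, `x3` are hypothetical. It relies on the standard identity that the VIF of predictor j, 1 / (1 - R_j^2), equals the j-th diagonal entry of the inverse of the predictors' correlation matrix.

```python
import numpy as np

def vif(X):
    """Variance inflation factors for the columns of X (rows = observations).

    VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing column j
    on the remaining columns; this equals diag(inv(corr(X)))[j].
    """
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

# Synthetic predictors: x2 is nearly a copy of x1, x3 is unrelated.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)
x3 = rng.normal(size=n)

X_all = np.column_stack([x1, x2, x3])
vifs_all = vif(X_all)
tolerance_all = 1.0 / vifs_all          # tolerance is the reciprocal of VIF

# Remedy discussed in the abstract: omit one of the correlated variables.
X_reduced = np.column_stack([x1, x3])
vifs_reduced = vif(X_reduced)

print("VIF (all predictors):", vifs_all)
print("tolerance (all predictors):", tolerance_all)
print("VIF (x2 dropped):", vifs_reduced)
```

    Because these diagnostics depend only on the predictors and not on the response, the same computation applies whether the outcome model is linear or logistic.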

    Climate change and adaptation of small-scale cattle and sheep farmers

    The main objective of this study was to investigate the factors that affected the decision of small-scale cattle and sheep farmers on whether or not to adapt to climate change. A binary logistic regression model was used to investigate the farmers' decision. The results implied that a large number of socio-economic variables affected the farmers' decision on adaptation to climate change. The study concluded that the most significant factors affecting climate change adaptation were non-farm income, type of weather perceived, livestock ownership, distance to weather stations, distance to input markets, adaptation choices and annual average temperature. Keywords: Climate change, small-scale cattle and sheep farming, Binary logistic model, Farm Management

    Generalized Extreme Value Regression for Binary Rare Events Data: an Application to Credit Defaults

    The most widely used regression model for a binary dependent variable is the logistic regression model. When the dependent variable represents a rare event, the logistic regression model shows relevant drawbacks. To overcome these drawbacks we propose the Generalized Extreme Value (GEV) regression model. In particular, in a Generalized Linear Model (GLM) with a binary dependent variable we suggest the quantile function of the GEV distribution as the link function, so that our attention is focused on the tail of the response curve for values close to one. The estimation procedure is the maximum likelihood method. This model accommodates skewness and generalizes GLMs with a log-log link function. In credit risk analysis a pivotal topic is the estimation of default probabilities. Since defaults are rare events, we apply the GEV regression to empirical data on Italian Small and Medium Enterprises (SMEs) to model their default probabilities.
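
    A link function built from the GEV quantile function can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes the standard GEV distribution function with location 0, scale 1 and shape `tau`, which may differ from the exact parametrization used in the paper, and the names `gev_response` and `gev_link` are hypothetical.

```python
import numpy as np

def gev_response(eta, tau):
    """Inverse link: standard GEV cdf at the linear predictor eta.

    Assumed form: exp(-(1 + tau*eta)^(-1/tau)) where 1 + tau*eta > 0.
    """
    eta = np.asarray(eta, dtype=float)
    if abs(tau) < 1e-12:
        # Gumbel limit as tau -> 0, i.e. the log-log link
        return np.exp(-np.exp(-eta))
    z = 1.0 + tau * eta
    safe = np.where(z > 0, z, 1.0)           # placeholder avoids invalid powers
    inside = np.exp(-safe ** (-1.0 / tau))
    # Outside the support the cdf is 0 (tau > 0) or 1 (tau < 0)
    boundary = 0.0 if tau > 0 else 1.0
    return np.where(z > 0, inside, boundary)

def gev_link(p, tau):
    """Link: GEV quantile function, mapping a probability to eta."""
    p = np.asarray(p, dtype=float)
    if abs(tau) < 1e-12:
        return -np.log(-np.log(p))
    return ((-np.log(p)) ** (-tau) - 1.0) / tau

eta_grid = np.linspace(-1.0, 2.0, 7)
p_grid = gev_response(eta_grid, tau=0.25)
eta_back = gev_link(p_grid, tau=0.25)        # should recover eta_grid
p_at_zero = float(gev_response(0.0, tau=0.25))
print(p_grid, eta_back, p_at_zero)
```

    Unlike the logit, this response curve is asymmetric: under the assumed parametrization a linear predictor of zero maps to exp(-1) rather than 0.5, and the shape parameter controls how quickly the curve approaches one.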