41,329 research outputs found
Regularization for Generalized Additive Mixed Models by Likelihood-Based Boosting
With the emergence of semi- and nonparametric regression, the
generalized linear mixed model has been expanded to account for additive predictors. This paper proposes an approach to variable selection that works for generalized additive mixed models. In contrast to common procedures, it can be used in high-dimensional settings where many covariates are available and the form of their influence is unknown. It is constructed as a componentwise boosting method and is therefore able to perform variable selection. The complexity of the resulting estimator is determined by information criteria. The method is investigated in simulation studies for binary and Poisson responses and is illustrated using real data sets.
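To illustrate why a componentwise boosting method performs variable selection, here is a minimal sketch in plain Python. It is not the authors' algorithm: it assumes a Gaussian response, linear base learners, and no random effects, and it replaces the information criteria with a fixed number of steps. The function name `componentwise_boost` and the parameters `steps` and `nu` are hypothetical.

```python
def componentwise_boost(X, y, steps=200, nu=0.1):
    """Componentwise least-squares boosting (illustrative sketch only).

    X is a list of rows, y a list of responses.
    Returns (intercept, coefs). Covariates that are never selected
    keep a coefficient of exactly 0.0.
    """
    n, p = len(X), len(X[0])
    intercept = sum(y) / n
    means = [sum(row[j] for row in X) / n for j in range(p)]
    # Work with centered columns so each base learner needs no intercept.
    cols = [[row[j] - means[j] for row in X] for j in range(p)]
    norms = [sum(v * v for v in c) for c in cols]
    coefs = [0.0] * p
    resid = [yi - intercept for yi in y]

    for _ in range(steps):
        best_j, best_b, best_gain = None, 0.0, 0.0
        for j in range(p):
            if norms[j] == 0.0:
                continue  # constant column carries no information
            dot = sum(c * r for c, r in zip(cols[j], resid))
            gain = dot * dot / norms[j]  # drop in residual sum of squares
            if gain > best_gain:
                best_j, best_b, best_gain = j, dot / norms[j], gain
        if best_j is None:
            break  # no covariate improves the fit
        # Update only the single best component, shrunk by nu.
        coefs[best_j] += nu * best_b
        resid = [r - nu * best_b * c for r, c in zip(resid, cols[best_j])]

    # Convert from centered columns back to the original scale.
    intercept -= sum(b * m for b, m in zip(coefs, means))
    return intercept, coefs
```

Because each iteration updates only the best-fitting covariate, stopping early leaves unhelpful covariates at zero, which is the selection effect the abstract refers to.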
Predicting time to graduation at a large enrollment American university
The time it takes a student to graduate with a university degree is influenced
by a variety of factors, such as their background, their academic performance at
university, and their integration into the social communities of the university
they attend. Universities differ in their populations, student
services, instruction styles, and degree programs, but they all collect
institutional data. This study presents data for 160,933 students attending a
large American research university. The data include performance, enrollment,
demographic, and preparation features. Discrete-time hazard models for
time to graduation are presented in the context of Tinto's Theory of Drop Out.
Additionally, a novel machine learning method, gradient boosted trees, is
applied and compared to the typical maximum likelihood method. We demonstrate
that enrollment factors (such as changing a major) lead to greater increases in
the model's ability to predict when a student graduates than performance
factors (such as grades) or preparation factors (such as high school GPA).
Comment: 28 pages, 11 figures
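As a rough illustration of what a discrete-time hazard is, the sketch below computes the simplest covariate-free estimate: in each term, the share of still-enrolled students who graduate. This is an assumption-laden toy, not the study's model, which includes covariates. The function name `discrete_hazard` and the toy data are hypothetical.

```python
from collections import Counter

def discrete_hazard(durations, events):
    """Life-table estimate of the discrete-time hazard (sketch only).

    durations[i] is the last term in which student i was observed.
    events[i] is 1 if the student graduated in that term, 0 if the
    record is censored (e.g. still enrolled or left without a degree).
    Returns {term: hazard}, where hazard = graduations / students at risk.
    """
    graduations = Counter(t for t, e in zip(durations, events) if e)
    hazards = {}
    for t in range(1, max(durations) + 1):
        # A student is at risk in term t if observed at least until t.
        at_risk = sum(1 for d in durations if d >= t)
        hazards[t] = graduations.get(t, 0) / at_risk
    return hazards

# Hypothetical toy data: six students, two of them censored.
toy_durations = [8, 8, 9, 10, 10, 6]
toy_events = [1, 1, 1, 1, 0, 0]
toy_hazards = discrete_hazard(toy_durations, toy_events)
```

A regression version would expand the data to one row per student per term and fit a binary model for graduation in that term.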
Variable Selection for Generalized Linear Mixed Models by L1-Penalized Estimation
Generalized linear mixed models are a widely used tool for modeling longitudinal data. However, their use is typically restricted to a few covariates, because the presence of many predictors yields unstable estimates. The presented approach to fitting generalized linear mixed
models includes an L1-penalty term that enforces variable selection and shrinkage simultaneously. A gradient ascent algorithm is proposed that maximizes the penalized log-likelihood, yielding models with reduced complexity. In contrast to common procedures, it can be used in high-dimensional settings where a large number of potentially influential explanatory variables is available. The method is investigated in simulation studies and illustrated using real data sets.
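To show how an L1 penalty produces selection and shrinkage at once, here is a minimal sketch that uses proximal gradient descent (soft-thresholding) on a plain least-squares problem. It is a swapped-in technique, not the paper's gradient ascent algorithm, and it assumes a Gaussian response with no random effects. The names `soft_threshold` and `lasso_ista` and the parameters `lam` and `step` are hypothetical.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_ista(X, y, lam, step, iters=500):
    """Minimize (1/(2n)) * ||y - X beta||^2 + lam * ||beta||_1 (sketch only).

    step should not exceed 1 / (largest eigenvalue of X'X / n),
    otherwise the iteration may fail to converge.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        resid = [yi - sum(b * x for b, x in zip(beta, row))
                 for row, yi in zip(X, y)]
        # Gradient of the smooth least-squares part of the objective.
        grad = [-sum(row[j] * r for row, r in zip(X, resid)) / n
                for j in range(p)]
        # Gradient step followed by the L1 proximal step.
        beta = [soft_threshold(b - step * g, step * lam)
                for b, g in zip(beta, grad)]
    return beta
```

On a toy design with `X = [[1, 0], [0, 1], [1, 0], [0, 1]]`, `y = [3, 0.1, 3, 0.1]`, `lam = 0.5`, and `step = 1.0`, the strong first coefficient is shrunk from 3 to 2 while the weak second coefficient is set exactly to zero.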