Estimation and Regularization Techniques for Regression Models with Multidimensional Prediction Functions
Boosting is one of the most important methods for fitting
regression models and building prediction rules from
high-dimensional data. A notable feature of boosting is that the
technique has a built-in mechanism for shrinking coefficient
estimates and variable selection. This regularization mechanism
makes boosting a suitable method for analyzing data characterized by
small sample sizes and large numbers of predictors. We extend the
existing methodology by developing a boosting method for prediction
functions with multiple components. Such multidimensional functions
occur in many types of statistical models, for example in count data
models and in models involving outcome variables with a mixture
distribution. As will be demonstrated, the new algorithm is suitable
for both the estimation of the prediction function and
regularization of the estimates. In addition, nuisance parameters
can be estimated simultaneously with the prediction function.
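The shrinkage and variable-selection mechanism mentioned in this abstract can be illustrated with classical component-wise L2 boosting for a one-dimensional linear predictor. This is a minimal sketch under stated assumptions, not the multidimensional algorithm developed in the paper: the function name componentwise_l2_boost, the step length nu, and the fixed number of steps n_steps are illustrative choices.

```python
def componentwise_l2_boost(X, y, n_steps=150, nu=0.1):
    """Component-wise L2 boosting for a linear model (illustrative sketch).

    X is a list of rows, y a list of responses. In every step only the single
    predictor that best fits the current residuals is updated, and only by a
    fraction nu of its least-squares coefficient. Predictors that are never
    chosen keep a coefficient of exactly zero.
    """
    n, p = len(X), len(X[0])
    y_mean = sum(y) / n
    # Center each column so that the intercept can be handled separately.
    means = [sum(row[j] for row in X) / n for j in range(p)]
    cols = [[row[j] - means[j] for row in X] for j in range(p)]
    col_ss = [sum(v * v for v in col) for col in cols]

    coef = [0.0] * p
    resid = [yi - y_mean for yi in y]
    for _ in range(n_steps):
        best_j, best_beta, best_gain = None, 0.0, 0.0
        for j in range(p):
            if col_ss[j] == 0.0:
                continue  # constant column carries no information
            dot = sum(c * r for c, r in zip(cols[j], resid))
            beta = dot / col_ss[j]   # least-squares fit to the residuals
            gain = beta * dot        # resulting drop in residual sum of squares
            if gain > best_gain:
                best_j, best_beta, best_gain = j, beta, gain
        if best_j is None:
            break  # no predictor improves the fit any further
        coef[best_j] += nu * best_beta
        resid = [r - nu * best_beta * c for r, c in zip(resid, cols[best_j])]

    intercept = y_mean - sum(c * m for c, m in zip(coef, means))
    return intercept, coef
```

Stopping after a small number of steps leaves many coefficients at zero and the rest shrunken toward zero, which is the regularization effect described above. In practice the number of steps would typically be chosen by cross-validation or a similar criterion.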
The mortality of the Italian population: Smoothing techniques on the Lee-Carter model
Several approaches have been developed for forecasting mortality using
stochastic models. In particular, the Lee-Carter model has become widely used
and there have been various extensions and modifications proposed to attain a
broader interpretation and to capture the main features of the dynamics of the
mortality intensity. Hyndman and Ullah present a particular version of the
Lee-Carter methodology, the so-called Functional Demographic Model, which is
one of the most accurate approaches for some mortality data, particularly for
longer forecast horizons where the benefit of a damped trend forecast is
greater. The objective of the paper is to identify which of the two models,
the basic Lee-Carter model or the Functional Demographic Model, is better
suited to the Italian mortality data. A comparative assessment is made and the empirical
results are presented using a range of graphical analyses.
Comment: Published at http://dx.doi.org/10.1214/10-AOAS394 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
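For orientation, the basic Lee-Carter model is usually written as log m(x,t) = a(x) + b(x) k(t) + error, with the identifying constraints that the b(x) sum to one and the k(t) sum to zero. The sketch below fits this basic model by a rank-one decomposition of the centered log rates, using plain power iteration in place of a library SVD. It is an illustration of the standard formulation only, not the estimation procedure or the Functional Demographic Model used in the paper. The function name fit_lee_carter and the input layout log_m[age][year] are assumptions made for the example.

```python
import math


def fit_lee_carter(log_m, n_iter=200):
    """Fit log m[x][t] ~ a[x] + b[x] * k[t] (basic Lee-Carter, sketch).

    log_m is a list of rows, one per age group, each holding the log death
    rates for consecutive years. Returns (a, b, k) with sum(b) == 1; the k
    values sum to zero because every centered row sums to zero.
    Assumes the leading age pattern does not sum to zero.
    """
    n_age, n_year = len(log_m), len(log_m[0])
    # a[x] is the time average of the log rates at age x.
    a = [sum(row) / n_year for row in log_m]
    z = [[log_m[x][t] - a[x] for t in range(n_year)] for x in range(n_age)]

    # Power iteration for the leading singular pair of z.
    # Start from the centered row with the largest norm.
    v = list(max(z, key=lambda row: sum(c * c for c in row)))
    u = [0.0] * n_age
    for _ in range(n_iter):
        u = [sum(z[x][t] * v[t] for t in range(n_year)) for x in range(n_age)]
        norm_u = math.sqrt(sum(c * c for c in u))
        if norm_u == 0.0:
            # No variation over time: return a flat age pattern and zero index.
            return a, [1.0 / n_age] * n_age, [0.0] * n_year
        u = [c / norm_u for c in u]
        # v is left unnormalized so that u[x] * v[t] approximates z[x][t].
        v = [sum(z[x][t] * u[x] for x in range(n_age)) for t in range(n_year)]

    # Rescale so that the age pattern b sums to one.
    s = sum(u)
    b = [c / s for c in u]
    k = [c * s for c in v]
    return a, b, k
```

Forecasting then amounts to extrapolating the fitted time index k, commonly with a random walk with drift, and plugging the extrapolated values back into the model equation.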
Variable Selection for Generalized Linear Mixed Models by L1-Penalized Estimation
Generalized linear mixed models are a widely used tool for modeling longitudinal data. However, their use is typically restricted to few covariates, because the presence of many predictors yields unstable estimates. The presented approach to fitting generalized linear mixed models includes an L1-penalty term that enforces variable selection and shrinkage simultaneously. A gradient ascent algorithm is proposed that maximizes the penalized log-likelihood, yielding models with reduced complexity. In contrast to common procedures, it can be used in high-dimensional settings where a large number of potentially influential explanatory variables is available. The method is investigated in simulation studies and illustrated using real data sets.
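To make the idea of maximizing an L1-penalized log-likelihood concrete, the sketch below uses proximal gradient ascent (a gradient step followed by soft-thresholding) on a plain logistic regression. This is a deliberately simplified stand-in: it contains no random effects and is not the gradient ascent algorithm proposed in the paper. The names l1_logistic and soft_threshold, the penalty weight lam, and the fixed step size are assumptions made for illustration.

```python
import math


def soft_threshold(z, t):
    """Shrink z toward zero by t and set it to exactly zero if |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def l1_logistic(X, y, lam, step=0.1, n_iter=2000):
    """Maximize mean log-likelihood minus lam * sum(|beta_j|) (sketch).

    The intercept is not penalized. Each iteration takes a gradient ascent
    step on the log-likelihood and then soft-thresholds the slopes, which
    is what produces exact zeros (variable selection) and shrinkage.
    """
    n, p = len(X), len(X[0])
    beta0, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        resid = [
            yi - sigmoid(beta0 + sum(b * x for b, x in zip(beta, row)))
            for row, yi in zip(X, y)
        ]
        grad0 = sum(resid) / n
        grad = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(p)]
        beta0 += step * grad0
        beta = [
            soft_threshold(b + step * g, step * lam) for b, g in zip(beta, grad)
        ]
    return beta0, beta
```

A larger lam removes more covariates from the model. With a sufficiently large value every slope is thresholded to zero and only the intercept remains.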
Spike-and-Slab Priors for Function Selection in Structured Additive Regression Models
Structured additive regression provides a general framework for complex
Gaussian and non-Gaussian regression models, with predictors comprising
arbitrary combinations of nonlinear functions and surfaces, spatial effects,
varying coefficients, random effects and further regression terms. The large
flexibility of structured additive regression makes function selection a
challenging and important task, aiming at (1) selecting the relevant
covariates, (2) choosing an appropriate and parsimonious representation of the
impact of covariates on the predictor and (3) determining the required
interactions. We propose a spike-and-slab prior structure for function
selection that allows single coefficients, as well as blocks of coefficients
representing specific model terms, to be included or excluded. A novel
multiplicative parameter expansion is required to obtain good mixing and
convergence properties in a Markov chain Monte Carlo simulation approach and is
shown to induce desirable shrinkage properties. In simulation studies and with
(real) benchmark classification data, we investigate sensitivity to
hyperparameter settings and compare performance to competitors. The flexibility
and applicability of our approach are demonstrated in an additive piecewise
exponential model with time-varying effects for right-censored survival times
of intensive care patients with sepsis. Geoadditive and additive mixed logit
model applications are discussed in an extensive appendix.
General Semiparametric Shared Frailty Model Estimation and Simulation with frailtySurv
The R package frailtySurv for simulating and fitting semi-parametric shared
frailty models is introduced. Package frailtySurv implements semi-parametric
consistent estimators for a variety of frailty distributions, including gamma,
log-normal, inverse Gaussian and power variance function, and provides
consistent estimators of the standard errors of the parameter estimators. The
parameter estimators are asymptotically normally distributed, and therefore
statistical inference based on the results of this package, such as hypothesis
testing and confidence intervals, can be performed using the normal
distribution. Extensive simulations demonstrate the flexibility and correct
implementation of the estimator. Two case studies performed with publicly
available datasets demonstrate applicability of the package. In the Diabetic
Retinopathy Study, the onset of blindness is clustered by patient, and in a
large hard drive failure dataset, failure times are thought to be clustered by
the hard drive manufacturer and model.
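The data structure behind a shared frailty model can be sketched by simulation: every cluster draws one unobserved frailty that multiplies the hazard of all its members. The Python example below is not the frailtySurv API (frailtySurv is an R package) and makes simplifying assumptions that are not taken from the abstract: a gamma frailty with mean one and variance theta, a constant baseline hazard base_rate, a single standard normal covariate, and no censoring. The function name simulate_shared_gamma_frailty is hypothetical.

```python
import math
import random


def simulate_shared_gamma_frailty(n_clusters, cluster_size, theta, beta,
                                  base_rate=1.0, seed=0):
    """Simulate clustered survival times with a shared gamma frailty (sketch).

    Conditional on the cluster frailty w, a member with covariate x has the
    constant hazard w * base_rate * exp(beta * x), so its survival time is
    exponentially distributed. Returns a list of (cluster, x, time) tuples.
    """
    rng = random.Random(seed)
    rows = []
    for cluster in range(n_clusters):
        # Shape 1/theta and scale theta give mean 1 and variance theta.
        w = rng.gammavariate(1.0 / theta, theta)
        for _ in range(cluster_size):
            x = rng.gauss(0.0, 1.0)
            hazard = w * base_rate * math.exp(beta * x)
            rows.append((cluster, x, rng.expovariate(hazard)))
    return rows
```

Members of the same cluster share w, so their survival times are positively correlated even after conditioning on the covariate. This within-cluster dependence is what the frailty variance theta measures.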
General Design Bayesian Generalized Linear Mixed Models
Linear mixed models are able to handle an extraordinary range of
complications in regression-type analyses. Their most common use is to account
for within-subject correlation in longitudinal data analysis. They are also the
standard vehicle for smoothing spatial count data. However, when treated in
full generality, mixed models can also handle spline-type smoothing and closely
approximate kriging. This allows for nonparametric regression models (e.g.,
additive models and varying coefficient models) to be handled within the mixed
model framework. The key is to allow the random effects design matrix to have
general structure; hence our label general design. For continuous response
data, particularly when Gaussianity of the response is reasonably assumed,
computation is now quite mature and supported by the R, SAS and S-PLUS
packages. Such is not the case for binary and count responses, where
generalized linear mixed models (GLMMs) are required, but are hindered by the
presence of intractable multivariate integrals. Software known to us supports
special cases of the GLMM (e.g., PROC NLMIXED in SAS or glmmML in R) or relies
on the sometimes crude Laplace-type approximation of integrals (e.g., the SAS
macro glimmix or glmmPQL in R). This paper describes the fitting of general
design generalized linear mixed models. A Bayesian approach is taken and Markov
chain Monte Carlo (MCMC) is used for estimation and inference. In this
generalized setting, MCMC requires sampling from nonstandard distributions. In
this article, we demonstrate that the MCMC package WinBUGS facilitates sound
fitting of general design Bayesian generalized linear mixed models in practice.
Comment: Published at http://dx.doi.org/10.1214/088342306000000015 in
Statistical Science (http://www.imstat.org/sts/) by the Institute of
Mathematical Statistics (http://www.imstat.org)