Robust Modeling Using Non-Elliptically Contoured Multivariate t Distributions
Models based on multivariate t distributions are widely applied to analyze
data with heavy tails. However, all the marginal distributions of the
multivariate t distributions are restricted to have the same degrees of
freedom, making these models unable to describe different marginal
heavy-tailedness. We generalize the traditional multivariate t distributions to
non-elliptically contoured multivariate t distributions, allowing for different
marginal degrees of freedom. We apply the non-elliptically contoured
multivariate t distributions to three widely-used models: the Heckman selection
model with different degrees of freedom for selection and outcome equations,
the multivariate Robit model with different degrees of freedom for marginal
responses, and the linear mixed-effects model with different degrees of freedom
for random effects and within-subject errors. Based on the Normal mixture
representation of our t distribution, we propose efficient Bayesian inferential
procedures for the model parameters based on data augmentation and parameter
expansion. We show via simulation studies and real examples that the
conclusions are sensitive to the existence of different marginal
heavy-tailedness.
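The componentwise normal mixture construction described in the abstract can be sketched in a few lines. The following is a minimal illustration under our own assumptions (function names and the shared-normal construction are ours, not necessarily the paper's exact formulation): each coordinate of a correlated normal vector is divided by the square root of its own chi-square mixing variable, so each marginal receives its own degrees of freedom.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_ncmt(n, mu, sigma, dfs):
    """Draw n samples whose j-th marginal is t with dfs[j] degrees of
    freedom, via a componentwise normal scale mixture.
    mu: location vector; sigma: scale matrix of the underlying normal;
    dfs: per-coordinate degrees of freedom."""
    p = len(mu)
    z = rng.multivariate_normal(np.zeros(p), sigma, size=n)  # shared correlated normal
    w = rng.chisquare(dfs, size=(n, p)) / dfs                # one mixing variable per coordinate
    return mu + z / np.sqrt(w)

x = sample_ncmt(100_000, np.zeros(2), np.array([[1.0, 0.5], [0.5, 1.0]]),
                np.array([3.0, 30.0]))
# The first marginal (df = 3) has far heavier tails than the second (df = 30):
print(np.mean(np.abs(x[:, 0]) > 4), np.mean(np.abs(x[:, 1]) > 4))
```

Setting all entries of `dfs` equal recovers (up to the shared mixing variable) the classical elliptical multivariate t as a special case.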
S-estimation and a robust conditional Akaike information criterion for linear mixed models.
We study estimation and model selection of both the fixed and the random effects in the setting of linear mixed models using outlier-robust S-estimators. Robustness on the level of the random effects as well as of the error terms is taken into account. The derived marginal and conditional information criteria are in the style of Akaike's information criterion but avoid a fully specified likelihood through a suitable S-estimation approach that minimizes a scale function. We derive the appropriate penalty terms and provide an implementation in R. The setting of semiparametric additive models fitted with penalized regression splines, in a mixed-model formulation, arises as a specific application. Simulated data examples illustrate the effectiveness of the proposed criteria.
Keywords: Akaike information criterion; conditional likelihood; effective degrees of freedom; mixed model; penalized regression spline; S-estimation
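The scale function minimized by an S-estimator can be illustrated in miniature. The sketch below is our own simplified example, not the paper's mixed-model estimator: it computes an S-scale of a residual vector with Tukey's biweight rho, solving mean(rho(r/s)) = b by the standard fixed-point iteration.

```python
import numpy as np

def rho_biweight(u, c=1.547):
    """Tukey biweight rho, rescaled to a maximum of 1.
    c = 1.547 gives a 50% breakdown point."""
    v = np.minimum(np.abs(u) / c, 1.0)
    return 1.0 - (1.0 - v**2) ** 3

def s_scale(resid, b=0.5, c=1.547, tol=1e-8, max_iter=100):
    """S-scale of residuals: the s solving mean(rho(r/s)) = b,
    found by fixed-point iteration from a robust starting value."""
    s = np.median(np.abs(resid)) / 0.6745  # normalized MAD as start
    for _ in range(max_iter):
        s_new = s * np.sqrt(np.mean(rho_biweight(resid / s, c)) / b)
        if abs(s_new - s) < tol * s:
            return s_new
        s = s_new
    return s

rng = np.random.default_rng(1)
r = rng.normal(0, 2, 1000)
r[:50] = 100.0          # 5% gross outliers
print(s_scale(r))       # stays close to the true scale 2 despite the outliers
```

A classical standard deviation of `r` would be inflated to roughly 22 by the contamination; the bounded rho caps each outlier's contribution, which is the robustness property the criteria above exploit.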
A Framework for Unbiased Model Selection Based on Boosting
Variable selection and model choice are of major concern in many statistical applications, especially in high-dimensional regression models. Boosting is a convenient statistical method that combines model fitting with intrinsic model selection.
We investigate the impact of base-learner specification on the performance of boosting as a model selection procedure.
We show that variable selection may be biased if the covariates are of different nature.
Important examples are models combining continuous and categorical covariates, especially if the number of categories is large. In this case, least squares base-learners offer increased flexibility for the categorical covariate and lead to its preferential selection even if it is non-informative.
Similar difficulties arise when comparing linear and nonlinear base-learners for a continuous covariate. The additional flexibility of the nonlinear base-learner again yields a preference for the more complex modeling alternative.
We investigate these problems from a theoretical perspective and suggest a framework for unbiased model selection based on a general class of penalized least squares base-learners.
Making all base-learners comparable in terms of their degrees of freedom strongly reduces the selection bias observed in naive boosting specifications. The importance of unbiased model selection is demonstrated in simulations and in an application to forest health models.
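Making base-learners comparable in degrees of freedom amounts to choosing each penalty so that the trace of the base-learner's hat matrix hits a common target. A minimal sketch under our own assumptions (function names are hypothetical; the paper's penalized least squares base-learners use specific penalty matrices):

```python
import numpy as np

def effective_df(X, lam, P=None):
    """Effective degrees of freedom of a penalized least squares
    base-learner: trace of the hat matrix X (X'X + lam * P)^{-1} X'."""
    if P is None:
        P = np.eye(X.shape[1])  # ridge penalty as a simple default
    H = X @ np.linalg.solve(X.T @ X + lam * P, X.T)
    return np.trace(H)

def lambda_for_df(X, target_df, P=None, lo=1e-8, hi=1e8):
    """Log-scale bisection for the penalty lam giving the target df
    (df is monotonically decreasing in lam)."""
    for _ in range(200):
        mid = np.sqrt(lo * hi)
        if effective_df(X, mid, P) > target_df:
            lo = mid
        else:
            hi = mid
    return np.sqrt(lo * hi)

rng = np.random.default_rng(2)
# Dummy-coded categorical covariate with 10 levels: unpenalized it spends 10 df.
X_cat = np.eye(10)[rng.integers(0, 10, 200)]
lam = lambda_for_df(X_cat, target_df=1.0)
print(effective_df(X_cat, lam))  # ~1, now comparable to a one-df linear base-learner
```

With every base-learner calibrated to the same effective degrees of freedom, boosting's componentwise selection no longer favors the more flexible (e.g. many-category) covariates by construction.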
Penalized Likelihood and Bayesian Function Selection in Regression Models
Challenging research in various fields has driven a wide range of
methodological advances in variable selection for regression models with
high-dimensional predictors. In comparison, selection of nonlinear functions in
models with additive predictors has been considered only more recently. Several
competing suggestions have been developed at about the same time and often do
not refer to each other. This article provides a state-of-the-art review on
function selection, focusing on penalized likelihood and Bayesian concepts,
relating various approaches to each other in a unified framework. In an
empirical comparison, also including boosting, we evaluate several methods
through applications to simulated and real data, thereby providing some
guidance on their performance in practice.
Variable Selection and Model Choice in Geoadditive Regression Models
Model choice and variable selection are issues of major concern in practical regression analyses. We propose a boosting procedure that facilitates both tasks in a class of complex geoadditive regression models comprising spatial effects, nonparametric effects of continuous covariates, interaction surfaces, random effects, and varying coefficient terms. The major modelling components are penalized splines and their bivariate tensor product extensions. All smooth model terms are represented as the sum of a parametric component and a remaining smooth component with one degree of freedom, to obtain a fair comparison between all model terms. A generic representation of the geoadditive model allows us to devise a general boosting algorithm that implements automatic model choice and variable selection. We demonstrate the versatility of our approach with two examples: a geoadditive Poisson regression model for species counts in habitat suitability analyses and a geoadditive logit model for the analysis of forest health.