Clustering Covariates Regression

Abstract

Linear regression is a widely applied technique in many research fields. Its aim is to predict one or more dependent variables on the basis of a number of independent variables. However, when analyzing data sets with very many independent variables, some of which are highly correlated, one may face the bouncing beta problem: regression weights obtained for such data sets tend to be unstable, in that small changes in the data can lead to completely different regression weights. Many solutions to the bouncing beta problem have already been suggested. Roughly, two types can be distinguished: variable selection methods (e.g. OSCAR and the Lasso; Bondell & Reich, 2008; Tibshirani, 1996) and dimension reduction methods (e.g. principal component regression and principal covariates regression; Kiers & Smilde, 2007). However, the solutions obtained by these methods are not always straightforward to interpret. As a possible alternative, we therefore propose the Clustering Covariates Regression method (CCovR). This method simultaneously partitions the independent variables into a few predictor types and regresses the dependent variable(s) on these types. In this talk, we first introduce the CCovR method. Next, we compare CCovR with some variable selection and dimension reduction methods by applying them to the same data set.
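The bouncing beta problem described above can be illustrated with a small numerical sketch. The example below is not part of the CCovR method; it only fits ordinary least squares with NumPy to a toy data set whose two predictors are almost identical. The data values and the helper name `fit_weights` are assumptions made for illustration.

```python
import numpy as np


def fit_weights(X, y):
    """Ordinary least-squares regression weights (no intercept)."""
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    return weights


# Two highly correlated predictors: they differ only in the last observation.
x1 = np.array([1.0, 2.0, 3.0, 4.0])
x2 = np.array([1.0, 2.0, 3.0, 4.01])
X = np.column_stack([x1, x2])

# Original outcome, and a copy in which one value is changed by 0.02.
y_before = np.array([2.0, 4.0, 6.0, 8.0])
y_after = np.array([2.0, 4.0, 6.0, 8.02])

w_before = fit_weights(X, y_before)  # approximately [2, 0]
w_after = fit_weights(X, y_after)    # approximately [0, 2]

print("weights before:", np.round(w_before, 4))
print("weights after: ", np.round(w_after, 4))
```

A change of 0.02 in a single outcome value moves the full weight from the first predictor to the second, while the sum of the two weights stays at 2. This is the kind of instability that motivates treating strongly correlated predictors as one group.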
