We consider the problem of learning linear prediction models with model
misspecification bias. In such cases, collinearity among input variables may
inflate the error of parameter estimation, resulting in instability of
prediction results when training and test distributions do not match. In this
paper we theoretically analyze this fundamental problem and propose a sample
reweighting method that reduces collinearity among input variables. Our method
can be seen as a data pretreatment that improves the conditioning of the design
matrix, and it can then be combined with any standard learning method for
parameter estimation and variable selection. Empirical studies on both
simulated and real datasets demonstrate the effectiveness of our method in
terms of more stable performance across differently distributed data.

Comment: Accepted as poster paper at AAAI202