The detection of influential subsets in linear regression using an influence matrix.

Abstract

This paper presents a new method for identifying influential subsets in linear regression problems. The procedure uses the eigenstructure of an influence matrix, defined as the matrix of uncentered covariances of the effect on the whole data set of deleting each observation, normalized so that its diagonal contains the univariate Cook's statistics. It is shown that points in an influential subset will appear with large weight in at least one of the eigenvectors linked to the largest eigenvalues of this influence matrix. The method is illustrated with several well-known examples from the literature, and in all of them it succeeds in identifying the relevant influential subsets.

Keywords: Eigenvectors; Masking; Multivariate Influence; Outliers
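As a rough illustration of the construction described in the abstract, the following Python sketch builds one matrix that fits that description. It is a reconstruction from the abstract's wording, not the paper's own code. It assumes ordinary least squares with a full-rank design matrix `X` that already contains the intercept column. The function names `influence_matrix` and `top_eigenpairs` are hypothetical. The effect of deleting observation i is taken to be the change in the vector of fitted values, and the normalization by p times the residual variance is chosen so that the diagonal equals Cook's distance.

```python
import numpy as np


def influence_matrix(X, y):
    """Sketch of an influence matrix for OLS (assumed form, see lead-in).

    X : (n, p) design matrix, intercept column included, full column rank.
    y : (n,) response vector.
    Returns an (n, n) symmetric matrix whose diagonal is Cook's distance.
    """
    n, p = X.shape
    H = X @ np.linalg.inv(X.T @ X) @ X.T   # hat matrix
    h = np.diag(H)                         # leverages h_ii
    e = y - H @ y                          # OLS residuals
    s2 = e @ e / (n - p)                   # residual variance estimate

    # Column j of T is yhat - yhat_(j), the change in all fitted values
    # when observation j is deleted: H[:, j] * e_j / (1 - h_jj).
    T = H * (e / (1.0 - h))

    # Uncentered cross-products of the deletion effects, scaled by p * s2.
    return T.T @ T / (p * s2)


def top_eigenpairs(M, k):
    """Return the k largest eigenvalues of M and their eigenvectors."""
    vals, vecs = np.linalg.eigh(M)         # eigh returns ascending order
    order = np.argsort(vals)[::-1][:k]
    return vals[order], vecs[:, order]


if __name__ == "__main__":
    # Synthetic data, for illustration only.
    rng = np.random.default_rng(0)
    X = np.column_stack([np.ones(30), rng.normal(size=(30, 2))])
    y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=30)

    M = influence_matrix(X, y)
    vals, vecs = top_eigenpairs(M, X.shape[1])
    # Observations with large absolute weight in a leading eigenvector
    # are candidates for membership in an influential subset.
    for k in range(vecs.shape[1]):
        print(k, np.argsort(np.abs(vecs[:, k]))[::-1][:3])
```

Under these assumptions the matrix has rank at most p, so only the eigenvectors for the p largest eigenvalues carry information. Inspecting the largest absolute components of each of them is one simple way to shortlist candidate subsets.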
