A note on Mahalanobis and related distance measures in WinISI and The Unscrambler

Abstract

In identifying spectral outliers in near infrared calibration it is common to use a distance measure that is related to Mahalanobis distance. However, different software packages tend to use different variants, which leads to a translation problem if more than one package is used. Here the relationships between squared Mahalanobis distance D², the GH distance of WinISI, and the T² and leverage (L) statistics of The Unscrambler are established as D² = T² ≈ L × n ≈ GH × k, where n and k are the numbers of samples and variables, respectively, in the set of spectral data used to establish the distance measure. The implications for setting thresholds for outlier detection are discussed. On the way to this result the principal component scores from WinISI and The Unscrambler are compared. Both packages scale the scores for a component to have variances proportional to the contribution of that component to total variance, but the WinISI scores, unlike those from The Unscrambler, do not have mean zero.
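The stated relationships can be illustrated numerically. The following is a minimal sketch in Python with NumPy, not code from either package: the function name `squared_mahalanobis`, the simulated data, and the choice of the sample covariance (divisor n − 1) are all assumptions made for illustration. GH and the leverage are obtained here simply by rescaling D² according to the relations in the abstract, so the leverage is only the approximate form.

```python
import numpy as np


def squared_mahalanobis(X):
    """Squared Mahalanobis distance of each row of X from the column means.

    Hypothetical helper: uses the sample covariance (divisor n - 1),
    which is an assumption, not a documented choice of either package.
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    cov = np.atleast_2d(np.cov(centered, rowvar=False))
    cov_inv = np.linalg.inv(cov)
    # Row-wise quadratic form c_i^T S^{-1} c_i
    return np.einsum("ij,jk,ik->i", centered, cov_inv, centered)


# Simulated stand-in for n samples described by k variables (e.g. PC scores)
rng = np.random.default_rng(0)
n, k = 200, 5
X = rng.normal(size=(n, k))

d2 = squared_mahalanobis(X)   # D^2, taken equal to T^2 in the abstract
gh = d2 / k                   # from D^2 ~ GH * k
leverage_approx = d2 / n      # from D^2 ~ L * n (approximate form only)

# With this covariance the D^2 values sum to exactly (n - 1) * k,
# so GH averages (n - 1) / n, i.e. close to 1 for large n.
print(d2.sum(), (n - 1) * k)
print(gh.mean())
```

Because the three statistics differ only by the factors n and k, a threshold chosen on one scale translates directly to the others under these assumptions: a hypothetical GH cutoff of 3, for example, corresponds to D² = 3k and to a leverage of roughly 3k/n.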
