Compositional loess modeling

Abstract

Cleveland (1979) is usually credited with the introduction of locally weighted regression, Loess. The concept was further developed by Cleveland and Devlin (1988). The general idea is that, for an arbitrary number of explanatory data points x_i, the value of a dependent variable is estimated as ŷ_i. Here ŷ_i is the fitted value from a dth-degree polynomial in x_i. (In practice, often d = 1.) The estimate ŷ_i is fitted using weighted least squares, WLS, where the points x_k (k = 1, ..., n) closest to x_i are given the largest weights. We define a weighted least squares estimation for compositional data, C-WLS. In WLS, the sum of the weighted squared Euclidean distances between the observed and the estimated values is minimized. In C-WLS, we minimize the weighted sum of the squared simplicial distances (Aitchison, 1986, p. 193) between the observed compositions and their estimates. We then define a compositional locally weighted regression, C-Loess. Here a composition is assumed to be explained by a real-valued (multivariate) variable. For an arbitrary number of data points x_i, we fit, for each x_i, a dth-degree polynomial in x_i, yielding an estimate ŷ_i of the composition y_i. We use C-WLS to fit the polynomial, giving the largest weights to the points x_k (k = 1, ..., n) closest to x_i. Finally, C-Loess is applied to Swedish opinion poll data to create a poll-of-polls time series. The results are compared to previous results that do not acknowledge the compositional structure of the data.
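The procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions that the abstract does not state: a scalar explanatory variable, tricube weights over a nearest-neighbour window, and the standard result that the simplicial (Aitchison) distance between two compositions equals the Euclidean distance between their centered log-ratio (clr) vectors, so that C-WLS can be carried out as ordinary WLS on clr coordinates and transformed back. The names `clr`, `clr_inv`, `c_loess`, `span` and `degree` are hypothetical.

```python
import numpy as np

def clr(y):
    """Centered log-ratio transform; one composition per row."""
    logy = np.log(y)
    return logy - logy.mean(axis=1, keepdims=True)

def clr_inv(z):
    """Inverse clr: exponentiate and rescale each row to sum to 1."""
    e = np.exp(z - z.max(axis=1, keepdims=True))  # shift for numerical stability
    return e / e.sum(axis=1, keepdims=True)

def c_loess(x, y, x_eval, span=0.5, degree=1):
    """Sketch of a compositional loess (assumed: scalar x, tricube weights).

    x      : (n,) explanatory values
    y      : (n, D) strictly positive compositions
    x_eval : points at which the composition is estimated
    span   : fraction of the n points used in each local fit
    degree : degree d of the local polynomial
    """
    x = np.asarray(x, dtype=float)
    z = clr(np.asarray(y, dtype=float))
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    n = len(x)
    # Window size: at least degree + 2 points, at most all n points.
    q = min(n, max(degree + 2, int(np.ceil(span * n))))
    fitted = np.empty((len(x_eval), z.shape[1]))
    for i, x0 in enumerate(x_eval):
        dist = np.abs(x - x0)
        h = max(np.sort(dist)[q - 1], np.finfo(float).tiny)  # local bandwidth
        u = np.clip(dist / h, 0.0, 1.0)
        w = (1.0 - u**3) ** 3  # tricube: closest points get the largest weights
        # Polynomial basis centred at x0, so the intercept is the fit at x0.
        X = np.vander(x - x0, degree + 1, increasing=True)
        sw = np.sqrt(w)[:, None]
        # Weighted least squares on clr coordinates, all components at once.
        beta, *_ = np.linalg.lstsq(X * sw, z * sw, rcond=None)
        fitted[i] = beta[0]
    return clr_inv(fitted)

# Usage with made-up three-part shares (illustrative numbers only).
t = np.linspace(0.0, 1.0, 30)
shares = clr_inv(np.column_stack([0.5 + t, -0.2 - 2.0 * t, -0.3 + t]))
smoothed = c_loess(t, shares, t, span=0.4, degree=1)
```

Because the weighted fit is linear in the clr coordinates, the fitted clr vectors still sum to zero across components, and the back-transformed estimates are strictly positive and sum to one, which a component-wise Euclidean loess does not guarantee.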
