Most approaches in forecasting merely try to predict the next value of the time series.
In contrast, this paper presents a framework to predict the full probability distribution. It
is expressed as a mixture model: the dynamics of the individual states is modeled with so-called
"experts" (potentially nonlinear neural networks), and the dynamics between the states is modeled
using a hidden Markov approach. The full density predictions are obtained by a weighted superposition
of the individual densities of each expert. This model class is called "hidden Markov experts".
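
The weighted superposition described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes two hidden states, Gaussian experts with fixed means and standard deviations (the paper allows nonlinear neural-network experts that depend on inputs), and a hand-picked transition matrix. All function names and numeric values are hypothetical and chosen only for illustration.

```python
import math

def gaussian_pdf(y, mean, std):
    """Density of a normal distribution at y."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def predict_state_weights(filtered, transition):
    """One-step-ahead state probabilities: previous filtered probabilities
    propagated through the Markov transition matrix (rows sum to 1)."""
    n = len(filtered)
    return [sum(filtered[i] * transition[i][j] for i in range(n))
            for j in range(n)]

def mixture_density(y, weights, means, stds):
    """Predictive density at y: weighted sum of the experts' densities."""
    return sum(w * gaussian_pdf(y, m, s)
               for w, m, s in zip(weights, means, stds))

def update_filtered(y, weights, means, stds):
    """Bayes update of the state probabilities after observing y."""
    joint = [w * gaussian_pdf(y, m, s)
             for w, m, s in zip(weights, means, stds)]
    total = sum(joint)
    return [j / total for j in joint]

# Hypothetical values: a calm state (index 0) and a volatile state (index 1).
transition = [[0.95, 0.05],
              [0.10, 0.90]]
means = [0.0005, -0.001]   # assumed expert means (daily returns)
stds = [0.007, 0.02]       # assumed expert standard deviations
filtered = [0.5, 0.5]      # assumed state probabilities at time t-1

weights = predict_state_weights(filtered, transition)
density_at_zero = mixture_density(0.0, weights, means, stds)

# A large negative return shifts the probability mass to the volatile state.
posterior = update_filtered(-0.04, weights, means, stds)
```

Because the weights are computed recursively from the previous state probabilities, a state tends to persist once it has been entered. A gated-experts weighting would instead be recomputed from external inputs at every step.
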
Results are presented for daily S&P500 data. While the predictive accuracy of the mean does
not improve over simpler models, evaluating the prediction of the full density shows a clear out-of-sample
improvement both over a simple GARCH(1,1) model (which assumes Gaussian-distributed
returns) and over a "gated experts" model (which expresses the weighting for each state non-recursively
as a function of external inputs). Several interpretations are given: the blending of
supervised and unsupervised learning, the discovery of hidden states, the combination of forecasts,
the specialization of experts, the removal of outliers, and the persistence of volatility.

Information Systems Working Papers Serie