Bayesian classification and regression with high-order interactions is
largely infeasible because Markov chain Monte Carlo (MCMC) must then be
applied to a very large number of parameters, which grows rapidly with the
order. In this paper we show how to make it feasible by effectively reducing
the number of parameters, exploiting the fact that many interactions have the
same values for all training cases. Our method uses a single ``compressed''
parameter to represent the sum of all parameters associated with a set of
patterns that have the same value for all training cases. Using symmetric
stable distributions as the priors of the original parameters, we can easily
find the priors of these compressed parameters, since a sum of independent
symmetric stable variables is itself symmetric stable. We therefore need to deal only
with a much smaller number of compressed parameters when training the model
with MCMC. The number of compressed parameters may stop growing before the
highest possible order is considered. After training the model, we can split
these compressed parameters into the original ones as needed to make
predictions for test cases. We show in detail how to compress parameters for
logistic sequence prediction models. Experiments on both simulated and real
data demonstrate that the number of parameters can indeed be reduced
enormously by our
compression method.
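
As a rough illustration of the compression idea, not the paper's actual implementation, the Python sketch below groups interaction patterns by their indicator vectors over the training cases and represents each group by a single compressed parameter. Cauchy priors serve as the symmetric stable example (alpha = 1), for which prior scales of independent summands simply add; the pattern encoding and all names are illustrative assumptions.

    from collections import defaultdict

    # Hypothetical encoding (not from the paper): a pattern is a tuple of
    # (position, value) pairs; a training case is a tuple of observed values.
    def indicator(pattern, cases):
        """0/1 vector saying which training cases express the pattern."""
        return tuple(int(all(case[pos] == val for pos, val in pattern))
                     for case in cases)

    def compress(patterns, cases, scale=1.0):
        """Group patterns whose indicators agree on every training case,
        representing each group by one compressed parameter.  With
        independent Cauchy(0, scale) priors on the originals (Cauchy is
        symmetric stable with alpha = 1), the sum of k of them is
        Cauchy(0, k * scale), so the compressed prior has closed form."""
        groups = defaultdict(list)
        for p in patterns:
            groups[indicator(p, cases)].append(p)
        return [{"members": ms, "prior_scale": scale * len(ms)}
                for ms in groups.values()]

    # Toy data: three binary sequences of length 2, with order-1 and
    # order-2 patterns.  The two patterns expressed by exactly the same
    # cases collapse into one compressed parameter.
    cases = [(0, 1), (1, 1), (0, 1)]
    patterns = [((0, 0),), ((1, 1),), ((0, 0), (1, 1))]
    for g in compress(patterns, cases):
        print(len(g["members"]),
              "pattern(s) -> Cauchy(0, %.1f)" % g["prior_scale"])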