Markov Chain Monte Carlo (MCMC) is a well-established family of algorithms
primarily used in Bayesian statistics to sample from a target distribution when
direct sampling is challenging. Existing work on Bayesian decision trees uses
MCMC. Unfortunately, this can be slow, especially on large volumes of data,
and the accept-reject component of MCMC is hard to parallelise.
Nonetheless, we propose two methods for exploiting parallelism in
MCMC: in the first, we replace MCMC with another numerical Bayesian
approach, the Sequential Monte Carlo (SMC) sampler, which has the appealing
property that it is an inherently parallel algorithm; in the second, we
consider data partitioning. Both methods use multi-core processing with a
High-Performance Computing (HPC) resource. We test the two methods in various
study settings to determine which method is more beneficial for each test
case. Experiments show that data partitioning has limited utility in the
settings we consider and that the use of the SMC sampler can improve run-time
(compared to the sequential implementation) by up to a factor of 343.