We explore how the big-three computing paradigms -- symmetric multi-processor
(SMC), graphical processing units (GPUs), and cluster computing -- can together
be brought to bare on large-data Gaussian processes (GP) regression problems
via a careful implementation of a newly developed local approximation scheme.
Our methodological contribution focuses primarily on GPU computation, as this
requires the most care and also provides the largest performance boost.
However, in our empirical work we study the relative merits of all three
paradigms to determine how best to combine them. The paper concludes with two
case studies. One is a real data fluid-dynamics computer experiment which
benefits from the local nature of our approximation; the second is a synthetic
data example designed to find the largest design for which (accurate) GP
emulation can performed on a commensurate predictive set under an hour.Comment: 24 pages, 6 figures, 1 tabl