This thesis first maps
relational computation onto Graphics Processing Units (GPUs) by designing a
series of tools, and then
explores opportunities to reduce the limitations imposed by the
memory hierarchy across the combined CPU and GPU system.
First, a complete end-to-end compiler and runtime infrastructure, Red Fox, is proposed. An
evaluation on the full set of
industry-standard TPC-H queries on a single-node GPU
shows that Red Fox is on average 11.20x faster than a commercial database system on a
state-of-the-art CPU machine.
Second, a new compiler technique called kernel fusion is designed to fuse the code bodies of several
relational operators to reduce data movement. Third, a multi-predicate join algorithm is
designed for GPUs, which provides much better performance and can be used with
more flexibility than kernel fusion.
Fourth, the GPU-optimized multi-predicate join is integrated into a
multi-threaded CPU database runtime system that supports out-of-core
data sets to solve real-world problems.
This thesis presents key insights, lessons learned, measurements from the
implementations, and opportunities for further improvements.

Ph.D.
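
As a rough illustration of the kernel fusion idea summarized above, the following minimal sketch contrasts two separate SELECT operators, which materialize an intermediate result, with a single fused pass. It is not Red Fox code: plain Python functions stand in for GPU kernels, and the names `select_unfused`, `select_fused`, and the sample rows and predicates are hypothetical, chosen only for this example.

```python
# Illustrative sketch only: plain Python stands in for GPU kernels.
# All names and data below are hypothetical, not taken from Red Fox.

def select_unfused(rows, pred_a, pred_b):
    """Two separate 'kernels': the first materializes an intermediate result."""
    intermediate = [r for r in rows if pred_a(r)]   # kernel 1 writes its output to memory
    return [r for r in intermediate if pred_b(r)]   # kernel 2 reads that output back

def select_fused(rows, pred_a, pred_b):
    """One fused 'kernel': both predicates are applied in a single pass."""
    return [r for r in rows if pred_a(r) and pred_b(r)]

# Sample (key, value) rows and two simple predicates.
rows = [(1, 10), (2, 25), (3, 40), (4, 55)]
is_even_key = lambda r: r[0] % 2 == 0
is_large_value = lambda r: r[1] > 30

unfused = select_unfused(rows, is_even_key, is_large_value)
fused = select_fused(rows, is_even_key, is_large_value)
```

Both versions return the same rows. The fused version never builds the intermediate list, which is the kind of data movement that fusing operator bodies is meant to avoid.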