In this thesis, I apply techniques from the field of computer graphics to ray tracing in
astrophysical simulations, and introduce the grace software library. This is combined
with an extant radiative transfer solver to produce a new package, taranis. It allows
for fully-parallel particle updates via per-particle accumulation of rates, followed by a
forward Euler integration step, and is manifestly photon-conserving. To my knowledge,
taranis is the first ray-traced radiative transfer code to run on graphics processing
units and target cosmological-scale smoothed particle hydrodynamics (SPH) datasets.
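
The accumulate-then-integrate scheme described above can be illustrated with a minimal sketch. It is not the taranis implementation: the function names, the per-particle optical depth `tau`, and the omission of recombinations are all assumptions made for illustration. Each ray deposits the photons it loses into the particles it crosses, so the deposited and escaping photons sum to what was emitted, and every particle is then advanced independently with one forward Euler step.

```python
import numpy as np

def accumulate_rates(ray_hits, ray_photon_rates, tau, n_particles):
    """Accumulate per-particle photon absorption rates over all rays.

    ray_hits         : list of index arrays, particles ordered along each ray
    ray_photon_rates : photons per unit time carried by each ray
    tau              : hypothetical per-particle optical depth, held fixed
                       over the step (an assumption of this sketch)
    """
    rates = np.zeros(n_particles)
    escaped = 0.0
    for hits, n_dot in zip(ray_hits, ray_photon_rates):
        remaining = n_dot
        for i in hits:
            # Photons removed from the ray are credited to the particle,
            # so nothing is lost or double-counted.
            absorbed = remaining * (1.0 - np.exp(-tau[i]))
            rates[i] += absorbed
            remaining -= absorbed
        escaped += remaining
    return rates, escaped

def euler_step(x_ion, n_atoms, rates, dt):
    """Forward Euler update of the ionized fraction of every particle.

    Recombinations are deliberately omitted here; n_atoms is the
    (hypothetical) number of absorbers represented by each particle.
    """
    return x_ion + dt * rates / n_atoms
```

Because each particle's update depends only on its own accumulated rate, the Euler step is independent per particle; on a GPU the accumulation loop would presumably use atomic additions, which this serial sketch does not attempt to show.
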
A significant optimization effort is undertaken in developing grace. Contrary to
typical results in computer graphics, it is found that the bounding volume hierarchies
(BVHs) used to accelerate the ray tracing procedure need not be of high quality; as a
result, extremely fast BVH construction times are possible (< 0.02 microseconds per
particle in an SPH dataset). I show that this exceeds the performance researchers might
expect from CPU codes by at least an order of magnitude, and compares favourably
to a state-of-the-art ray tracing solution. Similar results are found for the ray tracing
itself, where again techniques from computer graphics are examined for effectiveness
with SPH datasets, and new optimizations proposed. For high per-source ray counts
(≳ 10⁴), grace can reduce ray tracing run times by up to two orders of magnitude
compared to extant CPU solutions developed within the astrophysics community, and
by a factor of a few compared to a state-of-the-art solution.
taranis is shown to produce expected results in a suite of de facto cosmological
radiative transfer test cases. For some cases, it currently outperforms a serial, CPU-based
alternative by a factor of a few. Unfortunately, for the most realistic test its
performance is extremely poor, making the current taranis code unsuitable for cosmological
radiative transfer. The primary reason for this failing is found to be a small
minority of particles which always dominate the timestep criteria. Several plausible
routes to mitigate this problem, while retaining parallelism, are put forward.
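
To see why a few particles can control the cost of the whole calculation, consider a sketch in which a single global timestep is the minimum of per-particle limits. The criterion used here (limiting the fractional change of the ionized fraction to `eps` per step) and the names `global_timestep`, `eps` and `floor` are assumptions for illustration only, not the criterion used by taranis.

```python
import numpy as np

def global_timestep(x_ion, dxdt, eps=0.1, floor=1e-12):
    """Return the global timestep and the per-particle limits.

    Hypothetical criterion: each particle may change its ionized
    fraction by at most a fraction eps per step. The global step is
    the smallest per-particle limit, so one fast particle sets the
    pace for all of them.
    """
    dt_i = eps * np.maximum(x_ion, floor) / np.maximum(np.abs(dxdt), floor)
    return dt_i.min(), dt_i
```

With this criterion, a handful of particles whose rates are a million times higher than the rest reduce the global step by the same factor, even though almost every particle could safely take the larger step.
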