Achieving high performance on modern CPUs requires efficient utilization of SIMD units. Algorithms must take full advantage of the available SIMD width and must not waste SIMD instructions on cases with low utilization. Ray tracers exploit SIMD extensions through packet tracing, which recasts the ray tracing algorithm into a SIMD framework; however, high SIMD efficiency is achieved only for moderately complex scenes and highly coherent packets. In this paper, we present a stream-programming-oriented traversal algorithm that processes streams of rays in SIMD fashion. The algorithm is motivated by breadth-first ray traversal and implicitly reorders streams of rays on the fly by removing deactivated rays after each traversal step in a stream compaction step. This improves SIMD efficiency in the presence of complex scenes and diverging packets, and the algorithm is designed in particular for potential wider-than-four SIMD architectures with scatter/gather support.
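
The compaction idea described above can be illustrated with a minimal scalar sketch. It is not the paper's SIMD implementation: the function names, the 16-ray stream, the SIMD width of 4, and the hit predicate are all assumptions chosen for illustration. The sketch compares lane utilization when rays stay in fixed packets with utilization after the surviving rays are compacted into a dense stream.

```python
from math import ceil

def compact(ray_ids, active):
    """Stream compaction: keep only rays whose active flag is set, preserving order."""
    return [r for r, a in zip(ray_ids, active) if a]

def packet_utilization(active, width):
    """Lane utilization when rays stay in fixed packets of `width` lanes.

    A packet must still be processed if any of its rays is active.
    """
    packets = [active[i:i + width] for i in range(0, len(active), width)]
    live = [p for p in packets if any(p)]
    if not live:
        return 0.0
    return sum(sum(p) for p in live) / (len(live) * width)

def stream_utilization(num_rays, width):
    """Lane utilization when the surviving rays are packed densely."""
    if num_rays == 0:
        return 0.0
    return num_rays / (ceil(num_rays / width) * width)

# Hypothetical traversal step: assume only rays whose ID is a multiple of 3
# intersect the current node. A real tracer would run a ray/box test here.
stream = list(range(16))
active = [r % 3 == 0 for r in stream]
survivors = compact(stream, active)

print(survivors)
print(packet_utilization(active, 4))
print(stream_utilization(len(survivors), 4))
```

Under these assumptions, six of the sixteen rays survive. Every fixed packet of four still contains at least one active ray, so all four packets must be processed and utilization is 6/16 = 0.375. After compaction the six survivors fit into two SIMD batches, giving 6/8 = 0.75.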