Fast sorting for exact OIT of complex scenes

Abstract

Exact order-independent transparency (OIT) techniques capture all fragments during rasterization. The fragments are then sorted per pixel by depth and composited in order using alpha transparency. The sorting stage is a bottleneck for scenes with high depth complexity, taking 70-95% of the total time for the scenes investigated. In this paper, we show that the speed of typical shader-based sorting is limited by local memory latency and occupancy. We present and discuss the use of both registers and an external merge sort in a register-based block sort to make better use of the GPU memory hierarchy and improve OIT rendering performance. This approach builds upon backwards memory allocation, achieving an OIT rendering speed up to 1.7× that of the best previous method and 6.3× that of the common straightforward OIT implementation. In some cases, the sorting stage is reduced so far that it is no longer the dominant OIT component.
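The pipeline described in the abstract (capture, per-pixel depth sort, in-order compositing) can be illustrated on the CPU for a single pixel. The following is a minimal sketch, not the authors' shader implementation: the function names, the block size of 8, the choice of insertion sort within a block, and the use of `heapq.merge` as a stand-in for the merge stage are all assumptions made for illustration. Fragments are assumed to be `(depth, (r, g, b, a))` tuples with smaller depth meaning nearer to the camera.

```python
import heapq


def insertion_sort(block):
    """Sort a small block of (depth, rgba) fragments by depth, in place."""
    for i in range(1, len(block)):
        item = block[i]
        j = i - 1
        # Shift deeper fragments right until the slot for `item` is found.
        while j >= 0 and block[j][0] > item[0]:
            block[j + 1] = block[j]
            j -= 1
        block[j + 1] = item
    return block


def block_merge_sort(fragments, block_size=8):
    """Sort fixed-size blocks independently, then merge the sorted blocks.

    Hypothetical CPU analogue of a block sort followed by a merge; the
    block size stands in for what might fit in registers on a GPU.
    """
    blocks = [
        insertion_sort(list(fragments[i:i + block_size]))
        for i in range(0, len(fragments), block_size)
    ]
    # k-way merge of the already sorted blocks, nearest fragment first.
    return list(heapq.merge(*blocks, key=lambda f: f[0]))


def composite(fragments, background=(0.0, 0.0, 0.0)):
    """Blend one pixel's fragments back to front with alpha blending."""
    r, g, b = background
    # Sorted order is nearest first, so iterate in reverse (farthest first).
    for _, (fr, fg, fb, fa) in reversed(block_merge_sort(fragments)):
        r = fa * fr + (1.0 - fa) * r
        g = fa * fg + (1.0 - fa) * g
        b = fa * fb + (1.0 - fa) * b
    return (r, g, b)


if __name__ == "__main__":
    # Two half-transparent fragments: red behind blue, submitted out of order.
    pixel = [(0.5, (1.0, 0.0, 0.0, 0.5)), (0.2, (0.0, 0.0, 1.0, 0.5))]
    print(composite(pixel))
```

Because the fragments are sorted before blending, the result does not depend on the order in which they were captured, which is the property that makes the technique order independent.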

Last updated on 04/05/2016

This paper was published in Research Repository RMIT University.