We present Cube-4, a special-purpose volume rendering architecture
that is capable of rendering high-resolution (e.g., 1024^3)
datasets at 30 frames per second. The underlying algorithm, called
slice-parallel ray-casting, uses tri-linear interpolation of samples
between data slices for parallel and perspective projections. The
architecture uses a distributed interleaved memory, several parallel
processing pipelines, and an innovative parallel dataflow scheme
that requires no global communication, except at the pixel level.
This leads to local, fixed bandwidth interconnections and has the
benefits of high memory bandwidth, real-time data input, modularity,
and scalability. We have simulated the architecture and have
implemented a working prototype of the complete hardware on a
configurable custom hardware machine. Our results indicate true
real-time performance for high-resolution datasets and linear scalability
of performance with the number of processing pipelines.
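
As a rough illustration of the tri-linear interpolation step named above, the following software sketch interpolates one sample from its eight surrounding voxels. It is not the Cube-4 hardware datapath: the function name trilinear, the nested-list layout volume[z][y][x], and the clamping at the volume boundary are assumptions made only for this example.

```python
def trilinear(volume, x, y, z):
    """Tri-linearly interpolate a sample at (x, y, z).

    Assumptions for this sketch: `volume` is a nested list indexed as
    volume[z][y][x], and the coordinates are non-negative and lie
    within the volume bounds.
    """
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])

    # Integer corner of the enclosing cell; the far corner is clamped
    # so that samples on the last slice, row, or column stay in range.
    x0, y0, z0 = int(x), int(y), int(z)
    x1, y1, z1 = min(x0 + 1, nx - 1), min(y0 + 1, ny - 1), min(z0 + 1, nz - 1)

    # Fractional offsets inside the cell.
    fx, fy, fz = x - x0, y - y0, z - z0

    # Four linear interpolations along x.
    c00 = volume[z0][y0][x0] * (1 - fx) + volume[z0][y0][x1] * fx
    c01 = volume[z0][y1][x0] * (1 - fx) + volume[z0][y1][x1] * fx
    c10 = volume[z1][y0][x0] * (1 - fx) + volume[z1][y0][x1] * fx
    c11 = volume[z1][y1][x0] * (1 - fx) + volume[z1][y1][x1] * fx

    # Two along y, within each of the two neighboring slices.
    c0 = c00 * (1 - fy) + c01 * fy
    c1 = c10 * (1 - fy) + c11 * fy

    # One along z, between the two slices.
    return c0 * (1 - fz) + c1 * fz
```

The final step blends results from two adjacent data slices, which corresponds to the interpolation of samples between slices mentioned in the abstract; how the hardware pipelines schedule these steps is not shown here.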