    Architectural Support for 3D Graphics in the Complex Streamed Instruction Set

    In this paper we extend the previously proposed Complex Streamed Instruction (CSI) Set architecture with floating-point and conditional operations in order to efficiently support 3D graphics applications. The extended CSI architecture is evaluated using an industry-standard 3D benchmark and compared to Intel's Streaming SIMD Extensions (SSE). Compared to a 4-way issue superscalar processor extended with SSE units capable of performing 8 single-precision floating-point operations in parallel, the same processor extended with a CSI unit that can perform the same number of floating-point operations in parallel attains speedups of 2.8 and 2.13 on the transform and lighting kernels, respectively, which translate to an overall speedup of 1.61 on the 3D geometry computations. We also study how performance scales with the number of floating-point units and observe that the CSI extension utilizes them more efficiently than SSE. Finally, the performance bottlenecks of the SSE-enhanced superscalar CPUs on the 3D graphics workload are identified. The results show that the performance of the 4-way issue machines is limited by the issue width, while that of the 8-way machines is limited by the number of cache ports.

    Keywords: multimedia, 3D graphics, processor architecture, vector processing