Integer performance evaluation of the dynamically trace scheduled VLIW

Author: De Souza A.F.
Publication date: 01/01/1999

SIGLE. Available from British Library Document Supply Centre-DSC: DXN035447 / BLDSC - British Library Document Supply Centre. GB, United Kingdom

OpenGrey Repository

Integer performance evaluation of the dynamically trace scheduled VLIW architecture

Author: De Souza Alberto Ferreira
Publication venue: UCL (University College London)
Publication date: 01/01/1999

Very long instruction word (VLIW) machines potentially provide the most direct way to exploit Instruction-Level Parallelism (ILP), but cannot be used to emulate current general-purpose instruction set architectures. In addition, programs scheduled for a particular implementation of a VLIW model cannot be guaranteed to be binary compatible with other implementations of the same model that have either a different number of functional units or functional units with different latencies. This problem is known as the VLIW object code compatibility problem. The Dynamic Instruction Formatting (DIF) concept, however, can be used to implement machines that execute code in a VLIW fashion and that are capable of overcoming the VLIW object code compatibility problem. A DIF machine schedules instructions into blocks of VLIW instructions while executing them on a simple engine and caches these blocks for repeated execution on a VLIW engine. This thesis presents an architecture, named Dynamically Trace Scheduled VLIW (DTSVLIW), which follows the DIF concept. The DTSVLIW architecture was conceived independently of DIF and its implementation is significantly different from the DIF implementation suggested by the proponents of DIF. A DTSVLIW machine differs in the instruction-scheduling algorithm, register renaming mechanism, register access mechanism, and VLIW cache organisation. To evaluate the DTSVLIW, a trace-driven simulator has been implemented and experiments using SPEC benchmark programs have been performed. The effect of various architectural parameters on the DTSVLIW integer performance has been studied and the effectiveness of the DTSVLIW instruction-scheduling algorithm has been evaluated. In addition, comparisons between the DTSVLIW performance and that of DIF and Superscalar implementations have been made.
The results show that the DTSVLIW achieves significant ILP with feasible machine configurations and that, although simpler, the DTSVLIW instruction-scheduling algorithm is as effective as the DIF's. The results also show that the DTSVLIW performs better than the DIF and Superscalar architectures for representative machine configurations while using fewer hardware resources and in a way that should not produce a longer clock cycle than these architectures. The principal scientific contributions of this thesis are: (i) conception of a VLIW-based architecture - the DTSVLIW - that uses a pipelined instruction-scheduling algorithm, which effectively produces VLIW instructions dynamically; (ii) proof that the core of the DTSVLIW instruction-scheduling algorithm has complexity comparable to that of an adder, and as such can be implemented in hardware without impacting the DTSVLIW clock cycle time; (iii) evaluation of the effect of important DTSVLIW architectural parameters on its performance; (iv) evaluation of the effectiveness of the DTSVLIW instruction-scheduling algorithm; (v) comparison of the DTSVLIW performance with that of the DIF and Superscalar architectures.
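
The DIF idea described in the abstract (schedule a sequential instruction stream into blocks of long instructions, then cache each block for re-execution) can be illustrated with a toy model. The sketch below is not the DTSVLIW or DIF scheduling algorithm from the thesis; it is a plain greedy list scheduler under simplifying assumptions: every instruction has unit latency, there is no register renaming, and a long instruction reads its registers before writing them. The names `Instr`, `schedule_block`, and `BlockCache` are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    addr: int     # address of the instruction in the sequential program
    dest: str     # register written
    srcs: tuple   # registers read


def schedule_block(trace, width):
    """Greedily pack a sequential trace into long instructions.

    Assumptions (not from the thesis): unit latency, no renaming,
    at most `width` operations per long instruction.
    """
    rows = []  # rows[i] holds the operations issued together in cycle i
    for ins in trace:
        earliest = 0
        for i, row in enumerate(rows):
            for prev in row:
                if prev.dest in ins.srcs:   # true dependence: after producer
                    earliest = max(earliest, i + 1)
                if prev.dest == ins.dest:   # output dependence: after earlier write
                    earliest = max(earliest, i + 1)
                if ins.dest in prev.srcs:   # anti dependence: not before the reader
                    earliest = max(earliest, i)
        # Take the first row at or after `earliest` that still has a free slot.
        i = earliest
        while i < len(rows) and len(rows[i]) >= width:
            i += 1
        if i == len(rows):
            rows.append([])
        rows[i].append(ins)
    return rows


class BlockCache:
    """Caches scheduled blocks, keyed by the address of their first instruction."""

    def __init__(self):
        self.blocks = {}
        self.hits = 0
        self.misses = 0

    def fetch(self, trace, width):
        key = trace[0].addr
        if key in self.blocks:
            self.hits += 1      # block can be replayed without rescheduling
        else:
            self.misses += 1    # first encounter: schedule and store the block
            self.blocks[key] = schedule_block(trace, width)
        return self.blocks[key]


# A four-instruction trace with one independent operation (addr 2).
trace = [
    Instr(0, "r1", ("r0",)),
    Instr(1, "r2", ("r1",)),
    Instr(2, "r3", ("r0",)),
    Instr(3, "r4", ("r2", "r3")),
]
cache = BlockCache()
rows = cache.fetch(trace, width=2)   # first pass: scheduled and cached
rows_again = cache.fetch(trace, width=2)   # second pass: served from the cache
layout = [[ins.addr for ins in row] for row in rows]
```

In this toy trace the independent instruction at address 2 is placed alongside the instruction at address 0, so four sequential instructions occupy three long instructions. The second `fetch` reuses the stored block, which mirrors the repeated execution from a VLIW cache that the abstract describes.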

UCL Discovery