Performance Advantages of Merging Instruction- and Data-Level Parallelism

Authors: Francisca Quintana; Mateo Valero; Roger Espasa
Affiliations: U. Las Palmas Gran Canaria; U. Politècnica Catalunya, Barcelona


Abstract

This paper presents a new architecture based on adding a vector pipeline to a superscalar microprocessor. The goal of this paper is to show that instruction-level parallelism (ILP) and data-level parallelism (DLP) can be merged in a single architecture to execute regular vectorizable code at a performance level that cannot be achieved using ILP techniques alone. We present an analysis of the two paradigms at the instruction set architecture (ISA) level that shows that the DLP model has several advantages: it executes fewer instructions, performs fewer overall operations (by factors as large as 1.7), and generally makes fewer memory accesses. We then analyze the ILP model in terms of instructions per cycle (IPC). Our simulations show that a 4-way machine achieves IPCs in the range 1.03-1.52 and that, when scaling to 16-way, only 26% of the peak IPC is achieved. The combined ILP+DLP model, by contrast, is shown to perform from 1.24 to 2.84 times better than the 4-way ILP machine. Moreover, when we scale up the ILP+DL..
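To put the quoted IPC figures on a common scale, the short Python sketch below converts them to fractions of peak issue width. It is only an illustration of the arithmetic, not part of the paper: it assumes that the peak IPC of an N-way machine equals N, and the helper names `peak_fraction` and `ipc_from_fraction` are hypothetical.

```python
# Figures quoted in the abstract.
ILP4_IPC_RANGE = (1.03, 1.52)   # IPC range reported for the 4-way ILP machine
ILP16_PEAK_FRACTION = 0.26      # fraction of peak IPC reported for the 16-way machine

# Assumption (not stated in the abstract): peak IPC of an N-way machine is N.
def peak_fraction(ipc, width):
    """Fraction of the peak issue rate that a given IPC represents."""
    return ipc / width

def ipc_from_fraction(fraction, width):
    """Absolute IPC implied by a fraction of the peak issue rate."""
    return fraction * width

# Utilization of the 4-way machine at both ends of its reported IPC range.
ilp4_utilization = tuple(peak_fraction(ipc, 4) for ipc in ILP4_IPC_RANGE)

# Absolute IPC implied by the 26% figure on the 16-way machine.
ilp16_ipc = ipc_from_fraction(ILP16_PEAK_FRACTION, 16)

print(f"4-way utilization: {ilp4_utilization[0]:.1%} to {ilp4_utilization[1]:.1%}")
print(f"16-way implied IPC: {ilp16_ipc:.2f}")
```

Under that assumption, the 4-way machine sustains roughly 26% to 38% of its peak, and the 16-way figure corresponds to about 4.16 instructions per cycle out of a possible 16.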
